Posted at 5:00 AM ET, 01/07/2011

What 2010 education research really shows about reform

By Valerie Strauss

The following was written by Matthew Di Carlo, senior fellow at the non-profit Albert Shanker Institute, located in Washington, D.C. This post originally appeared on the institute’s blog.

By Matthew Di Carlo
"Race to the Top" and Waiting for Superman made 2010 a banner year for the market-based education reforms that dominate our national discourse. By contrast, a look at the “year in research” presents a rather different picture for the three pillars of this paradigm: merit pay, charter schools, and using value-added estimates in high-stakes decisions.

There will always be exceptions (especially given the sheer volume of reports generated by think tanks, academics, and other players), and one year does not a body of research make. But a quick review of high-quality studies from independent, reputable researchers shows that 2010 was not a particularly good year for these policies.

First and perhaps foremost, the first and best experimental evaluation of teacher merit pay (by the National Center on Performance Incentives) found that teachers eligible for bonuses did not increase their students’ test scores more than those not eligible. Earlier in the year, a Mathematica study of Chicago’s TAP program (which includes data on the first two of the program’s three years) reached the same conclusion.

The almost universal reaction from the market-based reformers was that merit pay is not supposed to generate short-term increases in test scores, but rather to improve the quality of applicants to the profession and their subsequent retention.

This viewpoint, while reasonable on the surface, not only implies that merit pay is a leap of faith, one that will likely never have its benefits “proven” with any degree of rigor, but also that the case against teacher experience and education (criticized by some of the very same people for their weak association with short-term student test score gains) must be reassessed.

On that note, a 2010 working paper showed that previous studies may have underestimated the returns to teacher experience, and that teachers’ value-added scores may improve for 10 or more years.

In the area of charter schools, Mathematica researchers also released an experimental evaluation showing no test score benefits of charter middle schools.

This directly followed a preliminary report on KIPP schools (also from Mathematica), which showed positive gains. The latter was widely touted, while the former was largely ignored. The conflicting results begged for a deeper discussion of the specific policies and practices that differentiate KIPP from the vast majority of charters (also see this single-school study on KIPP from this year), which produce results that are no better or worse than comparable public schools.

On this topic, an article published in the American Journal of Education not only found no achievement advantage for charters, but also that a measure of “innovation” was actually negatively associated with score gains. The aforementioned study of charter middle schools likewise found few positive correlations between school policies and achievement.

As a result, explanations of why a few charters seem to do well remain elusive (I proposed school time as the primary mechanism in the case of KIPP).

So, some of the best work ever on charters and merit pay came in 2010, with very lackluster results, just as a massive wave of publicity and funding awarded these policy measures a starring role in our national education policy.

This past year was also bountiful for value-added research. Strangely, the value-added analysis that got the most attention by far – and which became the basis for a series of Los Angeles Times stories – was also among the least consequential. The results were very much in line with more than a decade of studies on teacher effects. The questionable decision to publish teachers’ names and scores, on the other hand, garnered incredible public controversy.

From a purely research perspective, other studies were far more important. Perhaps most notably in the area of practical implications, a simulation by Mathematica researchers (published by the Education Department) showed high “Type I and Type II” error rates (classification errors that occur even when estimates are statistically significant), which persisted even with multiple years of data. On a similar note, a look at teacher value-added scores in New York City – the largest district in the nation – found strikingly large error margins.

The news wasn’t all bad, of course, and value-added research almost never lends itself to simple “yes/no verdicts.” For instance, the recently released preliminary results from the Gates-funded MET study provide new evidence that alternative measures of teacher quality, most notably student perceptions of their effectiveness, maintain modest but significant correlations with value-added scores (similarly, another 2010 paper found an association between principal evaluations and value-added scores, and also demonstrated that principals may use this information productively).

In contrast to absurd mass media claims that these preliminary MET results validate the use of value-added scores in high-stakes decisions, the first round of findings represents the beginning of a foundation for building composite measures of teacher effectiveness. The final report of this effort (scheduled for release this fall) will be of greater consequence.

In the meantime, a couple of studies this year (here and here) provided some evidence that certain teacher instructional practices are associated with better student achievement results (a major focus of the MET project).

There were also some very interesting teacher quality papers that didn’t get much public attention, all of which suggest that our understanding of teacher effects on test scores is still very much a work in progress. There are too many to list, but one particularly clever and significant working paper (from NBER) found that the “match quality” between teachers and schools explains about one-quarter of the variation in teacher effects (i.e., teachers would get different value-added scores in different schools).

A related, important paper (from CALDER researchers) found that teachers in high-poverty schools get lower value-added scores than those in more affluent schools, but that the differences are small and do not arise among the top teachers (and cannot be attributed to higher attrition in poorer schools). The researchers also found that the effects of experience are less consistent in higher-poverty schools, which may explain the discrepancies by school poverty.

These contextual variations in value-added estimates carry substantial implications for the use of these estimates in high-stakes decisions (also see this article, published in 2010). They also show how we’re just beginning to address some of the most important questions about these measures’ use in actual decisions.

In the longer term, though, the primary contribution of the value-added literature has been to show that teachers vary widely in their effect on student test scores, and that most of the variation is unexplained by conventional variables. These findings remain well-established. But whether we can use value-added estimates to identify persistently high- and low-performers is still very much an open question.

Nevertheless, 2010 saw hundreds of states and districts move ahead with incorporating heavily-weighted value-added measures into their evaluation systems. The reports above (and many others) sparked important debates about the imprecision of all types of teacher quality measures, and how to account for this error while building new, more useful evaluation systems. The Race to the Top-fueled rush to design these new systems might have benefited from this discussion, and from more analysis to guide it.

Overall, while 2010 will certainly be remembered as a watershed year for market-based reforms, this wave of urgency and policy changes unfolded concurrently with a steady flow of solid research suggesting that extreme caution, not haste, is in order.





I recently blogged about the confusing disconnect between what I see happening in my classroom and what my evaluators perceive. Despite steering the vast majority of my students well beyond AYP in consecutive years, my teacher "scores" reveal a portrait I do not recognize. While I understand the desire to find a magical equation that yields a definitive measure of teacher effectiveness, I believe the effort will prove as futile as determining precisely why one principal motivates and another demoralizes.

Posted by: dcproud1 | January 7, 2011 5:48 AM

Outstanding, lucid, and fact-based. . .Let's hope 2011 can be a year when expertise and evidence impact education reform and education discourse. . .

Posted by: plthomas3 | January 7, 2011 7:45 AM

If the Hawthorne studies taught me anything (and I realize I did not peel back every layer), it was that people behave differently when watched, children included. I know when my sons were younger they behaved differently when I was watching as opposed to when I walked away.

The idea that people are focusing on charter schools, KIPP, or others leads me to believe the reason it tapers off over time comes from taking the attention away. The focus, novelty, or (fill in the blank) starts to evaporate and the students move with the attention.

I agree this is ongoing. It will not be fixed tomorrow. It cannot be fixed by academic inoculation. We need to be patient, we need to retain/maintain focus, and we need to let the kids KNOW we are watching!

Posted by: jbeeler | January 7, 2011 7:59 AM

I found the following portion of Di Carlo's article particularly interesting: "There are too many to list, but one particularly clever and significant working paper (from NBER) found that the “match quality” between teachers and schools explains about one-quarter of the variation in teacher effects (i.e., teachers would get different value-added scores in different schools)."

In my experience as a former teacher who taught in several different schools, I found my teaching performance to be highly affected by the school (and the students, the schedule of classes, administrative support, and many other variables). Depending on the context, I could be a great teacher in one setting, and a mediocre or even poor teacher in another setting. So much depends on the school, the students, and other factors (e.g., teaching honors History in one school is not the same as teaching remedial English in another school).

Whenever I see reports of "teacher quality" and calls to "fire the bad teachers", I always wonder why no one is looking at the variables of school, students, course assignment, school discipline policies, and many other variables. Just like it would make sense that a great little league coach might not be the best manager of the Yankees, the best remedial math teacher might not be very good at teaching AP Calculus (or vice versa). Let's not be too quick to label a teacher as "effective" or "ineffective" without examining the context in which they are teaching.

Posted by: AttorneyDC | January 7, 2011 9:57 AM

AttorneyDC, can the church say, "Amen?"

What was 'effective' in my first few years of teaching has clearly not been 'effective' currently, for precisely the reasons you stated. Some of us 'veterans' commented how it felt as if we were brand new teachers adjusting to our new environment.
Like all data, value-added should be used to investigate WHY those teachers are effective so as to replicate it, because there are so many factors that impact successful teaching.
In engineering, computational fluid dynamics is maddeningly complex simply because of the magnitude and variation of the inputs required to model fluid flow over a moving object. Teaching is similarly complex for the same reasons: the magnitude and variation of the inputs that impact student success, inputs that are impossible to capture in one 'equation.'

Posted by: pdexiii | January 7, 2011 10:57 AM

attorneyDC - of course you're right, but that's much too nuanced for today's crop of reformers to accept. To them, teachers and students are just achievement machines.

Posted by: efavorite | January 7, 2011 12:41 PM

Well, I'm glad to see other commenters agree with me about the difficulty of evaluating or assessing teacher performance. Unfortunately, it seems that the concept is too complicated for many of the 'reformers' to understand (as efavorite noted above). Wish there was something we could do to make the public comprehend that teacher quality can't be accurately measured by one value-added score or any other simplistic method.

Posted by: AttorneyDC | January 7, 2011 1:28 PM

attorneyDC - a way to start to figure out how to evaluate teachers might be to ask people what effects their teachers have had on them.

What did they appreciate about their favorite teachers and what didn't they appreciate about their least favorite teachers? Involve people of all ages - kids still in school and adults long out of school.

Posted by: efavorite | January 7, 2011 2:17 PM

Efavorite: Using value-added scores based on one test is not a very effective way to evaluate teachers. However, I'm not sure that using your idea of asking people to recollect the effect their teachers had on them would be very practical (although it would be interesting). I'm sure there would be teachers remembered fondly for their kindness or wit who didn't raise test scores as much as another teacher who gave dry, boring lectures.

Part of the issue of evaluation comes down to the question of what we want our teachers to accomplish with their students -- To teach them the greatest number of facts about a topic? To teach them good citizenship? To teach them to love learning and exploring the world around them? To teach them good behavior and comportment? All these things are important, but I doubt most of these goals correlate well with value-added scores -- and value-added scores themselves have been shown to be flawed, and only an approximation of a teacher's impact on student learning.

Posted by: AttorneyDC | January 7, 2011 3:53 PM

AttorneyDC - the point of getting this info is not to use it to evaluate real teachers, but to learn what teachers are valued and remembered for. Maybe such a study has already been done. I haven't heard about it.

The thinking now seems to be that moving student achievement is all that counts and the only way of assessing it is through value-added scores.

Posted by: efavorite | January 7, 2011 4:24 PM

to efavorite ... in response to your comment:
"a way to start to figure out how to evaluate teachers might be to ask people what effects their teachers have had on them..."

I have countless memories of ways my teachers in public school inspired me, sparked my curiosity, clarified for me etc... so here are a few...

to highlight the meaning of a passage from Camus, Voltaire or Stendhal... Mr. B would charge forward reading the words and acting them out in French!

we were having a class discussion about "Of Mice and Men" when a student suddenly commented about a sibling who was mentally challenged and our teacher allowed a very dynamic classroom discussion to ensue for the rest of the class.

These examples above are just two immediate memories of many that contributed to my lifelong love of learning .... But do I remember the specifics of any tests I took during those years??? NO! I remember the teachers ... the people who passionately made education a journey. Teachers are being metaphorically shackled by micro-management of everything they do... a lot of intelligent professionals have their passion bridled by "reform"! Value added teacher evaluations are just unfortunate "icing on the cake"!

Posted by: teachermd | January 7, 2011 7:11 PM

Thanks, Teachermd. This particularly struck me:

"we were having a class discussion about "Of Mice and Men" when a student suddenly commented about a sibling who was mentally challenged and our teacher allowed a very dynamic classroom discussion to ensue for the rest of the class."

That sort of thing wouldn't be allowed under IMPACT - going off the lesson plan, not repeating the objective, not addressing different learning styles, not correcting student behavior, etc., etc.

Posted by: efavorite | January 7, 2011 11:40 PM

I agree with efavorite and hear countless teachers (colleagues) say the very same thing. The students under this "cookie cutter" curriculum are not given a voice. The "learning" schedule is scripted on a calendar. So hmm ... if students haven't mastered something on this "rigorous schedule" it really doesn't matter... what matters is being on the "right learning objective" on the right day marked on the curricular calendar. Is this not absurd to treat students this way? Is there time for in-depth discussions and bringing learning to life (through class conversations that make real life connections as in the "Of Mice and Men" example)??? NO! The worst part is that it teaches students that there are right answers and wrong answers and that taking risks and failing is not a learning opportunity but is dangerous and can lead to teachers being fired and schools being closed. Risk taking was always encouraged in innovative and creative thinking/learning! IMPACT is a farce and allows great teachers to be labelled "unsatisfactory". When we attempt to STANDARDIZE things, in the end... we stand to lose excellence in favor of normed mediocrity.

Posted by: teachermd | January 8, 2011 10:50 AM

"There will always be exceptions (especially given the sheer volume of reports generated by think tanks, academics, and other players), and one year does not a body of research make."

Social policy research has historically proven contradictory. For every study that appears to "prove" one theory, there often appear two other studies that conclude the opposite.

I like to think personally on each of these critical reforms.

Should a poor/minority urban parent be resigned to accept only the neighborhood school for their child? Or, should they be allowed a choice (charter by lottery) as to where to send their child to school, a choice previously afforded only to families of wealth?

Should teachers continue to all be labeled "satisfactory" (not their fault) when some never even get evaluated at all and others are in desperate need of professional development? Or should there be a degree of objectivity injected into the process, if for no other reason than to protect the integrity of the profession? And even if this process doesn't kick in as reliable for four or five years? Or maybe we could continue on the road to pretending all teachers are "satisfactory."

Should we allow dropout factories to continue to fail their constituents year after year after year? Or should these chronically failing schools be re-evaluated and either have their faculties altered dramatically or be closed altogether and reopened with new administration and a significantly altered staff?

In each of these decisions I'd have to decide in favor of students, not exclusively to the benefit of the teachers/adults.

Posted by: phoss1 | January 9, 2011 8:12 AM

to Phoss1: you are buying into all the PR spin orchestrated by "edu business" strategists who seek to gain by promoting more testing and "privatization" of schools. The problem is that they completely (ironically too) ignore the real problems of the students they "want" to "save". It is poverty! I am waiting for the day when Bill Gates goes head to head with Diane Ravitch in a debate. Please watch this when it happens. I would also recommend that if you haven't already done so... read her book, "The Death and Life of the Great American School System". Listening to Bill Gates, Oprah Winfrey, Eli Broad etc. discuss what works in education is like having me diagnose my car when there is a problem (I know something is wrong but have no knowledge of car mechanics).

How many of you out there reading this blog would like to see Diane Ravitch go head to head with Bill Gates in a discussion or debate about education? I cannot wait for this day to come. THE PUBLIC AT LARGE HAS A RIGHT TO DECIDE FOR THEMSELVES WHAT THE PUBLIC EDUCATION REALITY IS AND NOT BE "LED TO VIEWPOINTS" DUE TO PR SPIN.

Posted by: teachermd | January 9, 2011 11:02 AM


As a retired Massachusetts public school teacher, I've read all of Diane Ravitch's books, including her latest. I have told her directly how disappointed I am in her Whittaker Chambers-like late-career change of thinking. Enigmatically, she's convinced herself she's the only one who knows anything about public education.

While she's entitled to her opinion, she's offered little in the way of pragmatic, concrete solutions. I still have a great deal of respect for her body of work but am now on the opposite side of the fence from her philosophically, as you can probably gather by my post above.

She's too preoccupied with her newfound status of being the defender of public school teachers. When she gets awarded educator of the year by the notorious NEA (that's code for selling your soul to the devil), I know she's on the wrong side of the fence. I was forced to be a member of the NEA my entire career and was diametrically opposed to ninety percent of what they supported because the absolute last concern of their Representative Assembly (their governing body) was children.

Her newest book and her blog on Ed Week every Tuesday are laced with too many half truths. Her neo-line of convoluted reasoning is way off base. It's just sad. She has stopped telling the REST OF THE STORY.

As for a debate with Bill Gates, I'd like to see it as well. She makes Gates out to be the second coming of Bernie Madoff, or the devil. Fact: She and many from the educational establishment are flat out jealous that people like Gates just float in from left field and now have an apparent equal seat at the table as her. I think Gates would more than hold his own against her.

Posted by: phoss1 | January 9, 2011 4:29 PM

AttorneyDC -- your strategy is crystal clear: to declare teacher eval to be too difficult, and, voila, it will never arrive in any form you would accept.

Posted by: axolotl | January 9, 2011 9:05 PM

I do think that well-done teacher evaluation is difficult. I don't think it's impossible, but I believe that the quick-fix methods like using "value added" scores or a single observation by an administrator (who may never have taught the subject he or she is evaluating) are flawed.

My problem with the current ed reform debate is that the focus is on blaming the teachers, when the real difference in student performance (on a statistical scale) is due to the students. High income students from educated parents are almost always going to do pretty well in school; low income students with parents who dropped out of high school are probably going to do poorly.

There are many cultural, motivational, and cognitive factors at work in student achievement, and the current trend to place all blame for student failure at the feet of teachers is misguided. I think focusing on student discipline, parent involvement, curriculum and other issues will result in better outcomes than lambasting teachers, most of whom are trying hard and, especially those teachers in the lowest performing schools, are struggling with many issues on a daily basis that most of the ed policy wonks can't even imagine. Has firing a bunch of teachers ever resulted in skyrocketing student performance?

Posted by: AttorneyDC | January 10, 2011 8:27 AM


© 2011 The Washington Post Company