Posted at 9:00 AM ET, 11/17/2010

Buster Posey and the value-added teacher debate

By Jay Mathews

A new report on the controversy over rating teachers by how much their students' scores improve asks, sort of, this odd question: Should the San Francisco Giants keep rookie of the year Buster Posey on their team next year?

"What kind of a question is that!!??" Giant fans say. I used to be a Giants fan, but it's hard when my toddler grandson Ben, a Los Angeles resident, yells "Booooo!" every time that team is mentioned. Nonetheless, trading Posey would seem idiotic even to him.

But the Brown Center on Education Policy at the Brookings Institution, in its new report "Evaluating Teachers: The Important Role of Value-Added," suggests that Posey's admirable .305 batting average, judged by the standards of those who warn against using value-added data in judging teachers, does not seem so great.

It notes that "the correlation in test-based measures of teaching effectiveness between one school year and the next lies between .20 and .60 across multiple studies, with most estimates lying between .30 and .40. A measure that has a correlation of .35 from one year to the next produces seemingly troubling statistics in line with our conceptual discussion of classification errors."

Critics of value-added assessments have noted that only about a third of teachers ranked in the top quartile of value-added based on one academic year's performance would appear in the top quartile the next. Having set us up to think that .35 is not such a great correlation, the Brown Center paper's six authors (only one of whom, Susanna Loeb of Stanford, seems to live in Giants territory) slyly dump this on us:

"The between-season correlation in batting averages for professional baseball players is .36. [They cite a 2000 paper in The American Statistician, "Do baseball players regress to the mean?", which of course we all remember.] Ask any manager of a baseball team whether he considers a player's batting average from the previous year in decisions about the present year."

That shows how nasty this argument over value-added assessment has become. The L.A. Times created a furor when it published the value-added records of thousands of local teachers. School districts across the country are battling over the issue. The Brown Center paper pushes a stick into this red ant nest by noting that the critics of value-added don't mention how often we use other metrics with similar reliability in big decisions.
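To see what a year-to-year correlation of about .35 means in practice, for teachers or for hitters, here is a minimal simulation sketch. It is not taken from the Brookings report: it assumes scores in two consecutive years follow a bivariate normal distribution with correlation rho, and the function name `top_quartile_persistence` is made up for illustration. It estimates what share of one year's top quartile lands in the top quartile again the next year.

```python
import math
import random


def top_quartile_persistence(rho, n=100_000, seed=0):
    """Estimate the share of year-1 top-quartile scorers who are also in
    the top quartile in year 2, given a year-to-year correlation rho.

    Assumption (not from the report): the two years' scores are
    bivariate normal with correlation rho.
    """
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        year1.append(z1)
        # Blend in independent noise so that corr(year1, year2) is rho.
        year2.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

    # 75th-percentile cutoffs for each year.
    cut1 = sorted(year1)[int(0.75 * n)]
    cut2 = sorted(year2)[int(0.75 * n)]

    top1 = [i for i in range(n) if year1[i] >= cut1]
    stayed = sum(1 for i in top1 if year2[i] >= cut2)
    return stayed / len(top1)


if __name__ == "__main__":
    # The report says estimates range from .20 to .60, mostly .30 to .40.
    for rho in (0.20, 0.35, 0.60):
        share = top_quartile_persistence(rho)
        print(f"rho={rho:.2f}: {share:.2f} of the top quartile stays there")
```

Under these assumptions a correlation of .35 leaves roughly 40 percent of the top quartile in place the following year, in the same ballpark as the critics' "about a third," and far above the 25 percent that pure chance would give.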

Picking college applicants based on SAT scores, selecting hospitals based on mortality rates, recruiting realtors based on home sales volume -- all involve correlations similar to those for picking a teacher based on how much her students' achievement rates improve. The authors warn against setting "unrealistic expectations for the reliability or stability of value-added. Value-added evaluations are as reliable as those used for high-stakes decisions in many other fields."

If your goal is raising student test achievement, the paper says, "value-added is superior to other existing methods of classifying teachers."

It is a clever contribution to the debate, certain to make you howl or cheer, depending on your point of view. It will not, however, increase in any way the infinitesimal chance that the Giants will ever trade Posey to my grandson's beloved Dodgers.

By Jay Mathews  | November 17, 2010; 9:00 AM ET
Categories:  Jay on the Web  | Tags:  Buster Posey, batting averages have similar flaws, value-added criticized for mediocre reliability as a measure of teacher effectiveness year to year, value-added teacher assessment  


I hate sports analogies. Therefore I won't comment on the bulk of the article. But as a Realtor for over 30 years, I can honestly say I have NEVER encountered anyone who chose to use a Realtor based on "home sales volumes". That's simply bizarre. Most homebuyers end up with a particular Realtor because they spotted a for sale sign on a property or because of an advertisement for a property that meets their needs - OR because of a personal referral.

Posted by: lisamc31 | November 17, 2010 9:32 AM | Report abuse

To lisamc31-- Thanks for the helpful comment. The word I used was "recruiting" realtors. I realize now I should have used more words to explain that the authors were talking about real estate firms recruiting more staff, and looking at each applicant's home sales volume in the same way that baseball managers look for big batting averages. I was taking their word for it. Is that a factor in hiring by firms?

Posted by: Jay Mathews | November 17, 2010 11:13 AM | Report abuse

The authors warn against setting "unrealistic expectations for the reliability or stability of value-added. Value-added evaluations are as reliable as those used for high-stakes decisions in many other fields."
Just because similar evaluations are used in making other decisions does not mean they are valid. Deciding which hospital to go to based on only mortality rate may mean you don't go to a hospital that tries new ideas in helping critically ill patients. And on the other hand, maybe a given hospital only treats minor injuries, and thus would have a low mortality rate.

If people want to rate teachers based mostly on value-added, then why would we not have high expectations on the reliability of value-added? If a school district is making decisions about whether or not I will be hired next year or what my salary would be, why wouldn't I want those decisions based on reliable and stable data?

Posted by: JackS2 | November 17, 2010 12:24 PM | Report abuse

It is actually a perfect analogy, because anyone that knows anything about baseball knows that Batting Average is a basically useless statistic.

A player who hits .300 with no power and no walks is SIGNIFICANTLY less valuable than one who hits .250, but walks a lot and hits a lot of home runs.

Measures of OPS which take those into account are a lot more stable.

So, comparing fairly useless "value-added" results to a fairly useless batting average statistic seems like a good fit.

Posted by: Wyrm1 | November 17, 2010 12:49 PM | Report abuse
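For readers who want to check the arithmetic behind that comparison, here is a small sketch using the standard definitions of batting average, on-base percentage, slugging percentage and OPS (their sum). The two stat lines are invented for illustration and do not describe any real player; the class and variable names are hypothetical too.

```python
from dataclasses import dataclass


@dataclass
class Season:
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int = 0
    sac_flies: int = 0

    @property
    def hits(self):
        return self.singles + self.doubles + self.triples + self.home_runs

    def batting_average(self):
        return self.hits / self.at_bats

    def on_base_pct(self):
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        return reached / chances

    def slugging_pct(self):
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def ops(self):
        return self.on_base_pct() + self.slugging_pct()


# Hypothetical .300 hitter: all singles, almost no walks.
contact = Season(at_bats=500, singles=150, doubles=0, triples=0,
                 home_runs=0, walks=10)
# Hypothetical .250 hitter: plenty of walks and home runs.
slugger = Season(at_bats=500, singles=70, doubles=25, triples=0,
                 home_runs=30, walks=80)

if __name__ == "__main__":
    for name, s in (("contact", contact), ("slugger", slugger)):
        print(f"{name}: AVG {s.batting_average():.3f}  OPS {s.ops():.3f}")
```

With these made-up numbers the .300 hitter ends up with an OPS near .614 and the .250 hitter near .833, which is the gap the comment is pointing at.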

If I HAD to rate teachers for "value addedness" based only on standardized test scores, I wouldn't use the average score for each class. Instead, I would use the median test score.

For several reasons that will not be explained here, the median of a large data set subject to large variations is a far better indicator of what is actually happening in the real world. Some real world examples of using the median instead of the average are the cost of housing and worker wages.

The median is far more useful than the average for most public policy discussions.

Posted by: fairfaxvaguy | November 17, 2010 1:31 PM | Report abuse
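One of the reasons left unexplained in that comment can be shown in a few lines. The sketch below uses invented test scores for a hypothetical class and compares how the mean and the median react when a single very low score is added; it illustrates the general point only, not how any district actually computes value-added.

```python
from statistics import mean, median

# Invented scores for a hypothetical class of seven students.
scores = [70, 72, 74, 75, 76, 78, 80]

# The same class plus one student who had a very bad test day.
scores_with_outlier = scores + [15]

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(scores_with_outlier), median(scores_with_outlier)

if __name__ == "__main__":
    print(f"without outlier: mean {mean_before}, median {median_before}")
    print(f"with outlier:    mean {mean_after}, median {median_after}")
```

Here the single outlier drags the mean from 75 down to 67.5, while the median barely moves, from 75 to 74.5.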

To elaborate a little bit... you have a 4th grade child and the choice of 2 teachers...

Tom is pretty surly, not particularly good at contacting parents when they have concerns, arrives at 8 and leaves at 3:30 and doesn't really treat the kids that well. He doesn't believe in art, recess, music or anything except preparing for standardized tests, which the kids do all day every day. They do well on the standardized test but don't like learning.

Jack is pleasant, calls parents when there is a problem, stays late to work on projects with the kids when they want to, expands their minds by tying art, music and so on into the curriculum. The kids love him and learn a lot, but since he does not drill standardized tests, his scores are lower.

Which would you rather have teaching your child?

Tom gives you one measurable thing that he does really well. Jack gives your kids a much better overall educational experience, but does poorly on the only thing that "reformers" want to measure.

If you answered Tom, more power to you, but I suspect most would not. Reform is going to give you a lot more Toms and a lot fewer Jacks.

Posted by: Wyrm1 | November 17, 2010 2:19 PM | Report abuse


I have worked with Mom & Pop agencies as well as big corporations. Managers of the bigger firms like Coldwell Banker and Century 21 do get bonuses based on the number of new agents they recruit, but in my highly competitive market here in NY, turnover is high and any willing, eager warm body is usually brought into the fold by the office manager.

Real estate agents are considered self-employed, independent contractors. It is considered unethical to try to recruit someone who has their license with another agency.

Posted by: lisamc31 | November 17, 2010 2:29 PM | Report abuse

So true Wyrm1. There is a really dreadful teacher at our school and during the roll out of IMPACT her ME scores were so great she got special accolades from Rhee.

We were all scratching our heads since our kids were miserable in this woman's class and learned so little.

Now that I understand more about how flawed IMPACT is as an evaluation tool, it all makes perfect sense.

Anyway, your example sums it up perfectly. Thanks.

Posted by: Title1SoccerMom | November 17, 2010 2:30 PM | Report abuse

Terrific, thoughtful posts on a complicated subject. I particularly liked Wyrm1 on batting averages.

Posted by: Jay Mathews | November 17, 2010 2:55 PM | Report abuse

But remember: value-added would be but one measure of teacher effectiveness, just as batting average is one measure of player effectiveness. As Wyrm1 said, we have to consider walks, batting average, etc. Likewise, when we evaluate teachers we need to include observational data, professional development, and parent feedback. When you do this, you get a more stable number like OPS.

I suggest people read the report. It addresses many of the concerns in the posts found here.

Posted by: mmccabe4724 | November 17, 2010 7:28 PM | Report abuse

It simply amazes me that people act as if pay isn't based on performance in numerous other fields.

The good statistical point, though, is that the sample size for a single year is quite small; n = 25 is a tough one.

Regardless, the point remains that simply basing pay on "years experience" and academic degrees is quite weak and impossible to defend for any field. It is likely a form of discrimination.

Fine-grained data is tremendously useful. We have discovered that it can be better to have specialist math and reading teachers in kindergarten and first grade, that dividing the children into ability groups works best for these subjects, etc. We have also discovered that letting students remain on a topic until they have mastered it is key.

Putting all of the pieces into place for an adaptive learning system for students, where they emerge actually knowing how to do things, is key. A deep problem is that our schools are "too easy" and parents are "too easy" on the kids. Pretend acceleration doesn't work.

Posted by: staticvars | November 18, 2010 1:01 AM | Report abuse

Batting average? Really!?

Not the really advanced stuff for hitters, like VORP or WAR?

Not the mildly sophisticated stuff like OPS+?

Not the simple stuff like on-base percentage or slugging, or even their sum (i.e. OPS)?

We're going to say that VAA is OK because it is as good as the crappy baseball stat that people grew up with through the 20th century? We're going to ignore the statistical revolution of the last 25 years that showed how bad that old stuff was, and ignore the fact that smart teams in baseball don't use BA to analyze players?


Honestly, I think that this is a perfect analogy. There are those who think that we should use the kind of statistics in education that people in baseball have long since realized is far too flawed to be useful. That's the best argument they can make.

Posted by: ceolaf3 | November 18, 2010 7:39 PM | Report abuse

Jay, does this mean that you allow your grandson to claim the Dodgers, but you don't?

I've always appreciated your focus on broad statistical strokes; losing nuance is a risk, but so is losing focus. Teachers who are achieving dramatically better than their peers should be encouraged and emulated. All too often, they are at risk of being ignored or worse!

Posted by: jeffcherniss | November 21, 2010 2:40 AM | Report abuse

© 2010 The Washington Post Company