The plural of anecdote

Steve Pearlstein:

I should acknowledge that I have no idea who should and should not get routine mammograms. But I know enough about statistics to say that the issue is not settled just because you know of someone in her 40s whose breast cancer was detected by a mammogram and cured. As economists and medical researchers are fond of saying, the plural of anecdote is data.

I'd always thought that the saying was "the plural of anecdote is not data." But a quick Google search turned up people using it both ways to say precisely the same thing. Anyone know the answer?

By Ezra Klein  |  November 20, 2009; 11:31 AM ET
 

Comments

The plural of anecodote is anecodotes. The singular of data is datum.

Posted by: ostap666 | November 20, 2009 11:45 AM

I'd always heard it as plural is data, but I think your version makes more sense. I vote we confirm your interpretation. When's the next English council meeting?

Posted by: nytexan | November 20, 2009 11:46 AM

The non-idiot version is "the plural of anecdote is not 'data.'"

Posted by: wankme | November 20, 2009 12:00 PM

The customary version is "the plural of anecdote is not data" - the idea being that a few one-off stories do not a statistically reliable data-set make. You need more than "a friend of mine was saved by ..." to show that something is a good idea.

On the other hand, data is made up of individual data points, sometimes known as anecdotes.

Posted by: fuse | November 20, 2009 12:05 PM

Seems that Pearlstein quoted it accurately, but that it originally had the _opposite_ thrust -- don't dismiss anecdotes, because, if accumulated, they constitute data. See:
. But this source acknowledges that the quote has morphed into its opposite, which is how you heard it, and how I also heard it. In that formulation, it _is_ a dismissal of anecdotes, saying that, even if you have more than one anecdote, that still doesn't rise to the level of data.

Since Pearlstein was using the quote to disparage anecdotes, he probably should have quoted it in its altered form.

Posted by: richardfriedman | November 20, 2009 12:10 PM

Here's an anecdote for you: people say "could care less" and "couldn't care less" to mean the same thing, even though the first of those two is completely incomprehensible. And no, I don't buy the explanation that they are being deliberately ironic. What they mean to say is that they care so little that they could not possibly care any less than they already do; in other words, the second phrase is the appropriate one.

I say that Pearlstein is in the wrong, here. His phrasing does not imply what he means to convey, which presumably is that personal anecdotes do not undermine overall conclusions of a large amount of data.

Posted by: Interceptor402 | November 20, 2009 12:11 PM

Second try at posting the URL that I tried to cite:

http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0407a&L=ads-l&P=8874

If that doesn't work, combine the following:

http://listserv.linguistlist.
org/cgi-bin/wa?A2=
ind0407a&L=ads-l&P=8874

Posted by: richardfriedman | November 20, 2009 12:15 PM

This is like people saying "I could care less" to mean "I COULDN'T care less". Let's please stick to the "non-idiot" version, as it's called above.

Posted by: Chris_O | November 20, 2009 12:16 PM

I'd heard it as "not data", but see:

http://bearcastle.com/blog/?m=20050808

Posted by: arsyed | November 20, 2009 12:20 PM

Well I vote for "not data", in fact I have never been on an econoblog that uses "is data".

But we can go snark or we can go deep. Snark: 'anecodote' & 'anecodotes'. So deeply snarky that I don't know if the joke is by me or on me.

Deep: at some point anecdote DOES become data. If I sample 1014 random selected Americans their opinions become an actual data point +/- 4.5%. Thirty years ago I took a college class in Statistics for Non Majors and in the first week was shown the mathematical proof for statistical significance. And even after that it still seems like black magic. I mean what are the odds that one truly random sample of 1000 people of a larger population will return the same answer as a poll of the whole population to within 5%? Better than 95% of the time. And you can prove it on half a sheet of paper. Pretty freaky when you think about it.

But getting back to earth the original is "the plural of anecdote is not data" or else every blog comment thread would be determinative of something or the other. And God knows that is not true (even on an Atrios comment thread, there being two different definitions of "random")

Posted by: BruceWebb | November 20, 2009 12:24 PM
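
BruceWebb's sampling figures can be checked against the standard margin-of-error formula for a proportion. The sketch below assumes a simple random sample, the worst case p = 0.5, and a 95% confidence level (z = 1.96); the function names are illustrative and not from any source cited here. Under those assumptions, 1,014 respondents give a margin of roughly 3.1 percentage points, and about 475 respondents are enough for 4.5.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from a simple random sample of size n.
    z=1.96 corresponds to a 95% confidence level (an assumption)."""
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for(margin, p=0.5, z=1.96):
    """Smallest simple random sample whose margin of error is at most `margin`."""
    return math.ceil((z * math.sqrt(p * (1 - p)) / margin) ** 2)


if __name__ == "__main__":
    moe = margin_of_error(1014)
    print(f"n = 1014: +/- {moe * 100:.1f} percentage points")
    print(f"n needed for +/- 4.5 points: {sample_size_for(0.045)}")
```

The formula says nothing about self-selected samples: it applies only when every member of the population has a known chance of being picked.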

Since a chapter of a book I'm writing hinged on this, I researched it rather fully. There is no established source, but it is clear that the actual quote is "plural of anecdote is NOT data." Or, better, that "data is not the plural of anecdote." Most impressive source I can find is the Nobel Laureate George Stigler, but Frank Kotsonis and Roger Brinner are also cited.

Posted by: ASkornheiser | November 20, 2009 12:24 PM

what fuse said. data in the statistical sense is not just a collection of anecdotes, like the ubiquitous internet "polls" that are totally useless yet still used by organizations as respected as NPR...

Descriptive data in the statistical sense is data from selected subsets of particular targeted cohorts in a larger population that can be generalized to the population as a whole. Anecdotes are the opposite of this, self selected stories that may or may not reflect the experiences of the larger population.

To be an informed voter these days, everyone should take statistics 101.

Posted by: srw3 | November 20, 2009 12:27 PM

Sounds like one of those cases where opposite sayings mean the same thing. (e.g., "I couldn't care less" and "I could care less".)

Posted by: wagster | November 20, 2009 12:32 PM

The source I cited alleges an earlier date than the Stigler source -- it asserts a 1984 published source and a 1969-70 oral source.

Posted by: richardfriedman | November 20, 2009 12:36 PM

"not data"

If you were to tell a story such as "Once I flipped a coin ten times and got tails each time!", it does not mean that you would normally expect that outcome since the probability is 1 in 1,024. (and you might want to check the balance of that coin)

Likewise, just because you know a woman in her 40s that had breast cancer detected by a mammogram, does not mean that it would be worth the cost to give mammograms to the other 9,999 women. Well, unless mammograms were incredibly cheap, which I doubt. Perhaps a worthy compromise would be for high-risk women in their 40s such as smokers to get mammograms and leave the rest be.

Posted by: aawiegel | November 20, 2009 12:51 PM
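
The coin-flip arithmetic in aawiegel's comment is easy to check. A minimal sketch, assuming a fair coin and independent flips (the function name is illustrative): ten tails in a row has probability (1/2)^10 = 1/1024, or roughly 1 in 1,000.

```python
from fractions import Fraction


def prob_all_tails(flips, p_tails=Fraction(1, 2)):
    """Probability that every one of `flips` independent tosses lands tails.
    p_tails defaults to a fair coin (an assumption)."""
    return p_tails ** flips


if __name__ == "__main__":
    p = prob_all_tails(10)
    print(f"Ten tails in a row: {p} (about 1 in {p.denominator})")
```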

I can see how "the plural of anecdote is data" could work -- my data trumps your anecdote because it is made up of many many anecdotes -- but that's really weak, to the point where it's fairly clearly trying to retroactively turn a mangled expression into sense.

"The plural of anecdote is NOT data" is both clear on its face and directly applicable to the sort of anecdote-driven debate that Pearlstein is trying to counter. Even if I hadn't heard this version countless times (while I've never heard Pearlstein's version until today) I would consider it to be the proper form.

Posted by: bjrubble | November 20, 2009 1:02 PM

I'm pretty sure the correct way is 'the plural of anecdote is NOT data' and the sarcastic way of saying it is 'the plural of anecdote IS data'.

Posted by: goinupnup | November 20, 2009 1:21 PM

*Sigh*. The WaPo letting its copy editors leave is becoming more obvious of a mistake every day.

Posted by: wiredog | November 20, 2009 1:30 PM

I second what goinupnup said. The saying is "the plural of anecdote is not data." A sarcastic twist, when you're commenting (as the author was) on the way people tend to confuse them, would be to say, "the plural of anecdote is data." Sarcasm, however, is a dangerous instrument. Most people miss it -- witness the reaction a few days ago to Mark Shields's obviously sarcastic reference to missing a "manly man" president.

Posted by: nolo93 | November 20, 2009 1:40 PM

The correct usage in the realm of real science would be "not data." I'm a real scientist and lots of stories are just that--lots of stories. Data means that you've used statistics (a large enough and diverse sample, appropriate statistical power), controlled for the variables you're not testing, used a double-blind control group, etc. Anecdotes are uncontrolled. Without controls it's not data. Lots of "not data" is never suddenly going to equal data.

Anyone that uses the phrase the other way--that it IS data--is not a scientist, doesn't understand science, the scientific method and/or statistics. Sorry to be a data elitist!

Posted by: mawst95 | November 20, 2009 2:08 PM

I'd say BruceWebb's argument actually disproves his point. He suggests that "at some point anecdote DOES become data. If I sample 1014 random selected Americans their opinions become an actual data point +/- 4.5%."

But that's only true if he "samples" -- i.e., asks the correct question of -- 1014 "randomly selected" individuals.
Most of us don't ask every single person we meet the same question, nor do we meet randomly selected individuals. Last, but by no means least, we don't necessarily remember every response (especially the negative ones). Consequently, our individual anecdotes, even if we collect 1014 of them, still are not going to become data in any reliable or meaningful sense.

A somewhat related question: in the NYT Op-Ed today, it is claimed that in order to prevent one death from breast cancer by screening women between 40 and 50 years old, you would have to screen 1,900 of them, every year, for a decade. That, by itself, doesn't sound so unreasonable, but the kicker is you would generate over 1,000 false positives by doing so. If that's an accurate summary of the data (non-anecdotal), it's a fairly impressive argument for why such screening should not necessarily be recommended.
Is that really the case?

Posted by: retr2327 | November 20, 2009 5:46 PM

Data is already plural and needs no further multiplication.

The plural of anecdote is "urban myth."

Posted by: pj_camp | November 20, 2009 9:41 PM

I could care less, but I believe the USPSTF is both correct and foolish for releasing these recommendations in the middle of a contentious health care debate.

Posted by: bmull | November 20, 2009 11:49 PM

The comments to this entry are closed.

 
 

© 2010 The Washington Post Company