Posted at 8:15 PM ET, 02/16/2011

Watson the IBM computer trounces humans in 'Jeopardy!' competition [Update of update]

By Lisa de Moraes

Alex Trebek, left, Ken Jennings and Brad Rutter with computer program Watson on "Jeopardy!" (Sony)


"I for one welcome our new computer overlords," "Jeopardy!" competitor Ken Jennings wrote resignedly on his screen as Watson the IBM computer thoroughly stomped on him and fellow super-geek Brad Rutter during Watson's final appearance on the syndicated game show.

Watson racked up a total of $77,147 during competition after wagering $17,973 that "Who is Bram Stoker?" was the correct question to the clue:

William Wilkinson's "An Account of the Principalities of Wallachia and Moldavia" inspired this author's most famous novel.

Jennings and Rutter got it right too. But when the dust settled at the end of Wednesday's competition, Jennings (total of $24,000) and Rutter (total of $21,600) were so many laps behind Watson's $77,147, it was like they were running in the next race.

In Wednesday's final match, Watson competed -- if you could call this walk-over a competition -- against Rutter and Jennings in a full game of "Jeopardy!" Winning Watson took home the $1 million grand prize. Actually, IBM donated it to two lucky charities.

Heading into its final night of stealing every scene on "Jeopardy!", Watson had pretty thoroughly whomped the two brainiacs.

After hitting both Daily Doubles on Tuesday's show -- the middle of a three-episode competition arc for Watson -- the IBM computer was leading with a commanding $35,734. Rutter's pot stood at $10,400. And Jennings had to be wishing he'd never agreed to participate in this man-vs-machine February sweeps stunt 'cause he was holding an embarrassing $4,800.

Jennings holds the show's record for winning the most consecutive games (74) and Rutter is known for earning the most money in "Jeopardy!" history. Watson was not impressed.

That is not to say Tuesday's edition of "Jeopardy!" was lacking in drama. Quite the contrary. When the second night of play got to the "Final Jeopardy!" category -- U.S. Cities -- the situation was thus:

Jennings looked a delicate shade of green on the far left, with just $2,400 in his pot. Rutter, looking stricken on the far right, had $5,400 to his credit. In the middle: Watson, looking smug, had amassed $36,681.

The "clue," as show host Alex Trebek read: "Its largest airport is named for a World War II hero. Its second largest for a World War II battle."

Jennings bet his entire kitty that the question was, "What is Chicago?" Rutter wagered $5,000 that the question was "What is Chicago?" Apparently Rutter planned to console himself with $400 if he got it wrong.

Both men were correct! That put them both -- still hopelessly behind Watson.

Watson thought the question was: "What is Toronto?" On the other hand, Watson had only wagered $947.

"Oh you sneak!" Trebek cooed.

Watson winked.

The $1 million prize was money well-spent for the syndicated game show's producers. On Monday, Watson had handed "Jeopardy!" its best single-day rating in four years. The next day, the show broke that record, pulling in its biggest ratings in nearly six years.

More Watson coverage: Could this boost the stock price of IBM? and "Jeopardy!" champ Ken Jennings took your questions Tuesday at 11 a.m.


Comments

I see it as the humans are beating the computer.

Posted by: mj2007 | February 15, 2011 10:24 AM | Report abuse

Watson is only doing what the humans have told it to do. This is something people tend to forget when they talk about a computer's performance.

Joe H.
Stevensville, MD

Posted by: joeboe1 | February 15, 2011 5:41 PM | Report abuse

I beat Watson over at NY Times.

Posted by: TOMHERE | February 15, 2011 7:50 PM | Report abuse

I think they should have done this like the online Jeopardy Quiz I just took where you get 15 seconds or whatever to answer so we see what all three would answer. The buzzer mechanism seems to give a sizable advantage to the computer. When the IBM computer was in competition in chess, instant responses were not a factor. Putting that contest into a strict Jeopardy mold is not that enlightening in learning how much better the computer's answers are compared to the humans'!

Posted by: dlkauf | February 16, 2011 11:04 AM | Report abuse

Of course Watson will beat the humans. He's programmed to respond or "buzz in" faster than a human. IBM needs to reprogram him so that his response times are on equal footing with his human competitors. For all the human intelligence that was in the Jeopardy room last night, each with Dr. preceding their name, not one of them thought of it. I was bored after the first 5 minutes because the competition level was zero.

Posted by: sjcsando | February 16, 2011 1:42 PM | Report abuse

I'm just glad that a TV program glorifying intelligence and learning is doing so well. As opposed to the steaming pile of stuff on a good many other shows.

Posted by: Nosy_Parker | February 16, 2011 2:29 PM | Report abuse

@sjcsando: Your observation and assessment are spot on. (I made a similar comment on yesterday's blog.) But I would venture a different thought as to why Watson has been so calibrated. It is clear from watching the entire media blitz surrounding this event that the purpose is less to showcase a fair contest between Watson and Messrs. Jennings and Rutter than it is to promote Watson, the AI community, and most of all IBM. I cannot help but believe that many, if not all, of those "Drs. in the room" came to the same conclusion you (and I and others on these boards) did... and were overruled. It is easy to see how the key point of the publicity exercise, from IBM's corporate perspective, would not be scrupulousness about having a truly level playing field, but rather maximization of Watson's performance.

(All of that being said, I am still perplexed as to how Watson could have erred the way it did on the Final Jeopardy answer. One would think that the computer would have been able to instantaneously rule out any potential response that was not entered into its data bank as a US city. Was Watson not furnished with the Final Jeopardy category title? Or is this failure to rule out "bad answers" a major programming deficiency?)

Posted by: nan_lynn | February 16, 2011 3:39 PM | Report abuse

@sjcsando,nan_lynn: there's an important point about Watson's buzzing that makes the playing field more level than it looks. You can see the human players using the strategy of "buzz first, figure out the question later." (Look for the ones where they speak ... kinda ... slowly ... after Alex calls on them.) Ken Jennings himself acknowledged this in his Washington Post Live interview, http://live.washingtonpost.com/jeopardy-ken-jennings.html#question-28 and said it's a standard human Jeopardy! strategy.

Watson never does this, and buzzes only when completely ready with an answer, be it right or wrong. That means all those times when Watson gets first to the buzz, it's because Watson has finished /all/ the work in less time than the human needs to buzz-and-pray. Sounds no less than fair to me.

Posted by: chap0 | February 16, 2011 5:15 PM | Report abuse

Thinking more about this, I'm not even sure how much the speed matters, except to the drama of the publicity competition. Watson seems to be a whole lot of parallelizable algorithms running on a bunch of stock Power750 boxes cabled together. Because of ways the algorithms interact (which I don't know) I would not expect Watson to run exactly half as fast on 45 Power750s instead of 90, nor exactly twice as fast on 180 of them, but I would still expect that Watson would seem pretty much the same, but slower, if some servers were removed from the cluster, and pretty much the same, but faster, if some were added ... which could probably be done in a matter of minutes, maybe without even rebooting.

Now, Watson at half speed would probably be stomped by Jennings and Rutter, and Watson at double speed would probably stomp them hard enough to make dull TV (except for the quirky flubbed answers), all without any difference to the essential coolness of what Watson is doing.

My guess is that the hardware and software choices determining Watson's speed were probably made with an eye to avoiding an overly one-sided contest.

Posted by: chap0 | February 16, 2011 6:13 PM | Report abuse

@chap0: If you don't mind, I will continue to respectfully disagree.

Your specific point is well taken with regard to the very small percentage of questions for which Watson's database is incapable of producing a highly probable correct response. But, as we have seen (in particular if you have seen the PBS NOVA documentary), Watson has been fed enormous amounts of data and "trained" in such a way as to render such instances very infrequent. More to the point here, the human "buzz and pray" technique is not nearly as effective - or perhaps a better word is foolproof - as you suggest against a supercomputer. With all due respect to Mr. Jennings.

The critical issue is not so much being able to pull the answer out of one's head after a "prayerful buzz"; it is precisely *when* to buzz. Milliseconds are crucial! The "OK to ring in" signal is not reliably linked to when Mr. Trebek stops speaking. For human contestants, it is triggered by an electronic signal (lights on the perimeter of the game board), which is activated by a JEOPARDY! staff member after Mr. Trebek finishes reading the question. This human component introduces a tricky variable. If one anticipates and rings in too fast, one is locked out; too slowly, and the chance to answer goes to a competitor. In this case a competitor which is programmed to react (i.e., buzz) more quickly than any human possibly can, and which, if it does buzz, has an overwhelming probability of answering correctly.

In short, attempting to beat one's competitors to the signal ("buzz and pray") is, as the televised contest to date has shown, a stratagem not nearly as effective against this machine as against other humans. Watson has two huge advantages: not having to process the "OK to ring in" signal with human eyes and thumbs; and never being at risk of buzzing anticipatorily. Those milliseconds make all the difference.

Two final FYIs:

1. In response to the question raised by some other posters, the contest clues and categories have been specifically designed to eliminate items that Watson is just not capable of understanding successfully, i.e., those involving certain kinds of word play and/or inferences. (That kind of leveling of the playing field doesn't really bother me, because it has been transparently acknowledged.)

2. With thanks to 1995hoo, who posted this on another thread:

"IBM has something online that helps explain the Final Jeopardy issue:

http://asmarterplanet.com/blog/2011/02/watson-on-jeopardy-day-two-the-confusion-over-an-airport-clue.html

The gist of it is that because the category titles are often potentially misleading or vague, Watson is programmed not to pay a lot of attention to them. Obviously, in this situation, that came back to bite him."

Thanks for the chat; off to watch the final installment. Cheers!

Posted by: nan_lynn | February 16, 2011 6:53 PM | Report abuse

One last observation from me: As some posters posited (if not here, then on the article 1995hoo linked), it appears from tonight's game that the expert humans were able to reliably beat Watson to the buzz on certain types of questions, to wit those that: (1) required synthesizing more than one type of information; (2) took very little time to read; or (3) both. This validates the time advantage theory. The longer it took to read a question aloud, the more time Watson had to search its database and come up with the correct answer to a high degree of confidence. In those instances, Watson rarely if ever lost the buzzer race.

Posted by: nan_lynn | February 16, 2011 8:20 PM | Report abuse

Why does this article say the bets were made after the "answers" were known, when Jeopardy requires bets based only on the category?

Posted by: vmax02rider | February 16, 2011 9:04 PM | Report abuse

Love all your comments and want to ask - when Watson wagered he usually stuck to lower wagers until tonight. Is that because as a computer it calculates the odds of winning are not that great as opposed to us mere mortals who play the lottery?

Posted by: gladiatorgal | February 16, 2011 9:56 PM | Report abuse

Ken should have "accidentally" poured coffee in Watson.

Posted by: LarryinMD | February 16, 2011 10:07 PM | Report abuse

People.

This shows how easy the Jeopardy! questions are in the modern era.

If you had some of the College Bowl questions of the 1950s in the mix, Watson would have no chance.

College Bowl had some questions that required more than one answer ... such as:

"NAME'S THE SAME: If a 20th century pianist were playing a video game, what is going on?"

"Vladimir de Pachmann is playing Pac-Man."

Watson would have no chance in coming up with this answer because of the dual nature of the query.

Posted by: bs2004 | February 16, 2011 10:09 PM | Report abuse

Let's all simmer down, now.

Posted by: Aerowaz | February 16, 2011 10:18 PM | Report abuse

Like most technologies, AI is running about 20 years behind schedule in terms of living up to its promise, but maybe it is finally about to have an impact.

Now let's see, what was AI supposed to be able to do? Oh yeah, write software. Uh-Oh.

And that's not far-fetched because there are speed programming contests where teen nerds solve graduate school level algorithms....much like chess whizzes used to win tournaments. Of course, those kids can be beaten by computers. Uh-oh!

Again, I'd say this looks grim for programmers.

Posted by: BurfordHolly | February 16, 2011 10:32 PM | Report abuse

All I can say is I love all the hype surrounding the show before and particularly after. Well done Ken, Brad and of course, Watson. That was neat!

Posted by: gladiatorgal | February 16, 2011 10:37 PM | Report abuse

For those of us "of a certain age" who can remember Neil Armstrong's giant step for mankind in 1969 this was certainly not as great a drama. But I suspect the future results of "Watson" will bring more direct and lasting benefits to humanity than the "Space Race" of the Cold War. And the humans who lost-- God bless 'em-- conducted themselves with a grace and humor that I suspect no "artificial intelligence" will ever understand.

Posted by: dab12647 | February 16, 2011 10:48 PM | Report abuse

I was bored to tears with the whole IBM infomercial. It was almost as bad as celebrity Jeopardy.

Posted by: MTClarity | February 16, 2011 11:34 PM | Report abuse

The IBM infomercial was a sham. Even Watson knows this. "Jeopardy" should be embarrassed to have been so easily bought, thus putting its reputation in jeopardy.

Posted by: uconnjak | February 17, 2011 12:58 AM | Report abuse

The machine plays at that level 24 hours a day, but the game is only 23 minutes total.

Posted by: blasmaic | February 17, 2011 3:03 AM | Report abuse

My guess is that it got confused in its metadata search that Toronto is in Ontario and there is an airport in California called Ontario. The logic engine probably didn't compute that they were two different containers.

The most important thing to take away from this is not that the computer was "smarter" than the humans, but rather that its language processing engine could understand Jeopardy clues. This stunt was intended to test Watson's ability to decipher complex language that uses slang and metaphors.

The one thing I would like to know: I think you have to wager after you get the category but before you get the clue in Final Jeopardy, so how did it compute to only wager $947?

Posted by: akmzrazor | February 17, 2011 7:03 AM | Report abuse

competition,jeopardy.(www.twitter.com/hlmelsaid)

Posted by: hlmelsaid761 | February 17, 2011 8:21 AM | Report abuse

Based on many comments here it is clear that most viewers have no understanding of what it took to create Watson or the technology used. You should all view this week's NOVA to learn about the significant advancement in technology that was achieved to create Watson.

Posted by: cmpgm | February 17, 2011 9:16 AM | Report abuse

Watson is a breakthrough in Artificial Intelligence in the field of natural language processing. Basically, for it to understand a question is amazing, given all the little nuances of Jeopardy questions. Then, to get an answer and buzz in before Ken Jennings and Brad Rutter can get an answer in their heads is phenomenal.

This is actually the upgraded model from IBM's Deep Blue back in the day, which beat Kasparov in '97. Now, even the most basic chess programs (like the one on my computer) can play at an almost 2200-2400 level. It's amazing what IBM has done.

Posted by: vk5u | February 17, 2011 9:45 AM | Report abuse

You could see the other contestants trying to buzz in because they knew the answers, but they couldn't buzz as fast as Watson. The computer isn't "smarter", it's just faster. Still, a very impressive feat of artificial intelligence.

Posted by: randygrenier | February 17, 2011 9:48 AM | Report abuse

The biggest problem for me was that they displayed Watson's answers/potential answers on the screen. This ruined the whole play-along-at-home aspect.

Viewers did not get to vicariously play against Watson themselves and instead were left passively watching two humans rake in a half-million dollars in combined second and third place prizes for being trounced.

Did anyone notice that Watson got credit for a couple answers that were not technically correct--for instance, answering "Maxwell's Silver Hammer" when the correct answer should have been "Maxwell"? Rip-off!

Also, I was a little embarrassed for him when he repeatedly failed to correctly pronounce "Etude Brute".

Posted by: writinron | February 17, 2011 9:57 AM | Report abuse

That NOVA program was presented as having to do with development of artificial intelligence. Isn't it a mistake to confuse the ability to retain and retrieve a huge number of trivia with intelligence?

Posted by: svato | February 17, 2011 10:08 AM | Report abuse

Disclosure: I am an IBM employee and I have discussed these points with two of Watson's creators.

>

Posted by: nan_lynn | February 16, 2011 6:53 PM | Report abuse

This is a key point. Ken and Brad clearly looked frustrated because Watson often beat them to the buzz. At the same time, Watson will not buzz unless it is confident it can answer correctly. It takes some time for it to arrive at its best answer. Neither Brad nor Ken felt Watson had an unfair advantage, and they've both had a lot of success beating other Jeopardy players to the buzz. Also, great Jeopardy players like Brad and Ken don't wait for the light. They time their buzz based on the cadence of Alex Trebek's reading of the question. Watson will only buzz after the light comes on, but it is fast and consistent.

Some categories, for example the Actors Who Direct, were questions that could be read quickly. That gave Brad and Ken an advantage because they could come up with the answers quicker than Watson.

and lasting benefits to humanity than the "Space Race" of the Cold War. And the humans who lost-- God bless 'em-- conducted themselves with a grace and humor that I suspect no "artificial intelligence" will ever understand.>>

Posted by: dab12647 | February 16, 2011 10:48 PM | Report abuse

Well put. Thanks.

College Bowl had some questions that required more than one answer ... such as:

"NAME'S THE SAME: If a 20th century pianist were playing a video game, what is going on?"

"Vladimir de Pachmann is playing Pac-Man."

Watson would have no chance in coming up with this answer because of the dual nature of the query.>>

Posted by: bs2004 | February 16, 2011 10:09 PM | Report abuse

Watson is capable of answering questions that require "more than one hop" to deduce an answer, but you are correct. A question like your example would be *much* more challenging for Watson, giving humans a clear advantage.

To those who disliked the show because it seemed like an IBM infomercial, I agree it was certainly good PR for us. But IBM and the Jeopardy producers agreed that if we didn't explain more about how Watson worked people wouldn't really understand what was going on. Sorry if that put some of you off. On the other hand, if Watson interests you, I recommend watching the Nova episode about it.

Posted by: gophercrow | February 17, 2011 10:31 AM | Report abuse

I stand corrected - Ken Jennings' op ed piece in the NY Daily News makes it clear that he did think that buzzer speed was an advantage for Watson, though he goes on to describe the same example of the Actors Who Direct category where he and Brad were able to answer more quickly.

http://www.nydailynews.com/opinions/2011/02/17/2011-02-17_ken_jennings_exclusive_oped_jeopardy_champ_says_computer_nemesis_watson_had_unfa.html?page=1

Posted by: gophercrow | February 17, 2011 10:51 AM | Report abuse

That NOVA program was presented as having to do with development of artificial intelligence. Isn't it a mistake to confuse the ability to retain and retrieve a huge number of trivia with intelligence?

Posted by: svato | February 17, 2011 10:08 AM | Report abuse

The key point about Watson isn't that it stores and retrieves information. It's that Watson can quickly look through a huge amount of unstructured information (not a database, but plain text), and find the connections it needs to answer specific questions with a high degree of accuracy.

A good comparison test: the next time you watch Jeopardy!, type in an entire "answer" into your favorite search engine - don't do any analysis of the question on your own, type it with no changes - then go to the first web page that comes back. Is the correct response anywhere on that page? You'll find that's unlikely, and even if it is you have to be able to pick it out. Watson isn't just doing a search. It's answering questions.

Posted by: gophercrow | February 17, 2011 10:58 AM | Report abuse

I thought the whole thing was kind of bogus - the experiment did not show the computer was really better at anything except buzzing in first. Which, yeah, it's a computer, it should be able to do that pretty quick once the ? is asked. It's not like the 2 dudes didn't know the answers, they just couldn't buzz in.

Posted by: ballgame | February 17, 2011 11:22 AM | Report abuse

Does Watson know it exists? And if not, when will it?

Posted by: StevefromCOBOL | February 17, 2011 11:39 AM | Report abuse

Time out!

It seems like (nearly) everyone thinks that the hardest part of artificial intelligence is coming up with the right reply. Well, that IS hard, but it is (maybe not quite) trivial compared to getting a computer to understand what the CLUE means. The clues contained puns, slang, and abbreviations, things even human contestants mess up on sometimes, like when somebody gives the name of a book when the clue was asking for the name of the author.

My hat's off to the folks who worked so hard and cleverly to get a bunch of wires and melted sand to understand the tricky clues and THEN to figure out the right reply. It's a "people win" all the way. When your pig wins the blue ribbon at the county fair it's YOU who gets the credit and takes the ribbon home.

NOTE to nan_lynn who said

"... the contest clues and categories have been specifically designed to eliminate items that Watson is just not capable of understanding successfully, i.e., those involving certain kinds of word play and/or inferences."

Not really; the only categories that were excluded were the audio and video categories in which the contestant has to hear and recognize music or to see and understand a picture.

Posted by: TruthTold1 | February 17, 2011 12:02 PM | Report abuse

I understand that Watson has decided to change its name to Colossus.

Posted by: Ghak | February 17, 2011 1:18 PM | Report abuse

I watched part of the charade. Watson or any computer will beat any human with straight-on facts. It is when it is trying to tie "loose" facts together, i.e., think, that it is still a loser to humanity. Forget getting Toronto as the Final Jeopardy answer wrong. It had to take two disparate facts and tie them together. I got Chicago only after hitting Midway as the battle and not knowing who O'Hare was. But the cities are limited when you have two major airports. I knew NYC didn't count and LA was removed so that pretty much left Chicago. I will wager that in its knowledge banks, Watson has the fact that O'Hare was named after a WWII hero and that Midway was a WWII battle but it has no basis, i.e., thought processes, to put the two together for the proper answer.

No matter how much Watson's AI is touted, he don't got it, yet. Most likely because we do not understand the processes that allow us to connect these seemingly disparate facts into a cognitive answer. Watson, at the moment, is a fancy trivia machine and not a thinker. At least the Jeopardy players are trivia experts that can think.

Posted by: optodoc | February 17, 2011 1:21 PM | Report abuse

Oh, God, not even more product placement.

Posted by: SarahBB | February 17, 2011 2:24 PM | Report abuse

"I understand that Watson has decided to change its name to Colossus"

I'm sorry Dave, I'm afraid I can't do that.

Posted by: BEEPEE | February 17, 2011 3:00 PM | Report abuse


© 2011 The Washington Post Company