After the first day, I would say the machine has a definite advantage when it comes to pushing the button. On many questions where any Jeopardy! player would know the answer without hesitation, Watson wins the race to the button. So what they are demonstrating is a mechanical advantage. That is not much of a technological breakthrough.
There have also been some interesting flubs on the part of the programmers.
First, the logistical problem. Watson gets its information by text, plus an electronic signal telling it when it can answer, and apparently hasn’t been designed to interpret speech. In one case it gave a wrong answer right after another contestant had given that same wrong answer. It simply didn’t know that had happened. It did know that it could try to answer (probably part of the whole signal button activation message), but it didn’t know that it was repeating a wrong answer. That one’s not a big deal. They focused on language interpretation, which makes this a problem (we’ll see how serious), but it falls outside the stated goals. It’s also impractical to surmount, because it would require someone texting the other contestants’ responses to Watson, and that wasn’t part of the plan.
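To make that concrete, here is a minimal sketch of what knowing about the earlier wrong answer would buy. Everything in it is my assumption, not anything known about Watson's internals: the `pick_response` function, the list of (response, weight) pairs, and the idea that rejected responses arrive as plain text are all hypothetical.

```python
def pick_response(candidates, already_given=()):
    """Return the highest-weighted candidate not already ruled out, or None.

    candidates: list of (response_text, weight) pairs (hypothetical format).
    already_given: responses other contestants gave that were judged wrong.
    """
    # Normalize so that "Answer A" and "answer a" count as the same response.
    ruled_out = {r.strip().lower() for r in already_given}
    remaining = [
        (text, weight)
        for text, weight in candidates
        if text.strip().lower() not in ruled_out
    ]
    if not remaining:
        return None
    # Pick the candidate with the largest weight.
    return max(remaining, key=lambda pair: pair[1])[0]


# Made-up candidates and weights, purely for illustration.
candidates = [("answer A", 0.6), ("answer B", 0.3), ("answer C", 0.1)]
first_choice = pick_response(candidates)
second_choice = pick_response(candidates, already_given=["Answer A"])
```

With no feedback the top-weighted response wins every time, which is the repeat we saw. With the rejected response passed in, the same ranking falls through to the next candidate.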
The other issues look more like programming problems. On more than one occasion Watson gave responses that were obviously wrong because it did not evaluate the entire answer (the clue as read to the contestants). The response Watson gave seemed to be based on an analysis of the first part of the answer that ignored the end. That’s either because of a prioritization based on length or punctuation, or it’s a deliberate shortcut. Either way it causes problems.
The two cases I can think of showed that Watson ignored or did not recognize crucial words in the answer. One was a question about trains, where Watson’s response was ‘finis’ when the correct response was ‘terminal’. Watson evaluated the language reference in the answer but missed the association with trains. I don’t know whether that case, and the other one that escapes me at the moment, indicates a slip in textual analysis or a bug in the instructions.
I think I noticed one strategy: when it has the chance, Watson selects categories that have not been chosen yet. It seems to need a sample answer to evaluate the category. That’s not unlike human contestants, but I don’t remember ever seeing a human deliberately bounce around the board to do it.
So, there have been some interesting things, but I think Watson’s lead can be attributed to speed on the button rather than an advantage in ‘skill’. The graphic showing the possible responses Watson considered, and the weighting it gave each one, is very interesting.
The fact that Watson knows Sauron but not Voldemort I will attribute to the age and preferred reading of the programmers – though I could be wrong. Maybe it was just the wording. After all, I know both of those.