I am a bit sceptical about whether it actually passed the Turing test. To me it looks more like a publicity stunt, for the following reasons:
1) 5 minutes is a short period of time.
2) I don’t believe Turing mentioned anything about 30%. I might be wrong on this one.
3) I don’t know if the judges were properly trained. What questions did they ask? I feel like there must be plenty of questions related to IQ and creativity that a thirteen-year-old could answer with ease but that Eugene Goostman would struggle with. Examples: “Cow is to bull like, bitch is to ….?”, or “Once upon a time there lived a pink unicorn in a big mushroom house with three invisible potatoes. Could you finish the story for me in a creative way and explain why the unicorn ended up painting the potatoes pink?”. The idea with the Turing test is that the computer should be indistinguishable from a human (in this case a 13-year-old non-native English speaker). I don’t believe this criterion has been met until I see a chat transcript with reasonably hard questions.
4) Having the bot pose as a non-native-English-speaking 13-year-old might not be a violation of the rules, but I very much feel like it goes against the spirit of the Turing test. It reminds me a bit of this comic (http://existentialcomics.com/comic/15). But this is beside the point: I don’t even think the bot would pass the Ukrainian-13-year-old-boy Turing test if it were asked reasonably hard questions.
Until I learn more about the proceedings I remain utterly unconvinced that this is the milestone in AI the media portray it to be. It is nonetheless pretty cool!
Well obviously, the unicorn did it to satisfy the ghost of Carl Sagan, who showed up at the unicorn’s house and started insisting that the potatoes weren’t there. Annoyed, she tried throwing flour on the potatoes to convince him, but it turned out the potatoes really were permeable to flour. It was touch and go for a while there, and even the unicorn started to doubt the existence of her invisible potatoes (to say nothing of her invisible garden and invisible scarecrow—but that at least had done an excellent job of keeping the invisible birds away). Eventually, though, it was found that pink paint coated the potatoes just fine, and so Carl happily went back to his post co-haunting the Pioneer 10 probe. The whole affair turned out to be a boon for the unicorn, as the pink paint put a stop to a previously unfalsifiable dragon, who had been eating her potatoes (or so she suspected—she had never been able to prove it). The dragon, for his part, simply went back to his old habit of terrorizing philosophers’ thought experiments.
Nice try, chatterbot.
I see what you did there.
The test was in fact as Turing specified. In addition to 30% being the challenge, as Stuart pointed out, Turing specified 5 minutes and an “average interrogator”.
The more interesting point here, I think, is the discovery (not very surprising by now) that a program that can pass the true Turing Test is still narrow AI, not applicable to many other things.
The 30% quote is legit:
“I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.”
http://loebner.net/Prizef/TuringArticle.html
This was a prediction Turing made, not how the test was defined.
We can’t do it in 10^9 bits, though. Of course that’s just nitpicking.
Maybe with the best compression we can? But yeah, that’s not the main goal.
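For scale, Turing’s figure of about 10^9 binary digits is tiny by modern standards — the arithmetic works out to roughly 120 MiB (the MiB conversion here is just an illustrative sanity check, not part of the original thread):

```python
# Turing's 1950 prediction assumed a storage capacity of about 10^9
# binary digits. Converting that to modern units:
bits = 10**9
num_bytes = bits // 8        # 125,000,000 bytes
mib = num_bytes / 2**20      # mebibytes (1 MiB = 2^20 bytes)
print(f"{num_bytes:,} bytes ~= {mib:.1f} MiB")
```

So the nitpick stands: today’s chatterbots use far more than that uncompressed, though aggressive compression of a model this small is at least conceivable.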
It’s Kevin Warwick, it’s completely a publicity stunt.
Exactly. I was pretty disappointed to see this posted on LW.
I mean, successfully imitating a 4Chan user would technically pass. (I wrote that piece after one of Warwick’s Turing test press releases six years ago.)
I feel like if any program does nearly that well, the judges aren’t cheating enough. They should be picking things they know the computer is bad at, like drawing something with ASCII art and asking what it is, or having it talk to a bot and seeing if the conversation goes anywhere.
If all you do is talk, then all it shows is that the computer is good at running a conversation. Maybe that was just never something that took a lot of intelligence in the first place.