SHRDLU, understanding, anthropomorphisation and hindsight bias

EDIT: Since I didn’t make it sufficiently clear, the point of this post was to illustrate how the GOFAI people could have got so much wrong and yet still be confident in their beliefs, by looking at what the results of one experiment—SHRDLU—must have felt like to those developers at the time. The post is partially to help avoid hindsight bias: it was not obvious that they were going wrong at the time.

SHRDLU was an early natural language understanding computer program, developed by Terry Winograd at MIT in 1968–1970. The program moved objects around a simulated blocks world and could respond to instructions on how to do so. It caused great optimism in AI research, giving the impression that a solution to natural language parsing and understanding was just around the corner. Symbolic manipulation seemed poised to finally deliver a proper AI.

Before dismissing this confidence as hopelessly naive (which it wasn’t) and completely incorrect (which it was), take a look at some of the output SHRDLU produced when instructed to act within its simulated world:

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: By “it”, I assume you mean the block which is taller than the one I am holding.
Computer: OK.
Person: What does the box contain?
Computer: The blue pyramid and the blue block.
Person: What is the pyramid supported by?
Computer: The box.
Person: How many blocks are not in the box?
Computer: Four of them.
Person: Is at least one of them narrower than the one which I told you to pick up?
Computer: Yes, the red cube.

It’s hard not to project an intelligence into those responses. Humans are the main producers of language, so we’re probably primed to anthropomorphise anything that speaks to us this way. Add to that SHRDLU’s seeming ability to cope with some level of ambiguity (“Is at least one of them...”: one of what?) and its very human turns of phrase (“By ‘it’, I assume you mean...”, “I don’t understand...”), and you can see why naive outsiders could believe that SHRDLU might show genuine understanding. It would be natural to give SHRDLU the benefit of the doubt and assume it was far more complex than it really was.
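As an aside on how little machinery such dialogue can require: the sketch below is a toy illustration in Python, not SHRDLU’s actual implementation (which was a far more elaborate Lisp and Micro-Planner program); every object name, relation and rule here is invented. A dictionary of facts about a blocks world, plus a few lookup procedures, is enough to answer questions like “What does the box contain?” in fluent-sounding English, with nothing we would call understanding behind it.

```python
# Toy blocks-world sketch (illustrative only; not SHRDLU's code).
# The "world" is just a table of facts: which object rests on which support.

supports = {
    "blue pyramid": "box",
    "blue block": "box",
    "red cube": "table",
    "big red block": "table",
}

holding = None  # what the simulated hand is currently grasping


def pick_up(obj):
    """Respond to 'Pick up X' by checking the fact table."""
    global holding
    if obj not in supports:
        return "I don't understand which " + obj.split()[-1] + " you mean."
    holding = obj
    return "OK."


def contents_of(container):
    """Respond to 'What does X contain?' by scanning the facts."""
    inside = [o for o, s in supports.items() if s == container]
    return " and ".join(inside).capitalize() + "." if inside else "Nothing."


def supported_by(obj):
    """Respond to 'What is X supported by?' with a direct lookup."""
    return supports.get(obj, "I don't know that object.")


print(pick_up("big red block"))      # -> OK.
print(pick_up("pyramid"))            # -> I don't understand which pyramid you mean.
print(contents_of("box"))            # -> Blue pyramid and blue block.
print(supported_by("blue pyramid"))  # -> box
```

The human-sounding phrasing is hard-coded into the response strings; the “reasoning” is a dictionary lookup.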

We can also see why smart insiders might believe that SHRDLU showed something like understanding. Despite Dreyfus’s critique of AI, many AI researchers hadn’t yet grasped how badly symbolic manipulation systems would fail in ambiguous real-world situations. To them, SHRDLU’s performance served as confirmation, not that SHRDLU was very complicated (they knew well how complex it was), but that understanding wasn’t that complicated (since SHRDLU seemed to demonstrate that).

I would posit that this wasn’t an unreasonable belief at the time. They had a product (AI) that had demonstrated impressive-seeming performance in controlled tests (a proof of concept, if you will), and that they hoped to develop for more general use. Indeed, how do we know that SHRDLU failed to show some form of true understanding? Mainly because the approach failed in more general situations. Had symbolic AI gone on to succeed at passing Turing tests, for example, we would probably have concluded that “SHRDLU was an early example of an AI with understanding.”

But of course, at the time, researchers didn’t have the crucial piece of information: “your field will soon become a dead end where genuine AI is concerned.” So their belief in SHRDLU and in the whole symbolic approach was not unreasonable, though, like everyone, they were overconfident.