General intelligence test: no domains of stupidity

There’s been a productive conversation on my post criticising the Turing test. I claimed that I wouldn’t take the Turing test as definitive evidence of general intelligence if the agent had been specifically optimised for that test. I was challenged as to whether I had a different definition of thinking than “able to pass the Turing test”. As a consequence of that exchange, I think I do.

Truly general intelligence is impossible, because of various “no free lunch” theorems, which demonstrate that no algorithm can perform well in every environment (intuitively, this makes sense: a smarter being could always design an environment that specifically penalises a particular algorithm). Nevertheless, we have an intuitive definition of a general intelligence as one that performs well in most (or almost all) environments.
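For reference, here is a paraphrase of the optimisation form of the result (Wolpert and Macready, 1997): for any two black-box search algorithms $a_1$ and $a_2$, and any number of evaluations $m$,

$$\sum_{f} P(d^y_m \mid f, m, a_1) = \sum_{f} P(d^y_m \mid f, m, a_2),$$

where the sum ranges over all objective functions $f: X \to Y$ on finite sets, and $d^y_m$ is the sequence of cost values the algorithm has observed after $m$ evaluations. Averaged over every possible environment, no algorithm outperforms any other; good performance in some environments is necessarily paid for elsewhere.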

I’d like to reverse that definition, and define a general intelligence as one that doesn’t perform stupidly in a novel environment. A small change of emphasis, but it gets to the heart of what the Turing test is meant to do, and why I questioned it. The point of the Turing test is to catch the (putative) AGI performing stupidly. Since we can’t test the AGI on every environment, the idea is to make the Turing test as general as possible in potential. If you give me the questions in advance, I can certainly craft an algorithm that aces that test; similarly, given the test in advance, you could construct a narrow system that aces that particular Turing test. But since the space of reasonable conversations is combinatorially huge, and since the judge could pick any element of it, the AGI could not get by with a narrow list of responses: it would have to be genuinely generally intelligent, so that it would not end up being stupid in the particular conversation it was in.
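To get a sense of scale, a back-of-envelope count (the vocabulary size and conversation length here are illustrative assumptions, not measurements): a 200-word exchange drawn from a working vocabulary of $10^4$ words has

$$(10^4)^{200} = 10^{800}$$

possible word sequences. Even if only one sequence in $10^{700}$ were a coherent conversation, $10^{100}$ would remain: far too many for any precomputed list of responses to cover.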

That’s the theory, anyway. But maybe the space of conversations isn’t as vast as all that, especially if the AGI has some simple classification algorithms. Maybe the data on the internet today, combined with some reasonably cunning algorithms, can carry a conversation as well as a human. After all, we are generating examples of conversations by the millions every hour of every day.

Which is why I emphasised testing from outside the AGI’s domain of competence. You need to introduce it to a novel environment, and give it the possibility of being stupid. If the space of human conversations isn’t large enough, you need to move to the much larger space of real-world problem solving, and pick something from it. It doesn’t matter what it is, only that you have the potential of picking anything; hence only a general intelligence could be confident, in advance, of coping with it. That’s why I emphasised not announcing what your test was going to be, and changing the rules or outright cheating: the fewer restrictions you place on the potential test, the more informative the actual test is.

A related question, of course, is whether humans are generally intelligent. Well, humans are stupid in a lot of domains. Human groups augmented by data and computing technology, and given enough time, are much more generally intelligent than individual humans. So general intelligence is a matter of degree, not a binary classification (though it might be nearly binary for some AGI designs). Thus whether you call humans generally intelligent is a matter of taste and emphasis.