The Teacup Test

“I want to get into AI alignment,” said Xenophon.

“Why?” said Socrates.

“Because an AGI is going to destroy humanity if we don’t stop it,” said Xenophon.

“What’s an AGI?” said Socrates.

“An artificial general intelligence,” said Xenophon.

“I understand the words ‘artificial’ and ‘general’. But what is an ‘intelligence’?” said Socrates.

“An intelligence is a kind of optimizer,” said Xenophon.

“Like my teacup,” said Socrates. He took a sip of iced tea.

“What?” said Xenophon.

“My teacup. It keeps my tea cold. My teacup optimizes the world according to my wishes,” said Socrates.

“That’s not what I mean at all. An optimizer cannot do just one thing. There must be an element of choice. The optimizer must take different actions under different circumstances,” said Xenophon.

“Like my teacup,” said Socrates.

“Now it is summer. But in the winter my teacup keeps my tea hot. Do you see? My teacup does have a choice. Somehow it knows to keep hot things hot and cold things cold,” said Socrates.

“A teacup isn’t intelligent,” said Xenophon.

“Why not?” said Socrates. He savored another sip.

“Because the teacup is totally passive. An intelligence must act on its environment,” said Xenophon.

“Far away, in the Levant, there are yogis who sit on lotus thrones. They do nothing, for which they are revered as gods,” said Socrates.

“An intelligence doesn’t have to act. It just needs the choice,” said Xenophon.

“Then it is impossible to tell if something is intelligent solely based on its actions,” said Socrates.

“I don’t follow,” said Xenophon.

Socrates picked up a rock. “This intelligent rock chooses to never do anything,” said Socrates.

“That’s ridiculous,” said Xenophon.

“I agree,” said Socrates. “Hence why I am so confused by the word ‘intelligent’.”

“No intelligence would choose to do nothing. If you put it in a box surely any intelligent being would attempt to escape,” said Xenophon.

“Yogis willingly box themselves in boxes called ‘monasteries’,” said Socrates.

“I see what you’re getting at. This is a case of the belief-value uncertainty principle. It’s impossible to tell from their actions whether yogis are good at doing nothing (their value function is to do nothing) or bad at doing things (their value function is to act and they are just very stupid),” said Xenophon.

Socrates nodded.

“Since it is impossible to deduce whether something is intelligent based solely on its external behavior, intelligence cannot be an external property of an object. Intelligence must be an internal characteristic,” said Xenophon.

Socrates waited.

“An intelligence is something that optimizes its external environment according to an internal model of the world,” said Xenophon.

“My teacup’s model of the world is the teacup’s own temperature,” said Socrates.

“But there’s no value function,” said Xenophon.

“Sure there is. ‘Absolute distance from room temperature’ is the function my teacup optimizes,” said Socrates.

“Your teacup is too passive,” said Xenophon.

“‘Passivity’ was not part of your definition. But never mind that. Suppose I built a machine with artificial skin that felt the temperature of the cup and added ice to cold cups and lit a fire under hot cups. Surely such a machine would be intelligent,” said Socrates.

“Not at all! You just programmed the machine to do what you want. It’s all hard-coded,” said Xenophon.

“So whether something is intelligent doesn’t depend on what’s inside of it. Intelligence has to do with whether something was designed. If the gods carefully designed human beings to do their bidding then we human beings would not be intelligent,” said Socrates.

“That’s not what I meant at all!” said Xenophon.

“Then what did you mean?” said Socrates.

“Let’s start over, tabooing both the words ‘intelligent’ and ‘optimizer’. A ‘Bayesian agent’ is something that creates a probability distribution over world models based on its sensory inputs and then takes an action according to a value function,” said Xenophon.

“That doesn’t pass the teacup test. Under that definition my teacup qualifies as a ‘Bayesian agent,’” said Socrates.

“Oh, right,” said Xenophon. “How about ‘Systems that would adapt their policy if their actions would influence the world in a different way’?”

“Teacup test,” said Socrates.

“So are you saying the entire field of AI alignment is bunk because intelligence isn’t a meaningful concept?” said Xenophon.

“Maybe. Bye!” said Socrates.

“No! Wait! That was a joke!” said Xenophon.