The utility of information should almost never be negative

As humans, we find it unpleasant to learn facts we would rather not be true. For example, I would dislike learning that my girlfriend was cheating on me, that a parent had died, or that my bank account had been hacked and I had lost all my savings.

However, this is a consequence of the dodgily designed human brain. We don’t operate with a utility function. Instead, we have separate neural circuitry for wanting and liking things, and behave according to those. If my girlfriend is cheating on me, I may want to know, but I wouldn’t like knowing. In some cases, we’d rather not learn things: if I’m dying in hospital with only a few hours to live, I might rather be ignorant of another friend’s death for the short remainder of my life.

However, a rational being, say an AI, would never prefer not to learn something, except in contrived cases like Omega offering you $100 if you can avoid learning the square of 156 for the next minute.

As far as I understand, an AI with a set of options decides between them using approximately the following algorithm (stated in terms of causal decision theory for simplicity):

“For each option, guess what will happen if you do it, and calculate the average utility. Choose the option with the highest utility.”
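To make that concrete, here is a minimal sketch of that decision rule in Python. The function names and the shape of the inputs are my own invention for illustration, not anything Clippy actually runs; the point is just "average the utility over possible outcomes, pick the max."

```python
# A minimal sketch of the decision rule above (hypothetical names, not a real API).
# outcomes(option) should return a list of (probability, resulting_world) pairs.

def expected_utility(option, outcomes, utility):
    # "Guess what will happen if you do it, and calculate the average utility."
    return sum(p * utility(world) for p, world in outcomes(option))

def choose(options, outcomes, utility):
    # "Choose the option with the highest utility."
    return max(options, key=lambda option: expected_utility(option, outcomes, utility))

# Example: a guaranteed 5 paperclips vs a 50/50 gamble on 12.
options = ["safe", "gamble"]
outcomes = lambda o: [(1.0, 5)] if o == "safe" else [(0.5, 12), (0.5, 0)]
utility = lambda paperclips: paperclips
print(choose(options, outcomes, utility))  # "gamble", since 6 expected > 5
```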

So say Clippy is using that algorithm, with his utility function being utility = number of paperclips in the world.

Now imagine Clippy is on a planet making paperclips. He is considering listening to the Galactic Paperclip News radio broadcast. If he does so, there is a chance he might hear about a disaster leading to the destruction of thousands of paperclips. Would he decide in the following manner?

“If I listen to the radio show, there’s maybe a 10% chance I will learn that 1000 paperclips were destroyed. My expected utility from that decision would be reduced by 100. If I don’t listen, there is no chance that I will learn about the destruction of paperclips, so no utility reduction for me. Therefore, I won’t listen to the broadcast. In fact, I’d pay up to 100 paperclips not to hear it.”

Try to figure out the flaw in that reasoning. It took me a while to spot it, but perhaps I’m just slow.

* thinking space *

For Clippy to believe “If I listen to the radio show, there’s maybe a 10% chance I will learn that 1000 paperclips were destroyed,” he must also believe that there is already a 10% chance that 1000 paperclips have been destroyed. So his expected utility is already reduced by 100, whether he listens or not. If he listens to the radio show, there’s a 90% chance his utility will increase by 100, and a 10% chance it will decrease by 900, relative to his current expectation. On average that nets out to zero, so he would be indifferent to gaining that knowledge.
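To make the arithmetic explicit, here is the same calculation spelled out in Python, using the numbers from the example (the variable names are mine):

```python
p_disaster = 0.1   # Clippy's prior that a disaster has already happened
loss = 1000        # paperclips destroyed if it has

# Expected paperclips already lost, before any decision about listening:
prior_expected_loss = p_disaster * loss                                # 0.1 * 1000 = 100

# If he listens: 90% of the time he learns nothing was destroyed,
# 10% of the time he learns 1000 paperclips were destroyed.
expected_loss_if_listening = (1 - p_disaster) * 0 + p_disaster * loss  # also 100

# Listening leaves his expected utility exactly where it was.
assert prior_expected_loss == expected_loss_if_listening
```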

As humans, we don’t work that way. We don’t constantly feel the weight of thoughts like “people might have died since I last watched the news,” simply because humans don’t handle probability rationally. And as humans who feel things, learning about bad things is unpleasant in itself. If I were dying in my bed, I probably wouldn’t even think to raise my probability that a friend had died just because no-one would have told me if they had. An AI probably would.

Of course, in real life, information has value. Maybe Clippy needs to know about these paperclip-destroying events in order to protect his own paperclips from them, or he needs to keep up with current events to socialise effectively with other paperclip enthusiasts. So on average he would probably gain utility from choosing to listen to the radio broadcast.
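Here is a toy version of that value-of-information point, with numbers I have made up: suppose the disaster, if it is happening, would also destroy 500 of Clippy’s own paperclips unless he spends 100 paperclips shielding them, and the broadcast tells him whether it is happening.

```python
p_disaster = 0.1    # prior probability a paperclip-destroying event is underway
loss_if_hit = 500   # paperclips Clippy loses if it hits him unshielded
shield_cost = 100   # paperclips spent shielding his stockpile (prevents the loss)

# Without listening, Clippy must commit to one action in advance.
expected_loss_no_shield = p_disaster * loss_if_hit   # 0.1 * 500 = 50
expected_loss_shield = shield_cost                   # pays 100 whether or not it hits
best_without_info = min(expected_loss_no_shield, expected_loss_shield)  # 50

# Listening tells him whether the disaster is real, so he shields only when needed.
best_with_info = p_disaster * shield_cost            # 0.1 * 100 = 10

value_of_information = best_without_info - best_with_info  # about 40 paperclips
```

In this made-up setup, listening is worth roughly 40 paperclips on average, precisely because what Clippy hears can change what he does.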

In conclusion: an AI may prefer the world to be in one state rather than another, but it almost always prefers more knowledge about the actual state of the world, even if what it learns isn’t good.