# Some problems with making induction benign, and approaches to them

The universal prior is malign. I’ll talk about a sequence of problems that cause it to be malign, and possible solutions.

(this post came out of conversations with Scott, Critch, and Paul)

Here’s the basic setup of how an inductor might be used. At some point humans use the universal prior to make an important prediction that influences our actions. Say that humans construct an inductor, give it a sequence of bits from some sensor, and ask it to predict the next bits. Those next bits are actually (in a way inferable from the sensor bits) going to be generated by a human in a sealed room who thinks for a year. After the human thinks for a year, the bits are fed into the inductor. This way, the humans can use the inductor to predict ahead of time what the human who thinks for a year will say. (This probably isn’t the best way to use an inductor, but it serves as a good demonstration.)
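To make the setup concrete, here is a minimal sketch of a Bayesian mixture predictor over a small, finite hypothesis class. This is only a stand-in: actual Solomonoff induction mixes over all programs and is uncomputable. The function name `predict_next_bit` and the two toy hypotheses are assumptions made up for illustration.

```python
from fractions import Fraction

def predict_next_bit(hypotheses, prefix):
    """Return the mixture probability that the next bit is 1.

    hypotheses: list of (prior, model) pairs, where model(bits_so_far)
    returns P(next bit = 1). Both are hypothetical toy objects, not the
    universal prior.
    """
    posteriors = []
    for prior, model in hypotheses:
        likelihood = Fraction(1)
        # Score each hypothesis on how well it predicted the observed prefix.
        for i, bit in enumerate(prefix):
            p1 = model(prefix[:i])
            likelihood *= p1 if bit == 1 else 1 - p1
        posteriors.append(prior * likelihood)
    total = sum(posteriors)
    # Average the hypotheses' next-bit predictions by posterior weight.
    return sum(
        post * model(prefix) for post, (_, model) in zip(posteriors, hypotheses)
    ) / total

# Two toy hypotheses with equal prior weight.
mostly_ones = lambda bits: Fraction(9, 10)
fair_coin = lambda bits: Fraction(1, 2)
hypotheses = [(Fraction(1, 2), mostly_ones), (Fraction(1, 2), fair_coin)]

print(predict_next_bit(hypotheses, [1, 1, 1]))
```

In the setup above, the prefix plays the role of the sensor bits, and the prediction is what the humans act on before the human in the room has finished thinking.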

# The anthropic update is a problem for Solomonoff induction

The original problem Paul talks about in his post is that consequentialist agents can make an “anthropic update” on the fact that the string is fed into a Solomonoff inductor with a specific prior, while Solomonoff induction can’t make this update on its own. Interestingly, this problem with Solomonoff induction is also a problem with sufficiently large convolutional neural networks (capable of simulating, e.g., Conway’s Game of Life).

Possibly, this could be solved by inventing a naturalized variant of Solomonoff induction that makes this anthropic update.

# Simulation warfare is a problem for naturalized induction

Suppose alien civilizations devote some percentage of their resources to (a) coming up with plausible values for the prefix and (b) constructing naturalized inductors that are fed that prefix followed by bits chosen such that the aliens would be able to take over the humans’ system if these bits were the prediction of the humans’ inductor. (For example, these bits could be a manipulative message.)

Under this supposition, a large percentage of naturalized inductors in the universe are of this form. In particular, if the aliens are good at guessing the prefix, then a large percentage of naturalized inductors that are fed a string starting with the prefix will be of this form. So the naturalized inductor used by the humans will, under reasonable anthropic theories, predict a message chosen by some alien civilization with high probability.
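The counting argument can be sketched as a toy calculation. The numbers below are entirely made up, and the function `alien_fraction` is a hypothetical illustration, assuming an anthropic theory that simply counts inductors fed the same prefix.

```python
def alien_fraction(n_natural, n_alien, guess_accuracy):
    """Fraction of inductors fed the humans' prefix that are alien-built.

    All parameters are hypothetical illustration values:
    n_natural: inductors built by civilizations like the humans that see this prefix
    n_alien: alien-built inductors aimed at this kind of civilization
    guess_accuracy: fraction of alien-built inductors whose guessed prefix matches
    """
    matching_alien = n_alien * guess_accuracy
    return matching_alien / (matching_alien + n_natural)

# One natural inductor versus 1000 alien-built ones, 1% of which guess the prefix.
print(alien_fraction(n_natural=1, n_alien=1000, guess_accuracy=0.01))
```

Even with a low guess accuracy, the alien-built inductors dominate as long as there are enough of them relative to the natural ones.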

# Simulations can be replaced with zoos

Plausibly, there’s some way to tell that the aliens’ simulations of humans are “not real”. But what if the “simulations” are actually real physical planets? Since there are many galaxies, it would not be hard to use some of them to create “zoos” similar to planets that life naturally evolves on. The aliens can deploy nanobots in the zoos to influence what the human in the box says. This way, on most physical Earths, the naturalized inductor will be fed bits chosen by aliens when these bits are used to make an important decision.

# Does importance-weighting rescue naturalized induction?

Maybe we should somehow distinguish “natural” Earths from “zoo” Earths. One way they are distinguished is that an individual “natural” Earth will have more influence over the universe than an individual “zoo” Earth. So maybe there is a sense in which we can make an importance-weighted naturalized inductor, which assigns higher weight to naturalized inductors used to make decisions that influence a large part of the universe.
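As a rough sketch of what importance-weighting might look like, the toy calculation below compares counting each Earth equally with weighting each Earth by its influence. The influence numbers and the function `zoo_share` are assumptions for illustration only; how to actually define or measure influence is left open here.

```python
def zoo_share(earths, weighted):
    """Share of total weight held by zoo Earths.

    earths: list of (is_zoo, influence) pairs, where influence is a
    hypothetical stand-in for how much of the universe that Earth's
    decision affects.
    weighted: if False, every Earth counts equally.
    """
    weights = [(is_zoo, influence if weighted else 1.0) for is_zoo, influence in earths]
    total = sum(w for _, w in weights)
    return sum(w for is_zoo, w in weights if is_zoo) / total

# One natural Earth with large influence, 1000 zoo Earths with small influence.
earths = [(False, 1e6)] + [(True, 1.0)] * 1000

print(zoo_share(earths, weighted=False))  # zoos dominate by count
print(zoo_share(earths, weighted=True))   # zoos are a small share by influence
```

Under these made-up numbers, the zoos hold almost all of the weight when Earths are counted equally and very little of it when weighted by influence.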