You also want one that generalises well, doesn’t make performative predictions, and doesn’t have goals of its own. If your hypotheses aren’t even intended to be reflections of reality, how do we know these properties hold?
Because we have the prediction error bounds.
When we compare theories, we don’t consider the complexity of all the associated approximations and abstractions. We just consider the complexity of the theory itself.
E.g. the theory of evolution isn’t quite code for a costly simulation, but it can be viewed as a set of statements about such a simulation. And the way we compare the theory of evolution to alternatives doesn’t involve comparing the complexity of the set of approximations we used to work out the consequences of each theory.
To respond to your edit: I don’t see your reasoning, and that isn’t my intuition. For moderately complex worlds, it’s easy for the description length of the world to be longer than the description length of many kinds of inductor.
Because we have the prediction error bounds.
Not ones that can rule out any of those things. My understanding is that the bounds are asymptotic or average-case in a way that makes them useless for this purpose. So if a mesa-inductor with a better prior is found first, the outer inductor will stick with the mesa-inductor. And if the mesa-inductor has goals, it can wait as long as it wants before making a false prediction that helps achieve those goals (or just make false predictions about counterfactuals that are unlikely to be chosen).
If I’m wrong then I’d be extremely interested in seeing your reasoning. I’d maybe pay $400 for a post explaining the reasoning behind why prediction error bounds rule out mesa-optimisers in the prior.
The bound is the same one you get for normal Solomonoff induction, except restricted to the set of programs the cut-off induction runs over. It’s a bound on the total expected error, in terms of cross-entropy (CE) loss, that the predictor will ever make, summed over all datapoints.
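Roughly (taking the standard Solomonoff/Hutter form as the assumed shape of the bound, with $\mu$ the true computable environment and $M$ the mixture predictor, or its cut-off analogue over the restricted program set):

$$\sum_{t=1}^{\infty} \mathbb{E}_{x_{<t} \sim \mu}\Big[ D_{\mathrm{KL}}\big(\mu(\cdot \mid x_{<t}) \,\big\|\, M(\cdot \mid x_{<t})\big)\Big] \;\le\; K(\mu)\ln 2.$$

The right-hand side depends only on the complexity of the environment, not on how long the predictor runs, so the total excess CE loss over all time is finite.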
Look at the bound for cut-off induction in that post I linked, maybe? Hutter might also have something on it.
Can also discuss on a call if you like.
Note that this doesn’t work in real life, where the programs are not in fact restricted to outputting bit string predictions and can e.g. try to trick the hardware they’re running on.
Yeah, I know that bound; I’ve seen a very similar one. The problem is that mesa-optimisers also get very low prediction error when averaged over all predictions, so they sit well below the bound. And they can time their deliberately-incorrect predictions carefully, if they want to survive for a long time.
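As an illustrative back-of-the-envelope (my numbers, not anything from the post): if the total excess CE loss is capped by some constant $C \approx K(\mu)\ln 2$, then a predictor that is essentially perfect except on a few chosen datapoints, where it puts probability $\varepsilon$ on the true outcome, pays only about $\ln(1/\varepsilon)$ extra nats per such datapoint. So it can afford roughly

$$N \;\approx\; \frac{C}{\ln(1/\varepsilon)}$$

deliberately bad predictions, placed wherever it likes, while staying under the bound. The bound constrains the total damage, not its timing.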