I basically agree with the intended point that general intelligence in a compute-limited world is necessarily complicated (and I think a lot of people are way too invested in trying to simplify the brain down to the complexity of physics). But I do think you are overselling the similarities between deep learning and the brain, and in particular underselling the challenge of actually updating the model. Unlike current AIs, humans update their weights continuously, at least daily, for their whole lives: there is no training cutoff date after which the model stops being updated, and in practice human weight updates almost certainly have to happen all the time, without a training/test separation. Current AIs do update their weights, but only during a training run that lasts a couple of months, after which the weights are frozen and served to customers.
(For those in the know, this is basically what people mean when they talk about continual learning.)
So while there are real similarities, there are also differences.
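To make the contrast concrete, here's a minimal toy sketch (numpy, with an invented drifting regression task; none of this is any lab's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_batch(n, drift):
    """Toy regression task whose true weights drift over time,
    standing in for a world that keeps changing."""
    X = rng.normal(size=(n, 3))
    true_w = np.array([1.0, -2.0, 0.5]) + drift
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

def sgd_step(w, X, y, lr=0.05):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

# Regime 1: train for a while, then freeze (current deployed models).
w_frozen = np.zeros(3)
X, y = make_batch(1000, drift=0.0)
for _ in range(200):
    w_frozen = sgd_step(w_frozen, X, y)
# After the "training cutoff" these weights are served unchanged.

# Regime 2: continual learning (closer to the brain): keep updating
# on each day's data, with no cutoff and no final freeze.
w_online = np.zeros(3)
for day in range(100):
    X, y = make_batch(32, drift=0.01 * day)
    w_online = sgd_step(w_online, X, y)

# Evaluate both on "today's" drifted world.
X_now, y_now = make_batch(500, drift=0.01 * 99)
print("frozen model MSE:", np.mean((X_now @ w_frozen - y_now) ** 2))
print("online model MSE:", np.mean((X_now @ w_online - y_now) ** 2))
```

In this toy, the frozen weights go stale as the task drifts past their cutoff, while the continually updated learner keeps tracking it.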
Because most of the world is actually complication. This is another thing Alan Kay talks about: the complexity curve versus the complication curve. If you have physics brain, you model the world as mostly fundamental complexity with low Kolmogorov complexity, and you expect some kind of hyperefficient Solomonoff induction procedure to work on it. But if you have biology brain or history brain, you realize that the complication curve of the outcomes implied by the rules of the cellular automaton that is our reality is vastly, vastly bigger than the fundamental underlying complexity of those basic rules.
Another way to put this, if you're skeptical: the actual program size of the universe is not just the Standard Model. It is the Standard Model plus the gigantic seed state after the Big Bang. If you think of it like that, you realize the size of this program is huge. And so it's not surprising that the model you need to capture it is huge, and that this model quickly becomes very difficult to interpret due to its complexity.
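A toy version of this point, using Rule 110 as a stand-in for "the rules" (an illustration only, no claim that physics is an elementary cellular automaton):

```python
import numpy as np

rng = np.random.default_rng(42)

rule = 110
rule_table = np.array([(rule >> i) & 1 for i in range(8)])  # the "laws": 8 bits

state = rng.integers(0, 2, size=200)                        # the "seed": 200 bits
history = [state]
for _ in range(100):
    left, right = np.roll(state, 1), np.roll(state, -1)
    state = rule_table[4 * left + 2 * state + right]
    history.append(state)

# The rule alone is 8 bits. The program that reproduces this exact
# history is rule + seed (~208 bits), and the history itself (a
# 101 x 200 grid of complicated structure) looks far richer than either.
print(np.array(history).shape)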
I would slightly change this and say that if you can't brute-force simulate the universe from its fundamental laws, you must take the seed into account; but otherwise it's a very good point that goes unheeded by a lot of people. (The change doesn't matter for AI capabilities in the next 50-100 years, and it also doesn't matter for AI alignment with p(0.9999999), but it does matter from a long-term perspective on the future/longtermism.)
Re: no human training/test separation:
Epistemic status: random thought I just had, but what if there kind of is one? I think maybe dreaming is the "test" part of the training cycle: the newly updated weights are run against outcome predictions supplied by parts of the system not currently being updated. The part being updated tries to get desirable outcomes within the dream, while another network/region plays Dungeon Master, supplying the scenario and the outcomes of given actions. In other words: testing against synthetic test data, supplied by a partially adversarial network.
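Rendered as code, the hypothesis looks something like the sketch below (all names invented; this is the speculation above as a toy program, not an established model of sleep):

```python
import numpy as np

rng = np.random.default_rng(0)

class DungeonMaster:
    """Stands in for the parts of the system NOT being updated: it
    supplies dream scenarios and the outcomes of the dreamer's actions.
    Its weights stay frozen for the whole 'night'."""
    def __init__(self):
        self.w = rng.normal(size=4)

    def scenario(self):
        return rng.normal(size=4)

    def outcome(self, scenario, action):
        # Higher reward when the action matches the DM's hidden mapping.
        return -abs(action - scenario @ self.w)

class Dreamer:
    """The part being updated: acts inside the dream and adjusts its
    weights based on the outcomes the DM reports."""
    def __init__(self):
        self.w = np.zeros(4)

    def act(self, scenario):
        return scenario @ self.w

    def update(self, scenario, reward, judge):
        # Toy hill-climbing update: keep a random perturbation of the
        # weights if it would have scored better in this scenario.
        candidate = self.w + rng.normal(scale=0.05, size=4)
        if judge(scenario, scenario @ candidate) > reward:
            self.w = candidate

dm, dreamer = DungeonMaster(), Dreamer()
for _ in range(3000):                 # one "night" of synthetic test data
    s = dm.scenario()
    reward = dm.outcome(s, dreamer.act(s))
    dreamer.update(s, reward, dm.outcome)

print("dreamer weights:", np.round(dreamer.w, 2))
print("DM's hidden weights:", np.round(dm.w, 2))
```

After enough synthetic "nights", the dreamer's weights drift toward the DM's hidden ones; the point is just that a frozen subsystem can supply usable test signal for the subsystem being updated.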
I feel like, if true, we'd expect to see some kind of failure to learn-from-sleep in habitual lucid dreamers? Or reduced efficacy, anyway? I wonder what happens in a learning setup that uses test performance to make meta-training decisions, if you hack the test results to erroneously report greater-than-actual performance…? Are there people who do not dream at all (as distinguished from merely not remembering their dreams)?
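The "hacked test results" question can at least be posed concretely in a toy setup: early stopping, a meta-training decision, driven by a test score that lies (everything below is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Few samples, many features: easy to overfit, so early stopping matters.
w_true = np.zeros(30); w_true[:3] = 1.0
X_train = rng.normal(size=(40, 30))
y_train = X_train @ w_true + rng.normal(scale=0.5, size=40)
X_test = rng.normal(size=(500, 30))
y_test = X_test @ w_true + rng.normal(scale=0.5, size=500)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def train(hack_the_test):
    """Gradient descent with early stopping driven by the reported test
    score. If hack_the_test is set, the score always claims improvement,
    so the stopping rule never fires."""
    w, best, best_w, patience = np.zeros(30), np.inf, np.zeros(30), 0
    for _ in range(5000):
        w = w - 0.01 * (2 * X_train.T @ (X_train @ w - y_train) / 40)
        reported = best - 1e-9 if hack_the_test else mse(w, X_test, y_test)
        if reported < best:
            best, best_w, patience = reported, w.copy(), 0
        else:
            patience += 1
        if patience > 50:          # the meta-training decision: stop
            break
    return best_w

for hacked in (False, True):
    w = train(hacked)
    label = "hacked test" if hacked else "honest test"
    print(label, "-> true test MSE:", round(mse(w, X_test, y_test), 3))
```

In the honest run, early stopping picks a checkpoint near the test-error minimum; in the hacked run, training never stops and the returned weights overfit the 40 training samples.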
This model of "what even is a dream, anyway?" makes a lot more predictions/retrodictions than my old model of "dreams are just the qualia of neuronal subpopulations coming back online as one wakes up".