EuanMcLean
Deconfusing Landauer’s Principle
Thanks for the comment, this is indeed an important component! I’ve added a couple of sentences pointing in this direction.
Fixed, thanks!
Sorry for the delay. As both you and TheMcDouglas have mentioned, yea, this relies on $H(C|X) = 0$. The way I’ve worded it above is somewhere between misleading and wrong, so I’ve modified it. Thanks for pointing this out!
Yea, I think you’re hitting on a weird duality between setting and erasing here. I think I agree that setting is more fundamental than erasing. I suppose when talking about energy expenditure of computation, each set bit must be erased in the long run, so they’re interchangeable in that sense.
Big Picture AI Safety: Introduction
What will the first human-level AI look like, and how might things go wrong?
What should AI safety be trying to achieve?
What mistakes has the AI safety movement made?
Interesting question! I’m afraid I didn’t probe the cruxes of those who don’t expect hard takeoff. But my guess is that you’re right: no hard takeoff ~= the most transformative effects happen before recursive self-improvement.
Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
Thanks Felix!
This is indeed a cool and surprising result. I think it strengthens the introspection interpretation, but without a requirement to make a judgement of the reliability of some internal signal (right?), it doesn’t directly address the question of whether there is a discriminator in there.
Thanks James!
One failure mode is that the modification makes the model very dumb in all instances.
Yea, good point. Perhaps an extra condition we’d need to include is that the “difficulty of meta-level questions” should be the same before and after the modification, e.g. the distribution over stuff it’s good at and stuff it’s bad at should be just as complex (not just good at everything or bad at everything) before and after.
Thanks for the feedback Garrett.
This was intended to be more of a technical report than a blog post, meaning I wanted to keep the discussion reasonably rigorous/thorough. That always comes with the downside of being a slog to read, so apologies for that!
I’ll write a shortened version if I find the time!
Two flavors of computational functionalism
Is the mind a program?
Especially if it’s something as non-committal as “this mechanism could maybe matter”. Does that really invalidate the neuron doctrine?
I agree each of the “mechanisms that maybe matter” is tenuous by itself; the argument I’m trying to make here is hits-based. There are so many mechanisms that maybe matter that the chance of at least one of them mattering in a relevant way is quite high.
Yes, perfect causal closure is technically impossible, so it comes in degrees. My argument is that the degree of causal closure of possible abstractions in the brain is less than one might naively expect.
Are there any measures of approximate simulation that you think are useful here?
I am yet to read this but I expect it will be very relevant! https://arxiv.org/abs/2402.09090
The statement I’m arguing against is:
Practical CF: A simulation of a human brain on a classical computer, capturing the dynamics of the brain on some coarse-grained level of abstraction, that can run on a computer small and light enough to fit on the surface of Earth, with the simulation running at the same speed as base reality, would cause the conscious experience of that brain.
i.e., the same conscious experience as that brain. I titled this “is the mind a program” rather than “can the mind be approximated by a program”.
Whether or not a simulation can have consciousness at all is a broader discussion I’m saving for later in the sequence, and is relevant to a weaker version of CF.
I’ll edit to make this more clear.
Fixed, thanks!