The first one did seem pretty central to me.
Why, then, have we survived until now? Probably because our local environment has been stable, at least on the timescale relevant to biological evolution and civilization development, and because of anthropic selection.
Now apply a combination of two principles. The anthropic principle tells us that we necessarily find ourselves in conditions compatible with our existence, so we should not be surprised that our local environment is friendly. The Copernican principle tells us that we are not in a special or privileged location in the space of possibilities; we should expect to be typical among observers.
Combining these yields a conclusion: we should expect to live in a minimally friendly universe, not a maximally or even an average-friendly one. The anthropic principle guarantees that our universe clears the bar for producing observers. The Copernican principle says we are typical among observer-containing universes. Since there are vastly more ways for a universe to barely clear the bar than to be friendly for humans on all levels and in all parts of the configuration space, typical means “barely clearing the bar.” We should expect our universe to be friendly enough to produce us and not much more than that.
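To make the "more ways to barely clear the bar" step concrete, here is a toy sketch. It is not from the post: it assumes friendliness is a single number drawn from an exponential distribution (so friendlier universes are progressively rarer), and the names `sample_excess_friendliness`, `rate`, and `bar` are made up for illustration.

```python
import random


def sample_excess_friendliness(rate=1.0, bar=3.0, n_universes=200_000, seed=0):
    """Return how far above the bar each observer-containing universe lands.

    Toy assumption (not from the post): friendliness is exponentially
    distributed, so friendlier universes are progressively rarer.
    """
    rng = random.Random(seed)
    excess = []
    for _ in range(n_universes):
        friendliness = rng.expovariate(rate)
        # Anthropic filter: only universes that clear the bar contain observers.
        if friendliness >= bar:
            excess.append(friendliness - bar)
    return excess


if __name__ == "__main__":
    excess = sample_excess_friendliness()
    # Copernican step: look at what is typical among the survivors.
    near_bar = sum(e < 1.0 for e in excess) / len(excess)
    print(f"observer-containing universes: {len(excess)}")
    print(f"fraction within 1 unit of the bar: {near_bar:.2f}")
```

Under this assumed prior, most surviving universes sit within one unit of the bar (analytically, 1 − e^(−1) ≈ 0.63 of them when `rate` is 1). The conclusion depends entirely on the prior falling off with friendliness; a prior that does not would give a different answer.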
Like, the first thing the Lethal Reality hypothesis has to explain is “why are we alive at all?”, and the argument given is anthropics.
Do you think longer-horizon RL can teach them this?
LLMs have some in-context learning abilities, but I think for most of what they do (like solving IMO problems or writing one-off programs), they can get by mostly relying on the knowledge in their weights.
But as RL trajectories get longer, there’s more and more pressure on the model to learn things over the course of a single rollout.