Is it your position that, if an Earth-sized planet had never been hit by a meteor before, but now the smart alien inhabitants see in their telescopes that a Moon-sized meteor is heading at high speed straight for the planet, they should say P(extinction) is unknowable?
If so, that’s kinda crazy, right? They have (let us assume) excellent knowledge of physics, geology, and so on. They can apply that scientific knowledge to figure out what happens when giant high-speed projectiles hit planets, even if they haven’t seen that happen before. Yes it’s hard, and yes they might get it wrong in practice, but you seem to be making a stronger statement that it’s unknowable in principle, right?
Back to the AI case, I would say strong evidence is common, and we can just look around us and learn an enormous amount about algorithms, about geopolitics, about economics, about everything else. There is way more than enough information in the everyday world today for an ideal reasoner to wash away all of their priors with any bearing on the question of AI doom. AI doom is not an isolated question. I mean, just go actually look at the people arguing for and against AI doom. What do they say? You’ll notice that they are making arguments based on underlying models of how the world works (algorithms, economics, etc.), and then the other person questions those models based on other aspects of the world, and so on. I think it’s highly implausible to say that a theoretical ideal reasoner with infinite patience and care would never be able to get to the bottom of this discourse, based only on presently-available information. I mean, as an existence proof, a theoretical ideal reasoner with infinite patience and care would be able to do an atom-by-atom simulation “in their head” of a million different civilizations building superintelligence, right?
Hm, no, I wouldn’t say P(extinction) is unknowable in that case, and I explicitly agree with this notion in the essay; it’s the first example I use when discussing canonical probability. The smart aliens might be facing an unprecedented event, but not an unidentified generative mechanism. The mechanisms behind meteor impacts (gravity, orbital mechanics, material physics) are already observable, by smart aliens or by humans, in countless other phenomena. So when new data arrives, it cleanly discriminates between models, and the impact parameter is point-identified.
AI doom/utopia is not like that; it is structurally different. The disagreements aren’t about the parameter values of an agreed-upon mechanism; they’re a battle between different causal ontologies. The generative mechanisms themselves are the thing in dispute, which leads to your second point.
You say “There is way more than enough information in the everyday world today for an ideal reasoner to wash away all of their priors with any bearing on the question of AI doom.” But Bayesianism requires more than just lots of data; it requires discriminating data. If hypothesis A (doom via deceptive alignment) and hypothesis B (safety via institutional equilibrium) both perfectly predict the exact same present-day prefix of algorithms and geopolitics, then the likelihood ratio between them is exactly 1. The frameworks may predict the prefix with different accuracy on some dimensions, but successful mechanistic models are modular: each side can adjust peripheral components to fit the data without touching its core claim. Once both composite models predict the observed prefix identically, the likelihood ratio on the catastrophic parameter is effectively 1, and Bayes’ theorem says your posterior is identical to your prior.
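To make the arithmetic concrete, here is a minimal sketch of Bayes’ rule in odds form, with hypothetical numbers for the two frameworks. The only thing that matters is that both assign the same probability to the observed prefix:

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio

# Hypothetical numbers: both frameworks assign the same probability to the
# observed prefix of present-day evidence, so the likelihood ratio is 1.
p_prefix_given_doom = 0.8
p_prefix_given_safety = 0.8
lr = p_prefix_given_doom / p_prefix_given_safety  # = 1.0

prior = 3.0  # hypothetical prior odds of doom vs. safety
posterior = posterior_odds(prior, lr)
# posterior == prior: the data, however abundant, moved nothing
```

No matter how large the observed prefix grows, a likelihood ratio of 1 leaves the prior untouched.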
I agree that people debate everyday data to argue for their models, but in the essay this is exactly what I formalize as Differential Screening: when new data causes frameworks to update in different directions, they have misaligned causal joints, and everyday data does not force convergence in the limit. That is the point I was making in showing that it forces a bounded-increment random walk. To be clear: more data doesn’t automatically wash away priors when the directional derivative of that data is exactly what is under dispute.
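Here is a toy simulation of that random-walk picture, with assumed parameters: each datum nudges credence by a bounded increment, and (since the frameworks disagree about which way the datum points) the direction is modeled as a fair coin flip. Different evidence histories wander to different credences instead of converging:

```python
import random

def credence_walk(n_data, increment=0.01, seed=0):
    """Toy model of Differential Screening: each new datum nudges credence by
    a bounded increment whose direction depends on which framework's causal
    joints it happens to align with (modeled here as a fair coin)."""
    random.seed(seed)
    c = 0.5
    for _ in range(n_data):
        c += increment if random.random() < 0.5 else -increment
        c = min(max(c, 0.0), 1.0)  # keep the credence a valid probability
    return c

# Different evidence histories (seeds) end at different credences; more data
# spreads the endpoints out rather than forcing them to a common limit.
endpoints = [credence_walk(1_000, seed=s) for s in range(5)]
```

This is only an illustration of the dynamic, not the essay’s formal result; the point is that bounded increments with disputed direction give no asymptote of agreement.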
Finally, the atom-by-atom simulator conflates infinite compute with infinite discriminating data. The whole scaffolding of the essay already grants the ideal reasoner infinite compute.
An ideal Bayesian agent (like Solomonoff induction) doesn’t know the true causal transition matrix of the universe. Instead, it weights a mixture of all computable universes that perfectly output the data prefix observed so far. Infinite compute won’t let you magically manufacture missing discriminating data; it only lets the reasoner calculate its prior with infinite precision.
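A minimal sketch of why, with two hypothetical “computable universes” standing in for the mixture: both reproduce every bit of the observed prefix, so conditioning on the prefix cannot shift their relative weights, even though they diverge wildly about the future.

```python
# Two toy "computable universes" (both hypothetical) that agree on every bit
# of the observed prefix but diverge on the unobserved future.
def universe_doom(t):
    return 0 if t < 100 else 1  # benign history, then the catastrophe bit flips

def universe_safe(t):
    return 0                    # benign history, benign forever

observed_prefix = [0] * 50      # everything the reasoner has seen so far

def consistent_with(universe, prefix):
    """A universe survives conditioning iff it reproduces the prefix exactly."""
    return all(universe(t) == bit for t, bit in enumerate(prefix))

# Both survive conditioning, so the observed data cannot separate them;
# only their (prior) description complexities distinguish them.
both_survive = (consistent_with(universe_doom, observed_prefix)
                and consistent_with(universe_safe, observed_prefix))
```

However much compute the reasoner throws at the mixture, every prefix-consistent universe keeps its relative weight until some datum actually discriminates between them.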
Even if we grant a miracle: say your reasoner is literally Laplace’s Demon. It perfectly simulates the exact positions of all the atoms at some year in the future. Now what? Is that arrangement of atoms “doom”?
Again, my point is that AI risk and other unprecedented catastrophes like it are trapped by the information-geometry problem itself. The best solution I can think of is to establish agreed-upon precursors with aligned causal joints. Otherwise, the ideal reasoner will eventually realize that the event predicate requires solving the halting problem and will output what the essay predicts: a syntax error.