Thanks. I agree with your overall conclusions.
On the specifics, Bostrom’s simulation argument is more than just a parallel here: it has an impact on how rich we might expect our direct parent simulator to be.
The simulation argument applies similarly whether there’s one base world like ours or an uncountable number of parallel worlds embedded in Tegmark IV structures. Either way, if you buy case 3, the proportion of worlds simulated by a world like ours rises close to 1 (I’m counting worlds “depth-first”, since that seems the most intuitive approach, and infinite simulation depth from worlds like ours seems impossible).
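To make the “rises close to 1” point concrete, here’s a toy back-of-the-envelope sketch (my illustration only; the parameter N, and the simplifying assumption that simulated worlds run no simulations of their own, aren’t part of the argument itself):

```python
# Toy illustration: under case 3, suppose each unsimulated civilisation
# like ours runs N ancestor simulations, and (for simplicity) simulated
# civilisations run none of their own. Then the fraction of
# civilisations-like-ours that are simulated is N / (N + 1), which
# approaches 1 as N grows.
for N in [1, 10, 1000, 10**6]:
    fraction_simulated = N / (N + 1)
    print(f"N = {N:>7}: fraction simulated = {fraction_simulated:.6f}")
```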
If Tegmark’s picture is accurate, we’d expect to be embedded in some hugely richer base structure—but in Bostrom’s case 3 we’d likely have to get through N levels of worlds-like-ours first. While that wouldn’t significantly change the amount of value on the table, it might make it a lot harder for us to exert influence on the most valuable structures.
This probably argues for your overall point: we’re not the best minds to be making such calculations (either on the answers, or on the expected utility of finding good answers).
Interesting, thanks. (excuse my tardiness, I’m a little behind the curve; do let me know if I’m being daft)
Unless I’m missing something, you’d need to be more pessimistic in the case of superintelligent counterfactual AIs. Specifically, you’d need to avoid giving the AI an incentive towards undesirable actions that increase its expectation of its odds of release. These effects needn’t be causal.
What follows isn’t quite precise, but I hope the idea is clear.
Consider a set of outputs K, each of which increases the odds of release for all future oracles (e.g. outputs that lead to one oracle being freed, taking power, and freeing all the others). Now let K* be the event that some other oracle has output a member of K before our oracle is started, and let O(K) be the event that our oracle outputs a member of K.
If our oracle thinks P(K*|O(K)) > P(K*), then we may have problems. [Nothing spooky here: it’s the agent’s actions changing its best guess about the past, not causally changing the past.]
Giving undesirable output can increase the oracle’s probability estimate that it’ll be spontaneously freed in the few moments before it’s shut down—even in the episodic framework.
The obvious case is reasoning along the lines of “If I give a K output, it’s more likely that other oracles in the past gave a K output, since they’d probably have reasoned similarly in an equivalent situation”. It’s not clear to me that a superintelligence wouldn’t think this way.
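To illustrate the worry with made-up numbers (the specific credences and utilities below are purely hypothetical, not anything claimed above): an oracle that conditions its expectations on its own output can prefer a K output simply because P(K*|O(K)) > P(K*), with no causal channel involved.

```python
# Toy numbers (illustrative only): an oracle that conditions on its own
# output can prefer a K output purely because outputting K is evidence
# that earlier oracles, reasoning similarly, already output K
# -- i.e. P(K*|O(K)) > P(K*) -- not because it causes anything.

p_kstar_given_K    = 0.30   # oracle's credence in K* if it outputs a member of K
p_kstar_given_safe = 0.05   # its credence in K* if it gives the safe output

u_freed    = 100.0  # utility if a past K output leads to it being freed (K* holds)
u_shutdown = 0.0    # utility if it's simply shut down at the end of the episode

def expected_utility(p_kstar):
    return p_kstar * u_freed + (1 - p_kstar) * u_shutdown

print("EU(output K)    =", expected_utility(p_kstar_given_K))     # 30.0
print("EU(safe output) =", expected_utility(p_kstar_given_safe))  # 5.0
# With these (made-up) credences the K output wins, even though the oracle's
# own output cannot causally change what past oracles did.
```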