I’m really glad that this post is addressing the disjunctivity of AI doom, as my impression is that it is more of a crux than any of the reasons in https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities.
Still, I feel like this post doesn’t give a good argument for disjunctivity. To show that a scenario with no outside view is likely, it takes more than just describing a model which is internally disjunctive. There needs to be some reason to strongly expect that no external variables could cause the model not to apply.
Some examples of these, in addition to the competence of humanity, are that deep learning could hit a wall for decades, Moore’s Law could come to a halt, some anti-tech regulation could cripple AI research, or alignment could turn out to be easy (which itself contains several disjunctive possibilities). I haven’t thought about these, and don’t claim that any of them are likely, but the possibility of these or other unknown factors invalidating the model prevents me from updating to a very high P(doom). Some of this comes from it just being a toy model, but adding more detail to the model isn’t enough to notably reduce the possibility of the model being wrong from unconsidered factors.
A statement I’m very confident in is that no perpetual motion machines will be developed in the next century. I could make some disjunctive list of potential failure modes a perpetual motion machine could encounter, and thus conclude that their development is unlikely, but this wouldn’t describe the actual reason a perpetual motion machine is unlikely. The actual reason is that I’m aware of certain laws of physics which prevent any perpetual motion machines from working, including ones with mechanisms wildly beyond my imagination. The outside view is another tool I can use to be very confident: I’m very confident that the next flight I take won’t crash, not because of my model of planes, but because any crash scenario with non-negligible probability would have caused some of the millions of commercial flights every year to crash, and that hasn’t happened. Avoiding AGI doom is not physically impossible and there is no outside view against it, and without some similarly compelling reason I can’t see how very high P(doom) can be justified.
It’s really a class of algorithms, depending on how your opponent bargains, such that if the fair bargain (by your standard of fairness) gives X utility to you and Y utility to your partner, then you refuse to accept any other solution which gives your partner at least Y utility in expectation. So if they give you a take-it-or-leave-it offer which gives you positive utility and them Y’ > Y utility, then you accept it with probability Y/Y’ - ϵ, such that their expected value from giving you that offer is Y - ϵ. If they have a different standard of fairness which gives you X’ utility and them Y’ utility but also use Adabarian bargaining, then you should agree to a bargain which gives you X’ - ϵ utility and them Y - ϵ utility (this is always possible via randomizing over their bargaining solution, your bargaining solution, and not trading, so long as all the bargaining solutions give positive utility to everyone).
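To make the take-it-or-leave-it case concrete, here is a minimal Python sketch of the acceptance rule described above. The function names, the default `eps`, and the choice to always accept offers that are no greedier than the fair split are my own illustrative assumptions, not part of any standard implementation. Note that with acceptance probability Y/Y’ - ϵ, the proposer’s expected payoff works out to Y - ϵ·Y’, so "Y - ϵ" should be read loosely as "slightly less than Y".

```python
import random


def acceptance_probability(fair_payoff_them, offered_payoff_them, eps=0.01):
    """Probability of accepting a take-it-or-leave-it offer.

    fair_payoff_them: Y, what the proposer gets under *my* fair split (assumed > 0).
    offered_payoff_them: Y', what the proposer gets under their offer.
    Assumes the offer gives me positive utility; otherwise I would just refuse.
    """
    if offered_payoff_them <= fair_payoff_them:
        # Assumption: an offer no greedier than my fair split is simply accepted.
        return 1.0
    # Greedy offer: accept with probability Y/Y' - eps, clipped at zero.
    return max(0.0, fair_payoff_them / offered_payoff_them - eps)


def proposer_expected_payoff(fair_payoff_them, offered_payoff_them, eps=0.01):
    """Expected utility the proposer gets from making this offer."""
    p = acceptance_probability(fair_payoff_them, offered_payoff_them, eps)
    return offered_payoff_them * p


def respond(fair_payoff_them, offered_payoff_them, eps=0.01, rng=random):
    """Randomized accept/reject decision for a single offer."""
    p = acceptance_probability(fair_payoff_them, offered_payoff_them, eps)
    return rng.random() < p
```

For example, if the fair split gives the proposer 6 and they demand 10, the offer is accepted with probability about 0.59, and their expected payoff is about 5.9, which is less than the 6 they would get by offering the fair split. That gap is what removes the incentive to make greedy offers.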
Sorry, that should actually be Pareto bargaining solution, which is just a solution which ends up on the Pareto frontier. "In the Pareto World, Liars Prosper" is a good explainer, and https://www.jstor.org/stable/1914235 shows a general result that every bargaining solution which is invulnerable to strategic dishonesty is equivalent to a lottery over dictatorships (where one person gets to choose their ideal solution) and duple methods (where the possible outcomes are restricted to a set of two).
I agree with this, but also it would be pretty great to have a legal system which would work if people in power didn’t abuse their authority; I don’t think any current legal system even has that. Designing methods robust to strategic manipulation is an important part of the problem, but not the only part, and I don’t think it’s unreasonable to focus on other parts, especially since there are a lot of scenarios where approximating your partner’s utility function is possible. In particular, if monetary value can be assigned to everything being bargained over, then approximating utility as money is usually reasonable.