Indeed, that adds an extra layer of complication.
So this works ∑∫λΔe normal text ∇⇔e not normal text, does it?
Yay! And this is copy-pastable! Have a huge upvote.
I tend to see this as an issue of decision theory, not probability theory. So if causality doesn’t work in a way we can understand, the situation is irrelevant (note that some backwards-running brains will still follow an understandable causality from within themselves, so some backwards-running brains are decision-theory relevant).
>assuming there are a large number of copies of some algorithm A then there is more utility if A has such and such properties.
This is only relevant if this results in a change in algorithm A. E.g., causal decision theory can know that if it were a UDT agent, then it would have more money in the Newcomb problem, but it won’t change itself because of this (if Omega made its decision before the agent existed).
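As a rough illustration of that point, here is a toy sketch (the payoff amounts are hypothetical, and a perfect predictor that fixed its prediction before the agent acts is assumed):

```python
# Toy Newcomb setup: all amounts here are hypothetical, for illustration only.
BOX_B = 1_000_000  # Omega fills box B with this iff it predicted one-boxing
BOX_A = 1_000      # box A always contains this amount

def payoff(action, predicted):
    """Winnings given the agent's action and Omega's already-fixed prediction."""
    b = BOX_B if predicted == "one-box" else 0
    a = BOX_A if action == "two-box" else 0
    return a + b

# Omega predicted each algorithm before the agent existed:
# the CDT agent is predicted to two-box, the UDT agent to one-box.
cdt_money = payoff("two-box", predicted="two-box")  # 1,000
udt_money = payoff("one-box", predicted="one-box")  # 1,000,000

# CDT can see that udt_money > cdt_money, but on its causal model the
# prediction is already fixed, so switching its action gains it nothing now,
# and it does not change its own algorithm.
print(cdt_money, udt_money)
```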
On the second point: I see Boltzmann brains as issues of decision theory, not probability theory, so I’m not worried about probability issues with them.
I have opinions on this kind of reasoning that I will publish later this month (hopefully), around issues of syntax and semantics.
Let V be the hyper-volume where the probability of an M kg BB is exactly exp[−M×10^69]. Let’s imagine a sequence of V’s stretching forward in time. About exp[−10^69] of them will contain one BB of mass 1 kg, and about exp[−2×10^69] will contain a BB of mass 2 kg, which is also the proportion that contains two brains of mass 1 kg.
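A quick sanity check of that last proportion claim, as a minimal log-probability sketch (exp[−10^69] underflows ordinary floats; independent nucleation events are assumed):

```python
# Assumed model: log P(a BB of mass M kg in one volume V) = -M * 10**69.
LOG_P_PER_KG = -1e69

def log_p_single_bb(mass_kg):
    """log-probability that one BB of the given mass nucleates in a given V."""
    return mass_kg * LOG_P_PER_KG

log_p_one_2kg = log_p_single_bb(2)      # proportion of V's with one 2 kg BB: exp(-2e69)
log_p_two_1kg = 2 * log_p_single_bb(1)  # proportion with two independent 1 kg BBs: exp(-2e69)

assert log_p_one_2kg == log_p_two_1kg   # the two proportions match, as claimed
```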
So I think you are correct; most observer-moments will still be in short-lived BBs. But if you are in an area with disproportionately many observer moments, then they are more likely to be in long-lived BBs. I will adjust the post to reflect this.
>changing to Markdown should fix a lot of the problems.
But then I wouldn’t have LaTeX! ^_^
>They may numerically dominate, but additional calculations are needed and seem possible.
Over the far future of the universe (potentially infinite), we inhabit an essentially empty de Sitter space, without gas for thermodynamic BBs (except for gas that is itself created by nucleation).
Ah yes, but if you start assuming that the standard model is wrong and start reasoning from “what kind of reality might be simulating us”, the whole issue gets much, much more complicated. And your priors tend to do all the work in that case.
Hey there, thanks for the heads up—I edited it on the phone, and it went wonky. Have corrected it now.
Let’s try and tease out the disagreement. I mentioned two seemingly valid approaches that would lead to different beliefs for the human, and asked how the AI could choose between them. You then went up a level of meta, to preferences over the deliberative process itself.
But I don’t think the meta preferences are more likely to be consistent—if anything, probably less so. And the meta-meta-preferences are likely to be completely underdefined, except in a few philosophers.
So I see the AI as having to knowingly decide between multiple different possible preferences, meta-preferences, etc… about the whole definition of what corrigibility means. And then imposing those preferences on humans, because it has to impose something.
Doing corrigibility without keeping an eye on the outcome seems, to me, to be similar to many failed AI safety approaches: focusing on the local “this sounds good”, rather than on the global “but it may cause the extinction of sentient life”.
(There is also the side issue that corrigibility involves communicating certain facts to the human, and engaging with them. This may result in the human being manipulated to engage in the exchange in a more corrigible way; if this is true, then some manipulation may be inevitable, so removing it entirely would be impossible.)
This can imply a few things:
- Corrigibility could be underdefined (at least for humans).
- Though we are assuming that neither the AI nor the human looks at the conclusion, this may just result in either a random walk or optimisation pressure from hidden processes inside the definition.
- Therefore it may be better to explicitly take the outcome into account as well.
- And, possibly, we *may* need to care about corrigibility/influence in either case.
Hum, why use a black hole when you could have matter and anti-matter to react directly when needed?
You can set it up in the same way as in the previous example—Petrov has two sets of underdeveloped preferences (say, for human survival and for loyalty) and has not chosen between them, and the AI’s actions will force him to choose one or the other.