Note that a pointer state is the state of the pointer of a measuring device; this is where the name comes from. For example, in the case of Schrödinger’s cat, one can construct a device that indicates whether the cat is alive or dead, which ensures objectivity even in the absence of a human observer.
Moreover, such devices can rely on different measurable signals: an electroencephalogram, a cardiogram, the cat’s heat production, the amount of CO₂ it exhales, and so on. A classical device that would display the superposition |alive⟩ + |dead⟩ cannot be constructed; therefore, such a superposition is not a pointer state. Human sensory organs are themselves such devices, as is the environment surrounding the cat: EEG and ECG signals generate electromagnetic radiation in the environment, heat production raises its temperature, and CO₂ emission increases the ambient CO₂ concentration.
The mere existence of such “devices” already makes pointer states objective, because any number of observers can look at the pointers!
Can good and evil be pointer states? If they can, then they would be objective characteristics, understood in the same way by both humans and AI, and the alignment problem would already be solved!
If you only have unitary evolution, you end up with superpositions of the form
|system state 1⟩ |pointer state 1⟩ + |system state 2⟩ |pointer state 2⟩ + … + small cross-terms
Are you proposing that we ignore all but one branch of this superposition?
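To make the form of that superposition concrete, here is a minimal sketch of a two-level system coupled to a two-level pointer. It is only a toy: the CNOT-style coupling, the function names `premeasure` and `branch_weights`, and the equal amplitudes are illustrative assumptions, and there is no environment in it, so the small cross-terms are not modeled at all.

```python
import math

def premeasure(a, b):
    """Couple the system state a|0> + b|1> to a pointer that starts in |0>.

    Amplitudes are ordered as |system, pointer>: [00, 01, 10, 11].
    The coupling is a CNOT with the system as control (an assumed toy model).
    """
    state = [a, 0.0, b, 0.0]  # (a|0> + b|1>) tensor |pointer 0>
    # CNOT flips the pointer only when the system is |1>: swap 10 <-> 11.
    state[2], state[3] = state[3], state[2]
    return state

def branch_weights(state):
    """Squared modulus of each amplitude."""
    return [abs(amp) ** 2 for amp in state]

a = b = 1 / math.sqrt(2)
final = premeasure(a, b)
weights = branch_weights(final)
print(final)    # amplitudes for |00>, |01>, |10>, |11>
print(weights)  # only the correlated terms |00> and |11> carry weight
```

The unitary step leaves both correlated terms in place with their original amplitudes; nothing in it selects one branch, which is exactly what the question above is pressing on.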
My favorite point of view on the origins of Born’s rule is the following. The final state is a superposition, but we are all inside it.
And since these two states are orthogonal, state |1⟩ does not see state |2⟩, and vice versa; God only knows.
The works by Zurek (https://arxiv.org/pdf/1807.02092) and the more recent one (https://arxiv.org/html/2209.08621v6) shed more light on this.
Here one has to be very careful with the proof of such a multiverse picture. As usual, we replace the observed averaging over outcomes of experiments repeated in time in our world with the squared modulus of the (normalized) amplitude, interpreted as the probability of our world. This effectively means averaging over an ensemble of parallel worlds, whose number since the birth of the universe may be infinite.
The explanatory idea is there, but even in the 2025 paper it still looks underdeveloped. I don’t understand this very well, so I can’t give more details.
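For what it is worth, the identification I am worried about can at least be stated as a small numerical sketch. The function name `simulate_frequencies` and the amplitudes 0.6 and 0.8i are hypothetical choices of mine, not taken from either paper. Note that the sampling step assumes the Born rule from the start, so this only illustrates the two quantities being equated (time-averaged frequencies versus squared moduli); it does not derive anything.

```python
import random

def simulate_frequencies(amplitudes, n_trials, seed=0):
    """Compare squared moduli with frequencies over repeated trials.

    Outcomes are drawn with probability |amplitude|^2, i.e. the Born rule
    is put in by hand (an assumption, not a derivation).
    """
    rng = random.Random(seed)
    probs = [abs(a) ** 2 for a in amplitudes]
    total = sum(probs)
    probs = [p / total for p in probs]  # normalize the amplitudes' weights
    counts = [0] * len(probs)
    for _ in range(n_trials):
        outcome = rng.choices(range(len(probs)), weights=probs)[0]
        counts[outcome] += 1
    freqs = [c / n_trials for c in counts]
    return probs, freqs

probs, freqs = simulate_frequencies([0.6, 0.8j], n_trials=100_000)
print(probs)  # squared moduli of the normalized amplitudes
print(freqs)  # observed frequencies over trials repeated "in time"
```

The frequencies converge on the squared moduli only because the sampler was told to use them; the open question is why an observer inside one branch should expect the same thing.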
This would appear to be just saying that if we can build a classical detector of good and evil, good and evil are objective in the classical sense.