But this isn’t the source of shouldness we are looking for. Buried deep in the human mind is a legitimate utility function, or at least something like one, which summarizes that human’s terminal values.
No. It’s more that if you extrapolate out the preferences we already have, asking what we would prefer if we had time for our chaotic preferences to resolve themselves, then you end up with a superior sort of shouldness to which our present preferences might well defer. Sort of like if you knew that your future self would be a vegetarian, you might regard your present consumption of meat as an error. But it’s not hidden away as something that already exists. It’s something that could be computed from us, but which we don’t explicitly represent.
Hence “deep in the mind”, not brain: defined in a model, not explicitly represented. Although there is more preference-defining stuff outside the mind (or rather outside the brain...).
To be honest, I wasn’t thinking of the distinction between mind and brain when I wrote that, so Eliezer’s correction is on target. I was visualizing the utility function as something that exists and must be discovered.