What I currently legibly want is certainly not directly about the way I prefer the world to be.
Sorry, I have to admit I didn’t really understand that.
What do you mean by this? Do you mean that your preferences are defined in terms of your experiences and not the external world? Or do you mean that you don’t really have coherent object-level preferences about many things, but still have some meta-level preference that is hard to define, or one that is defined by a process whose outcome would be hard for an ASI to compute? Or some other thing?
Not disagreeing with anything, just trying to understand.
Whatever I’m currently ready to judge about the way I would like the world to be is not my real preference, because I endorse a process of figuring out a better, more considered judgement whose outcomes are not yet decided (and could well differ from any current judgement). The process of deciding those outcomes could look much like living a life (or many lives), at least initially, while anything more elaborate is being set up. Even a superintelligence probably can’t find useful shortcuts for such a process without breaking the legitimacy of its findings.
I don’t quite see your point. If this is genuinely what you want, the AI would allow that process to unfold.
The thread is about preference, perhaps utility functions, so it’s not about concrete wishes. A utility function is data for consistently comparing events (certain subsets of a sample space) by assigning each of them an expected utility. Long reflection is then a process for producing this data. The process is not what this data is ex ante supposed to be about; rather, it is the instrumental means of producing the preference, which is in general about something else.
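To make the comparison structure concrete, here is a standard expected-utility sketch (my own illustration, not something specified in the thread): given a sample space Omega with probability measure P and a utility function u over outcomes, each event E gets a conditional expected utility, and the preference ordering over events is just the ordering these numbers induce.

```latex
U(E) \;=\; \mathbb{E}[\,u \mid E\,]
      \;=\; \frac{\sum_{\omega \in E} u(\omega)\, P(\omega)}{P(E)},
\qquad
E_1 \succeq E_2 \;\iff\; U(E_1) \ge U(E_2).
```

On this reading, the data that long reflection would produce is the pair (u, P), or just the induced ordering over events; the reflection process itself is merely the means of producing it.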
My argument was making two points: that any immediate wants are not the preference, and that the process defining the preference would itself have moral significance. So there is a circularity to this construction: defining the preference benefits from already having some access to it, in order to better structure the process that defines it. Resolving this circularity is a step that could benefit from the efforts of a superintelligence. But the process still retains the character of lived experience rather than clinical calculation, so speculations about better structures for hypothetical societies remain relevant for defining extrapolated volition.
I don’t think this makes sense.
Right now you’d want the ASI to maximize your preferences, even though those preferences are not yet legible or knowable to the AI (or to yourself). The AI knows this, so it will allow those preferences to develop (granting that this is the only way the AI can learn them without violating other things you currently want, such as the moral worth of any simulated entities that an attempt to shortcut the unfolding might create).
Like, right now you have wants that define your preferences (not object-level wants, but ones that define the process that would lead to your preferences being developed, and the constraints that unfolding needs to be subject to). If the AI optimizes for this, it will lead to the preferences themselves being optimized for later.
And it will be able to do this, because this is what you currently want, and the premise is that we can get the AI to do what you want.