There is no (actionable information about) a preference until it’s defined. What I currently legibly want is certainly not directly about the way I prefer the world to be. The process of defining it is some kind of computation. This computation could be more like the process of living a life, contemplating culture and agency, than a nonperson-prediction performed by an AI. This computation could itself have moral significance similar to that of a living person.
More to the point, defining this process should be within your authority, and shaping it as a process of living a life seems like a good start, giving you time to come up with further ideas and to learn relevant things. It shouldn’t be legitimate to proclaim that your preference is something that significantly disagrees with whatever process of reflection you would define yourself. Thus even a superintelligence would need to predict the outcome of a process you would define yourself, or else the result wouldn’t really be your preference. And predicting the long-term outcomes of living a very long life may be impossible other than by essentially simulating it in detail, in which case making the prediction would itself carry moral significance equivalent to that of living the life more concretely.
If you would like to interact with other people during this process, you get a whole society that needs to be part of the process. At that point it’s unclear whether the abstraction of defining a somewhat comprehensive preference serves any purpose at all, or whether the point of this activity is really the formulation of preference. The salient question becomes how to structure that society well, based on much less detailed considerations.
Sorry, I have to admit I didn’t really understand that.
What I currently legibly want is certainly not directly about the way I prefer the world to be.
What do you mean by this? Do you mean that your preferences are defined in terms of your experiences and not the external world? Or do you mean that you don’t really have coherent object-level preferences about many things, but still have some meta-level preference that is hard to define, or defined by a process, the outcome of which would be hard for an ASI to compute? Or some other thing?
Not disagreeing with anything, just trying to understand.
Whatever I’m currently ready to judge about the way I would like the world to be is not my real preference, because I endorse a process of figuring out a better, more considered judgement whose outcomes are not yet decided (and could well be different from any current judgement). And the process of deciding these outcomes could look much like living a life (or many lives), at least initially, while setting up anything more elaborate. Even a superintelligence probably can’t find useful shortcuts for such a process without breaking the legitimacy of its findings.
I don’t quite see your point. If this is genuinely what you want, the AI would allow that process to unfold.
The thread is about preference, perhaps utility functions, so it’s not about concrete wishes. A utility function is data for consistently comparing events (certain subsets of a sample space) by assigning them an expected utility. Long reflection, then, is a process for producing this data. The process is not something this data is ex ante supposed to be about; rather, it’s the instrumental means of producing a preference that is, in general, about something else.
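As a minimal formalization of that sense of “utility function” (the notation is only illustrative, assuming a probability measure $P$ on a sample space $\Omega$): a utility function assigns each event an expected utility, and events are compared by those values,

$$U : \Omega \to \mathbb{R}, \qquad \mathbb{E}[U \mid E] \;=\; \frac{1}{P(E)} \int_E U \, dP \quad \text{for an event } E \subseteq \Omega \text{ with } P(E) > 0.$$

On this reading, long reflection is whatever process outputs the data $U$ (equivalently, the comparisons between events that $U$ induces), rather than something $U$ needs to be about.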
My argument was making two points: that any immediate wants are not the preference, and that the process defining the preference would itself have moral significance. So there is a circularity to this construction: defining the preference benefits from already having some access to it, in order to better structure the process that defines it. Resolving this circularity is a step that could benefit from the efforts of a superintelligence. But the process still retains the character of lived experience rather than clinical calculation, so speculations about better structures of hypothetical societies remain relevant for defining extrapolated volition.
My argument was making two points: that any immediate wants are not the preference [...]
I don’t think this makes sense.
Right now you’d want the ASI to maximize your preferences, even though those preferences are not yet legible/knowable to the AI (or to yourself). The AI knows this, so it will allow those preferences to develop (taking for granted that this is the only way the AI can learn them without violating other things you currently want, like respecting the moral worth of any simulated entities that an attempt to shortcut the unfolding might create).
Like, right now you have wants that define your preferences (not object-level preferences, but wants that define the process that would lead to your preferences being developed, and the constraints that unfolding needs to be subject to; a rough sketch follows below). If the AI optimizes for this, it will lead to the preferences being optimized for later.
And it will be able to do this, because this is what you currently want, and the premise is that we can get the AI to do what you want.
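To sketch that two-level structure (the symbols here are only illustrative, not a formalism established in the thread): write $w$ for your current wants, which specify a reflection process $\Pi_w$ together with constraints $C_w$ on how it may unfold, and write $U$ for the preference that process would eventually settle on,

$$w \;\longmapsto\; (\Pi_w, C_w), \qquad U \;=\; \mathrm{out}(\Pi_w \mid C_w),$$

where $\mathrm{out}(\Pi_w \mid C_w)$ is the outcome of running $\Pi_w$ subject to $C_w$. An AI that acts on $w$ now, i.e. runs or protects $\Pi_w$ within $C_w$, thereby ends up serving $U$ later, even though $U$ is not yet legible to the AI or to you.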