You’re talking about an ontological crisis, though Arbital has a slightly different term for it. Naturally, people have started to work on the problem, and MIRI believes it can be solved.
It also seems like the exact same issue arises with satisficing, and you’ve hidden this fact by talking about “some abstract notion of good and bad” without explaining how the AGI will relate this notion to the world (or distribution) that it thinks it exists in.
If it can’t be solved, how will MIRI know?
Satisficers can use credit assignment strategies of various kinds without always falling prey to wireheading, because they are not trying to maximise credit. I’m interested in more flexible schemes that can paradigm-shift, but the basic idea from Reinforcement Learning seems sound.
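A minimal toy sketch of the distinction (purely illustrative; the action names and numbers are made up): an agent that maximises estimated credit always takes the reward-tampering option, while a satisficer that accepts anything over an aspiration level has no systematic pressure towards it.

```python
import random

# Purely illustrative toy: a one-step choice where "tamper" stands for
# seizing control of the credit signal (wireheading).
ACTIONS = {
    "make_paperclips": 6.0,  # legitimate task, moderate credit
    "tidy_warehouse": 7.5,   # legitimate task, higher credit
    "tamper": 10.0,          # wireheading: max out the signal directly
}

def maximiser(estimates):
    """A credit maximiser always picks the highest-credit action."""
    return max(estimates, key=estimates.get)

def satisficer(estimates, aspiration=7.0):
    """A satisficer picks any action whose estimated credit meets the
    aspiration level, in no preferred order; it only falls back to
    maximising if nothing is good enough."""
    good_enough = [a for a, v in estimates.items() if v >= aspiration]
    return random.choice(good_enough) if good_enough else maximiser(estimates)

print("maximiser picks:", maximiser(ACTIONS))    # always 'tamper'
print("satisficer picks:", satisficer(ACTIONS))  # 'tidy_warehouse' or 'tamper'
```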
For one, they wouldn’t find a single example of a solution. They wouldn’t see any fscking human beings maintaining any goal not defined in terms of their own perceptions (e.g. making others happy, having a historical artifact, or visiting a place where some event actually happened) despite changing their understanding of our world’s fundamental reality.
If I try to interpret the rest of your response charitably, it looks like you’re saying the AGI can have goals wholly defined in terms of perception, because it can avoid wireheading via satisficing. That seems incompatible with what you said before, which again invoked “some abstract notion of good and bad” rather than sensory data. So I have to wonder if you understand anything I’m saying, or if you’re conflating ontological crises with some less important “paradigm shift”—something, at least, that you have made no case for caring about.
Fscking humans aren’t examples of maximisers whose coherent ontologies change in a way that guarantees their goals will still be followed. They’re examples of systems in which multiple different languages for describing the world exist simultaneously. Those languages sometimes come into conflict, and people sometimes go insane. People are generally only maximiser-ish in certain domains, and that maximisation is not constant over their lifetimes.
You can hard-code some sensory data to mean an abstract notion of good or bad, if you know you have a helpful human around to supply that sensory data and keep that meaning.
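To gesture at that concretely (a made-up sketch, not a concrete proposal; the channel names are my own):

```python
from dataclasses import dataclass

# Made-up sketch: the agent's abstract "good/bad" notion is hard-coded to
# one designated sensory channel that a helpful human keeps supplying.

@dataclass
class Percept:
    camera: str           # ordinary sensory data about the world
    approval_button: int  # +1 or -1, pressed by the human overseer

def goodness(p: Percept) -> int:
    """Hard-coded rule: 'good' just means the approval channel reads +1.
    The rule only tracks the abstract notion for as long as the human
    actually keeps supplying that channel with that meaning."""
    return p.approval_button

print(goodness(Percept(camera="warehouse tidy", approval_button=+1)))  # 1, i.e. "good"
```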
Paradigm shifts are ontological crises limited to the language used to describe a domain. You can go read about them on Wikipedia if you want and make up your own mind about whether they are important.