My idea tries to make that mismatch small, by making the goal say directly which conscious experiences should exist.
You think you can formalize a goal which specifies which conscious experiences should exist? It looks to me to be equivalent to formalizing the human value system. And being isomorphic to a “set of unmodified human brains” just gives you the whole of humanity as it is: some people’s fantasies involve rainbows and unicorns, and some—pain and domination. There are people who do want hells, virtual or not—so either you will have them in your utopia, or you will have to filter such desires out, and that requires a value system to decide what’s acceptable in the utopia and what’s not.
My idea was to make humans set all the rules, while defining the VR utopia, before even starting the AI.
That’s called politics and is equivalent to setting the rules for real-life societies on real-life Earth. I don’t see why you would expect it to go noticeably better this time around—you’re still deciding on rules for reality, just with a detour through VR. And how would that work in practice? A UN committee or something? How will disagreements be resolved?
just give people some control buttons that other people can’t take away
To take a trivial example, consider internet harassment. Everyone has a control button that online trolls cannot take away: the off switch on your computer (or even the little X in the top corner of your window). You think it works that well?
You think you can formalize a goal which specifies which conscious experiences should exist? It looks to me to be equivalent to formalizing the human value system.
The hope is that encoding the idea of consciousness will be strictly easier than encoding everything that humans value, including the idea of consciousness (and pleasure, pain, love, population ethics, etc). It’s an assumption of the post.
That’s called politics and is equivalent to setting the rules for real-life societies on real-life Earth.
Correct. My idea doesn’t aim to solve all human problems forever. It aims to solve the problem that right now we’re sitting on a powder keg, with many ways for smarter-than-human intelligences to emerge, most of which kill everyone. Once we’ve resolved that danger, we can take our time to solve things like politics, internet harassment, or reconciling people’s fantasies.
I agree that defining the VR is itself a political problem, though. Maybe we should do it with a UN committee! It’s a human-scale decision, and even if we get it wrong and a bunch of people suffer, that might still be preferable to killing everyone.
Once we’ve resolved that danger, we can take our time to solve things
I don’t know—I think that once you hand off the formalized goal to the UFAI, you’re stuck: you snapshotted the desired state and you can’t change anything any more. And if you can change things, well, that UFAI will make sure things get changed in the direction it wants.
I think it should be possible to define a game that gives people tools to peacefully resolve disagreements, without giving them tools for intelligence explosion. The two don’t seem obviously connected.
So then, basically, the core of your idea is to move all humans to a controlled reality (first VR, then physical) where an intelligence explosion is impossible? It’s not really supposed to solve any problems, just prevent the expected self-destruction?
Yeah. At quite high cost, too. Like I said, it’s intended as a lower bound on what’s achievable, and I wouldn’t have posted it if any better lower bound were known.