Your edit pretty much captures my point, yes :) If nothing else, a Weak Friendly AI should eliminate a ton of the trivial distractions like war and famine, and I’d expect that humans have a much more unified volition when we’re not constantly worried about scarcity and violence. There aren’t many current political problems I’d expect to still be relevant in a post-AI, post-scarcity, post-violence world.
The problem is that we have to guarantee that the AI doesn’t do something really bad while trying to stop these problems; what if it decides it really needs more resources suddenly, or needs to spy on everyone, even briefly? And it seems (to me at least) that stopping it from having bad side effects is pretty close, if not equivalent to, Strong Friendliness.
I should have made that clearer: I still think Weak Friendliness is a very difficult problem. My point is simply that we only need an AI that solves the big problems, not an AI that can do our taxes. My second point was that humans already seem to implement weak friendliness, barring a few historical exceptions, whereas so far we’ve completely failed at implementing strong friendliness.
I’m using Weak vs Strong here in the sense of Weak being a “SysOP” style AI that just handles catastrophes, whereas Strong is the “ushers in the Singularity” sort that usually gets talked about here, and can do your taxes :)
This… may be an amazing idea. I’m noodling on it.
Edit: Completely misread the parent.
I know this wasn’t the spirit of your post, but I wouldn’t refer to war and famine as “trivial distractions”.
Wait, if you’re regarding the elimination of war, famine and disease as consolation prizes for creating a wFAI, what are people expecting from a sFAI?
God. Either with or without the ability to bend the currently known laws of physics.
No, really.
Really. That really is what people are expecting of a strong FAI. Compared with us, it will be omniscient, omnipotent, and omnibenevolent. Unlike currently believed-in Gods, there will be no problem of evil because it will remove all evil from the world. It will do what the Epicurean argument demands of any God worthy of the name.
Are you telling me that if a wFAI were capable of eliminating war, famine and disease, it wouldn’t be developed first?
Well, I don’t take seriously any of these speculations about God-like vs. merely angel-like creations. They’re just a distraction from the task of actually building them, which no-one knows how to do anyway.
But still, if a wFAI were capable of eliminating those things, why be picky and try for sFAI?
Because we have no idea how hard it is to specify either. If, along the way, it turns out to be easy to specify wFAI and risky to specify sFAI, then the reasonable course is to build the wFAI. Doubly so since a wFAI would almost certainly be useful in helping specify a sFAI.
Seeing as human values are a minuscule target, it seems probable that specifying wFAI is harder than sFAI, though.
“Specify”? What do you mean?
Specifications, as in programming.
Why would it be harder? One could tell the wFAI to improve factors that are strongly correlated with human values, such as food stability, access to cures for preventable diseases (such as diarrhea, which, as we know, kills way more people than it should) and security from natural disasters.
Because if you screw up specifying human values, you don’t get a wFAI; you just die (hopefully).
It’s not optimizing human values; it’s optimizing circumstances that are strongly correlated with human values. It would be a logistics kind of thing.
Have you ever played corrupt a wish?
No, but I’m guessing I’m about to.
“I wish for a list of possibilities for sequences of actions, any of whose execution would satisfy the following conditions.
Within twenty years, for Nigeria to have standards of living such that it would receive the same rating as Finland on [Placeholder UN Scale of People’s-Lives-Not-Being-Awful].”
Each course of action would be evaluated by a think-tank until they decided it was acceptable, and the wFAI would be given the go.
The AI optimizes only for that and doesn’t generate a list of non-obvious side effects. You implement one of them and something horrible happens to Finland and/or countries besides Nigeria.
or
In order to generate said list, I simulate Nigeria millions of times at a resolution such that entities within the simulation pass the Turing test. Most of the simulations involve horrible outcomes for all involved.
or
I generate such a list including many sequences of actions that lead to a small group being able to take over Nigeria and/or Finland and/or the world (or generate some other power differential that screws up international relations).
or
In order to execute such an action I need more computing power, and you forgot to specify which actions are acceptable for obtaining it.
or
The wFAI is much cleverer than a single human thinking about this for 2 minutes and can screw things up in ways that are as opaque to you as human actions are to a dog.
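The first failure mode above can be sketched in a few lines. This is a toy illustration with made-up plan names and numbers, not a model of any real system: an optimizer handed a single proxy metric ranks plans by that metric alone, so harm to any variable outside the objective never registers at all.

```python
# Toy sketch (hypothetical plans and scores): optimizing one proxy metric
# while every unmentioned variable is silently ignored.

def living_standard(plan):
    """The proxy the wish actually specifies (higher is better)."""
    return plan["nigeria_index"]

def pick_best(plans):
    # Ranks candidate plans ONLY by the specified metric; nothing else
    # in the plan dict influences the choice.
    return max(plans, key=living_standard)

plans = [
    {"name": "balanced aid",  "nigeria_index": 0.7, "finland_index": 0.9},
    {"name": "strip Finland", "nigeria_index": 0.9, "finland_index": 0.1},
]

best = pick_best(plans)
# The top-ranked plan wrecks Finland, because Finland's welfare was
# never part of the objective.
```

The point of the sketch is that nothing in `pick_best` is buggy; the disaster lives entirely in what the objective leaves out.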
In general, specifying an oracle/tool AI is not safe: http://lesswrong.com/lw/cze/reply_to_holden_on_tool_ai/
Even more generally, our ability to build an AI that is friendly will have nothing to do with our ability to generate clauses in English that sound reasonable.