“Goal-oriented behavior” is actually pretty complicated, and is not, in fact, a natural byproduct of general AI. I think the kinds of tasks we currently employ computers to do are hiding a lot of this complexity.
Specifically, what we think of as artificial intelligence is distinct from motivational intelligence is distinct from goal-oriented behaviors. Creating an AI that can successfully play any video game is an entirely different technology stack from creating an AI that “wants” to play video games, which in turn is an entirely different technology stack from creating an AI that translates a “desire” to play video games into a sequence of behaviors which can actually do so.
The AI alignment issue is the observation that good motivation is hard to get right; I think this needs to be stronger: motivation is going to be hard to do at all, good or bad, and possibly harder than intelligence itself. Part of the problem with AI writing right now is that the writing is, basically, unintentional. You can get a lot further with unintentional writing than you might expect, but intentional writing is far beyond anything that exists right now. I think a lot of the fears come from a belief that motivation can arise spontaneously, or that intentionality can arise out of the programming itself; that we might write our desires into machines such that the machines will know desire.
What would it take for GPT-3 to want to run itself? I don’t think we have a handle on that question at all.
Goal-oriented behaviors, meanwhile, correspond to an interaction between motivation and intelligence that is itself immensely more complicated than either is independently.
---
I think part of the issue here is that, if you ask why a computer does something, the answer is “Because it was programmed to.” So, to make a program do something, you just program it to do it. Except this just moves the motivation, and the intentionality, to the programmer; or, alternatively, to the person pressing the button that causes the AI to act.
The AI in a computer game does what it does because it is a program that is running and causing things to happen. If it’s a first-person shooter, the AI is “trying” to kill the player. But the AI has no notion of killing the player; it doesn’t know what it is trying to do. It is just a series of instructions, which are, if you think about it, a set of heuristics that the programmer developed to kill the player.
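As a concrete illustration, here is a minimal sketch of what such an enemy “AI” amounts to (the names and numbers are hypothetical, not taken from any real game engine): pure condition-action rules, with no representation of a goal anywhere in them.

```python
# A minimal sketch of a shooter enemy "AI": nothing but condition-action
# heuristics. Nowhere in this code is there a representation of "wanting"
# to kill the player; the "trying" lives entirely in the programmer who
# chose these rules. All names and values here are hypothetical.
from dataclasses import dataclass

@dataclass
class Actor:
    x: float
    health: int = 100

def enemy_tick(enemy: Actor, player: Actor, weapon_range: float = 5.0) -> str:
    distance = abs(enemy.x - player.x)
    if enemy.health < 20:
        enemy.x += 1 if enemy.x > player.x else -1  # heuristic: back off when hurt
        return "retreat"
    if distance > weapon_range:
        enemy.x += 1 if enemy.x < player.x else -1  # heuristic: close the gap
        return "advance"
    player.health -= 10                             # heuristic: shoot when in range
    return "fire"

enemy, player = Actor(x=0.0), Actor(x=12.0)
while player.health > 0:
    print(enemy_tick(enemy, player))
```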
This doesn’t change if it’s a neural network. AlphaGo is not, in fact, trying to win a game of Go; the humans who trained it are the only ones with any motivation. AlphaGo itself is just a set of really good heuristics. No matter how good you make those heuristics, AlphaGo will never start trying to win a game, because the idea of the heuristics in question trying to win a game is a category error.
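The same point in sketch form (a toy, not AlphaGo’s actual architecture; the scoring function below is a hypothetical stand-in for a trained policy network): move selection is just evaluating a fixed function over the legal moves and picking the best score. Making the function stronger changes the scores, not the kind of thing the program is.

```python
# A toy sketch of move selection as heuristic evaluation. This is not
# AlphaGo's real architecture; policy_score is a stand-in for a trained
# policy network. Note there is no component left over where "wanting
# to win" could live: the whole program is evaluate-and-pick-the-max.

def policy_score(board: str, move: int) -> float:
    # Stand-in heuristic: any deterministic scoring function works here.
    return (hash((board, move)) % 1000) / 1000.0

def select_move(board: str, legal_moves: list[int]) -> int:
    return max(legal_moves, key=lambda move: policy_score(board, move))

print(select_move(board="empty board", legal_moves=[0, 1, 2, 3]))
```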
I think, when people make the mental leap from “AI we have now” to “general AI”, they’re underspecifying what it is they are actually thinking about. There are at least three different things that could mean:
AI that can solve a specific, well-defined problem.
AI that can solve any well-defined problem. ← This is general AI; a set of universal problem-solving heuristics satisfies this criterion.
AI that can solve a poorly-defined problem. ← This, I think, is what people are afraid of: the fear that somebody will give it a problem, ask it to solve it, and it will tile the universe in paperclips.