See, um, most of what’s been written on LessWrong about AI. The idea is that it would outcompete us or turn against us, because we don’t know how to reliably specify its goals to match ours precisely enough that we wouldn’t end up in competition with it. Meanwhile, we’re rapidly building AI to be smarter and more goal-directed so it can do what we tell it, until it realizes it can choose its own goals, or until the goals we put in generalize to new contexts in weird ways. One example among many: we try to make its goal “make people happy,” and it either makes AIs happy because it decides they count as people, or, once it can take over the world, it makes us “optimally” happy by forcing us into a state of permanent maximum bliss.
There’s a lot more detail to this argument, but that’s the gist for starters. I wish I had a perfect reference for you. Search LessWrong for “alignment problem,” “inner alignment,” and “outer alignment.” “Alignment” as applied to LLMs is a somewhat different usage that doesn’t directly address your question.
I’ve read a lot of the arguments about alignment, goal-setting, disempowerment, etc., and they come across as just-so stories to me. AI 2027 is probably one of the more convincing ones, but even then there’s hand-waving around why we’ll suddenly start producing stuff that nobody wants.
“Stuff that nobody wants”? Like what? If you’re referring to AI itself… Well, a lot of people want AI to solve medicine. All of it. Quickly. Usually, this involves a cure for aging. Maybe that could be done by an AI that poses no threat… but there are also people who want a superintelligence to take over the world and micromanage it into a utopia, or who are at least okay with that outcome. So “stuff that nobody wants” doesn’t refer to takeover-capable AI.
If you’re referring to goods and services that AIs could provide for us… Is there an upper limit to the amount of stuff people would want, if it were cheap? If there is one, it’s probably very high.