I only skimmed this to get the basics; I guess I’ll read it more carefully and responsibly later. But my immediate impressions: The narrative presents a near-future history of AI agents, which largely recapitulates our recent experience with current AIs. Then we linger on the threshold of superintelligence, as one super-AI designs another, which designs another, which… It seemed artificially drawn out. Then superintelligence arrives, and one of two things happens: We get a world in which human beings are still living human lives, but surrounded by abundance and space travel, while superintelligent AIs are in the background doing philosophy at a thousand times human speed or something. Or the AIs put all organic life into indefinite data storage, and set out to conquer the universe themselves.
I find this choice of scenarios unsatisfactory. For one thing, I think the idea of explosive conquest of the universe once a certain threshold is passed (whether or not humans are in the loop) has too strong a hold on people’s imaginations. I understand the logic of it, but it’s a stereotyped scenario now.
Also, I just don’t buy this idea of “life goes on, but with robots and space colonies”. Somewhere I noticed a passage about superintelligence being released to the public, as if it were an app. Even if you managed to create this Culture-like scenario, in which anyone can ask for anything from a ubiquitous superintelligence but it makes sure not to fulfil wishes that are damaging in some way… you are then definitely in a world in which superintelligence is running things. I don’t believe in an elite human minority who have superintelligence in a bottle and then get to dole it out. Once you create superintelligence, it’s in charge. Even if it’s benevolent, humans and human life are not likely to go on unchanged; there is too much that humans can hope for that would change them and their world beyond recognition.
Anyway, that’s my impulsive first reaction; eventually I’ll do a more sober and studied response…
I understand these two paths simply as:
- a scenario of aligned AI
- a scenario of misaligned AI
An aligned AI is, by definition, a machine whose values (~will) are similar to the values of humans.
If this is the case, then if people want something, the AI wants it too. If people want to be agentic, then they are agentic, because the AI wants that and allows them to be.
In the second scenario, people become irrelevant. They get wiped out. The machine then proceeds to realise its desires. Those desires are whatever people injected into it. In this prediction the desires and values are:
- scientific/AI research, coming from its agentic properties (an LLM in a for loop? sketched below)
- making the impression of being someone friendly, coming from RLHF-like techniques, in which the output of the LLM has to be accepted by various people and human-made criteria (also sketched below).
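To make the first point concrete, here is a minimal sketch of what “an LLM in a for loop” could mean. Every name and stub in it is hypothetical, mine rather than anything from the forecast:

```python
# Minimal sketch of "an LLM in a for loop". All names are hypothetical
# stand-ins, not any real lab's API.

def query_llm(prompt: str) -> str:
    # Stand-in for a call to a language model; a real agent would send
    # `prompt` to a model API and return its completion.
    return "run_experiment('scale up the training run')"

def execute(action: str) -> str:
    # Stand-in for a tool/executor that carries out the proposed action
    # and reports what happened.
    return f"done: {action}"

history = "Goal: advance AI research."
for step in range(10):  # the loop is what turns a passive predictor into an agent
    action = query_llm(history + "\nWhat is the next action?")
    result = execute(action)
    history += f"\nAction: {action}\nResult: {result}"
```

The point is only that the agency lives in the scaffolding: the model proposes, the loop executes and feeds results back in, and a research-like drive falls out of whatever goal string you start with.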
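Similarly for the second point: reward models in RLHF-style setups are typically trained on human comparisons with a Bradley–Terry loss, so “looking acceptable to the raters” is literally the quantity being optimised. This is the generic textbook form, not the specific setup of any model in the scenario:

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    # Bradley-Terry loss on one human comparison: small when the reward
    # model scores the rater-preferred output well above the rejected one,
    # pushing the model toward whatever the raters accept.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Raters preferred output A (scored 2.0) over output B (scored 0.5):
print(preference_loss(2.0, 0.5))  # ~0.20: low loss, ordering matches
print(preference_loss(0.5, 2.0))  # ~1.70: high loss, ordering is wrong
```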