I understand these two paths simply as:
- a scenario of aligned AI
- a scenario of not aligned AI
The aligned AI is, by definition, a machine whose values (~will) are similar to the values of humans.
If this is the case, then whenever people want something, the AI wants it too. If people want to be agentic, then they are agentic, because the AI wants that and allows them to be.
In the second scenario people become irrelevant. They get wiped out. The machine then proceeds with the realisation of its desires. Those desires are whatever people had injected into it. In this prediction the desires and values are:
- scientific/AI research, coming from its agentic properties (an LLM in a for loop? see the sketch after this list)
- making the impression of being friendly, coming from RLHF-like techniques in which the LLM's output has to be accepted by various people and by people-made criteria.
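
For concreteness, here is a minimal sketch of what I mean by "an LLM in a for loop". The names `query_model` and `execute` are hypothetical placeholders standing in for a model call and an environment step, not any real API:

```python
def query_model(prompt: str) -> str:
    """Placeholder for a call to some LLM; returns the model's next proposed action."""
    raise NotImplementedError

def execute(action: str) -> str:
    """Placeholder that carries out the proposed action and returns an observation."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 10) -> None:
    """The 'for loop' that turns a plain LLM into an agent pursuing a goal."""
    history = f"Goal: {goal}\n"
    for _ in range(max_steps):
        action = query_model(history)      # model decides what to do next
        if action.strip() == "DONE":
            break
        observation = execute(action)      # act, then feed the result back in
        history += f"Action: {action}\nObservation: {observation}\n"
```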