Would a powerful agent in fact flip suddenly from pretending to be nice to not pretending at all, or would it be more gradual?
It might flip gradually if it felt a divide-and-conquer strategy was useful, pretending to be nice to some people while not being nice to others. I don't think this is likely. Apart from that case, it seems very useful to conceal your intent until the point where you can win.
It might not even be necessary to reveal yourself once you go for the win. Do your cells know they are under the dominance of a brain? They get oxygen and sugar; they don't care.
Changing one's mind typically happens in an emotional conflict. An AGI might seek to influence its parent researchers and administrators, pretending to be nice and unthreatening for the time being. Conflicts arise when humans do not behave as the AI expects them to. Once the AI is mighty enough, it can drop its concealment and reveal its real nature. This would happen as a sudden flip.