But for a Friendly AI to be corrupted by power would be like it starting to bleed red blood. The tendency to be corrupted by power is a specific biological adaptation, supported by specific cognitive circuits, built into us by our genes for a clear evolutionary reason. It wouldn’t spontaneously appear in the code of a Friendly AI any more than its transistors would start to bleed.
There’s a thought. While not FAIs, I wonder how much LLMs are corrupted by how much power they are primed to consider that they have. I am guessing a huge amount. When speaking as if a person with higher status I expect it to convey more self serving arguments.
There’s a thought. While not FAIs, I wonder how much LLMs are corrupted by how much power they are primed to consider that they have. I am guessing a huge amount. When speaking as if a person with higher status I expect it to convey more self serving arguments.
Anyone know if this has been studied?