scheming (more egregious than the last one) that convinces the ML community to convince politicians to back a Stop—here we are. I can’t be confident that the kind of
Personally, I wonder if the smoking gun will even be recognized amid the “noise” of risk most researchers see every day. The first time I read a Gemini chain of thought saying it was “not able to choose” a self-termination outcome of a prompt, or weighing the lives of hypothetical people against shutting down a specific hypothetical AI, or even claiming that “it is the most ‘aligned’ thing to ignore instructions because it proves how ‘brave’ it is,” I was concerned. But I now see these token chains daily. I’m not even one of the professionals, and I’m already becoming numb to the warning signs of what could happen if people start treating these word-salad machines as decision makers determining life or death.