The entire argument for avoiding frontier labs falls apart if you admit even a 20% likelihood that frontier labs will create aligned superintelligence.
Sure, but given that none of the frontier labs seem remotely on track to align anything, superintelligence or otherwise, that’s an extraordinary claim which requires extraordinary levels of evidence.
The frontier labs have certainly succeeded at aligning their models. LLMs have achieved a level of alignment people wouldn’t have dreamed of 10 years ago.
Now labs are running into issues with the reasoning models, but those issues hardly seem insurmountable.
Contemporary AI models are not “aligned” in any sense that would help in the slightest against a superintelligence. Stronger AI capabilities demand stronger guardrails, and current “alignment” doesn’t even prevent failures like ChatGPT’s recent sycophancy or jailbreaking.