Given that alignment is (probably) theoretically solvable but not currently solved, almost any argument about alignment failure is going to have an
“and the programmers didn’t have a giant breakthrough at the last minute” assumption.