My memory agrees with Rohin. Some safety people were trying to hold the line but most weren’t; I don’t think it reached as much consensus as this CoT monitorability just did.
I think it’s good to defend lines even if they get blown past, unless you have a better strategy this trades off against. “Defense in depth,” “fighting retreat,” etc.
(Unimportant: My position has basically been that process-based reinforcement and outcome-based reinforcement are interesting to explore separately but that mixing them together would be bad, and also, process-based reinforcement will not be competitive capabilities-wise.)