I don’t have a strong opinion about how good or bad this is.
But it seems like potentially additional evidence over how difficult it is to predict/understand people’s motivations/intentions/susceptibility to value drift, even with decades of track record, and thus how counterfactually-low the bar is for AIs to be more transparent to their overseers than human employees/colleagues.
I don’t have a strong opinion about how good or bad this is.
But it seems like potentially additional evidence over how difficult it is to predict/understand people’s motivations/intentions/susceptibility to value drift, even with decades of track record, and thus how counterfactually-low the bar is for AIs to be more transparent to their overseers than human employees/colleagues.