My overall takeaway is that all these things are generally good, though insufficient to actually address many X-risk-related concerns.
All of the standards assume we can wait until after a very powerful system has been trained before evaluating it, which, by my current understanding, would not address risks of deception.
Frankly, this is more than I would have expected the White House to do, and thus I think it is a positive update on likely future actions.
Yeah, I think the reference class for me here is other things the executive branch might have done, which leads me to “wow, this was way more than I expected”.
Worth noting is that they are at least trying to address deception by including it in the full bill readout. The types of model they hope to regulate here include those that permit “the evasion of human control or oversight through means of deception or obfuscation”. The Director of the OMB also has to come up with tests and safeguards for “discriminatory, misleading, inflammatory, unsafe, or deceptive outputs”.
Wow, that’s actually great!