Orpheus16 comments on Reframing the burden of proof: Companies should prove that models are safe (rather than expecting auditors to prove that models are dangerous)

Orpheus16 27 Apr 2023 19:11 UTC
4 points
Can you say more about what part of this relates to a ban on AI development?
I think the claim “AI development should be regulated in a way such that the burden of proof is on developers to show beyond-a-reasonable-doubt that models are safe” is quite different from the claim “AI development should be banned”, but it’s possible that I’m missing something here or communicating imprecisely.
- aphyer 27 Apr 2023 21:00 UTC
  4 points
  Apologies, I was a bit blunt here.
  It seems to me that the most obvious reading of “the burden of proof is on developers to show beyond-a-reasonable-doubt that models are safe” is in fact “all AI development is banned”. It’s...not clear at all to me what a proof of a model being safe would even look like, and based on everything I’ve heard about AI Alignment (admittedly mostly from elsewhere on this site) it seems that no-one else knows either.
  A policy of ‘developers should have to prove that their models are safe’ would make sense in a world where we had a clear understanding that some types of model were safe, and wanted to make developers show that they were doing the safe thing and not the unsafe thing. Right now, to the best of my understanding, we have no idea what is safe and what isn’t.
  If you have some idea of what a ‘proof of safety’ would look like under your system, could you say more about that? Are there any existing AI systems you think can satisfy this requirement?
  From my perspective, the most obvious outcomes of a burden-of-proof policy like the one you describe seem to be:
  - If it is interpreted literally and enforced as written, it will in fact be a full ban on AI development. Actually proving an AI system to be safe is not something we can currently do.
  - Many possible implementations of it would not in fact ban AI development, but it’s not clear that what they would do would actually relate to safety. For instance, I can easily imagine outcomes like:
    - AI developers are required to submit a six-thousand-page ‘proof’ of safety to the satisfaction of some government bureau. This would work out to something along the lines of ‘only large companies with compliance departments can develop AI’, which might be beneficial under some sets of assumptions that I do not particularly share?
    - AI developers are required to prove some narrow thing about their AI (e.g. that their AI will never output a racial slur under any circumstances whatsoever). While again this might be beneficial under some sets of assumptions, it’s not clear that it would in fact have much relationship to AI safety.