On the way home today, I was thinking about what kinds of AI strategies would look more promising in a world where people are more likely to actually notice problems—i.e. what kinds of strategies I’d invest more effort into if the air conditioner test (and things like it) surprises me.
One example strategy: make AI companies liable for problems caused by their AI, as much as possible. For example, if a programming AI generates a piece of code with a bug in it, and that bug causes some huge disaster for the company using the code, then the AI company should be liable for the damage. This will incentivize AI companies to fix problems before they reach production, rather than waiting on the (much slower) consumer feedback cycle.
Also, it will incentivize AI companies to build safety teams with actual teeth, rather than the kind of mostly-performative safety teams which regulations tend to create. In particular, that means tools to probe what the AI has actually learned, and to check whether that matches what it was supposed to learn.
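To make "tools to probe what the AI has learned" slightly more concrete: one of the simplest such tools is a linear probe, a small classifier trained on a model's frozen internal activations to test whether some concept is decodable from them. Here's a minimal sketch in PyTorch; the activations and labels are hypothetical stand-ins (random data), not anything from a real safety stack:

```python
import torch
import torch.nn as nn

def train_linear_probe(activations: torch.Tensor,
                       labels: torch.Tensor,
                       epochs: int = 100,
                       lr: float = 1e-2) -> nn.Linear:
    """Fit a linear probe mapping frozen activations -> concept labels."""
    probe = nn.Linear(activations.shape[1], int(labels.max()) + 1)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(activations), labels)
        loss.backward()
        opt.step()
    return probe

# Hypothetical usage: 1024-dim activations for 512 examples, binary concept.
# In practice these would be collected from some layer of the model under test.
acts = torch.randn(512, 1024)
labels = torch.randint(0, 2, (512,))
probe = train_linear_probe(acts, labels)
accuracy = (probe(acts).argmax(dim=1) == labels).float().mean().item()
print(f"probe accuracy: {accuracy:.2f}")
```

High probe accuracy suggests the concept is (at least linearly) represented at that layer; it doesn't by itself show the model acts on it, which is why this is only one tool among many a safety team with teeth would need.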
In a world where the important problems are hard to notice, I expect liability to mostly not work: the incentive will mostly result in AIs which do bad things subtly enough that nobody ever notices and sues. But in a world where problems are generally noticed, liability would help a lot to align AI companies' incentives.