“Safety Culture for AI” is important, but isn’t going to be easy


This is a linkpost to the EA forum version of this post, which is itself a linkpost for a new preprint, “Building a Culture of Safety for AI: Perspectives and Challenges,” along with a brief explanation of its central points. Comments on the ideas in the post are welcome, but much of the content that clarifies the points below is in the full manuscript.

Safety culture in AI will be critical to the success of many of the other promising initiatives for AI safety.

  1. If people don’t care about safety, most safety measures turn into box-ticking. Companies that don’t care avoid regulation, or render it useless. That’s what happens when fraudulent companies are audited, or when car companies cheat on emissions tests.

  2. If people do care about safety, then audits, standards, and various risk-analysis tools can help get them there.

  3. Culture can transform industries, and norms around trying to be safe are a powerful way to notice and discourage bad actors.

However, there are lots of challenges to making such a culture.

  1. Safety culture usually requires agreement about the risks. We don’t have that in AI generally.

  2. Culture depends on the operational environment.

    1. People pay more attention to risks when they are constantly exposed to them or are personally affected by failures. In AI, most risks are rare, lie in the future, and/or affect others more than the people responsible for them.

    2. Most safety cultures are built around routines such as checklists and exercises that deal with current risks. Most AI risks aren’t directly amenable to these approaches, so we can’t reinforce culture with routines.

  3. Cultures are hard to change once they get started.

    1. AI inherits its cultural norms from academia, where few researchers consider the risks of their work and openness is the default, and from the startup world, where companies generally want to “move fast and break things.”

    2. AI companies aren’t prioritizing safety over profits—unlike airlines, nuclear power operators, or hospitals, where there is a clear understanding that safety is a critical need, and everything will stop if there is a safety problem.

    3. Companies aren’t hiring people who care about safety culture. But culture is built by people, and even if management wants to prioritize safety, a workforce that doesn’t care won’t add up to an organization that does.

    4. We need something other than routinization to reinforce safety culture.

Thankfully, there are some promising approaches, especially on the last point. These include proactively identifying future risks via various risk-analysis methods, red-teaming, and audits. But as noted above, audits are most useful once safety culture is already prioritized, though in the near term audits show some promise for making a lack of safety common knowledge.

Next steps include building the repertoire of tools that will reduce risks and can be used to routinize and inculcate safety culture in the industry, and getting real buy-in from industry leaders for prioritizing safety.

Thanks to Jonas Schuett, Shaun Ee, Simeon Campos, Tom David, Joseph Rogero, Sebastian Lodemann, and Yonaton Cale for helpful suggestions on the manuscript.