Excited to see this! I'd be most interested in case studies of standards in fields where people didn't already have clear ideas about how to verify safety.
In some areas, it's pretty clear what you're supposed to do to verify safety: everyone (more or less) agrees on what counts as safe.
One of the biggest challenges with AI safety standards will be that no one really knows how to verify that a (sufficiently powerful) system is safe, and many experts disagree about what kind of evidence would be sufficient.
Are there examples of standards in other industries where people were quite confused about what “safety” would require? Are there examples of standards that are specific enough to be useful but flexible enough to deal with unexpected failure modes or threats? Are there examples where the standards-setters acknowledged that they wouldn’t be able to make a simple checklist, so they requested that companies provide proactive evidence of safety?
Congratulations on launching!
On the governance side, one question I’d be excited to see Apollo (and ARC evals & any other similar groups) think/write about is: what happens after a dangerous capability eval goes off?
Of course, the actual answer will be shaped by the particular climate, culture, zeitgeist, policy window, and lab-specific factors that are impossible to fully predict in advance.
But my impression is that this question is relatively neglected, and I wouldn’t be surprised if sharp newcomers were able to meaningfully improve the community’s thinking on this.