This NIST Risk Management approach would be great if AI Alignment were a mature field whose underlying subject matter wasn’t itself advancing extremely fast. If only we could do this! But currently I think that for many of our risk estimates it would be hard to get topic experts to agree even to within an order of magnitude (e.g.: is AGI misalignment >90% likely or <10% likely? YMMV). I think we should aspire to be a field mature enough that formal Risk Management is applicable, and for some short-term misuse risks from current well-understood models (open-source ones, say), that might even be approaching feasibility. But for the existential AGI-level risks several years out, which are the ones that matter most, the field just isn’t mature enough yet.
Bear in mind that there was a point in the development of nuclear safety when people built open-circuit air-cooled nuclear reactors, in which the cooling air flowed past the pile and up a chimney that vented it, still hot, straight into the atmosphere (yes, really: https://en.wikipedia.org/wiki/Windscale_Piles ). They didn’t know better at the time, and after one of them caught fire there’s an area near Ravenglass in England that’s still somewhat contaminated as a result. Nuclear safety knowledge has since advanced to the point where elaborate Risk Management is possible, and a good thing too, but unfortunately AI safety isn’t that mature yet.