Hmm, I think v3 is worse than v2. The change that’s most important to me is that the section on alignment is now merely “exploratory” and “illustrative.” (On the other hand, it is nice that v3 mentions misalignment as a potential risk from ML R&D capabilities in addition to “instrumental reasoning” / stealth-and-sabotage capabilities; previously it was just the latter.) Note I haven’t read v3 carefully.
(But both versions, like other companies’ safety frameworks, are sufficiently weak or lacking in transparency that I don’t really care about marginal changes.)