Thank you for this post. I agree this is important, and I’d like to see improved plans.
Three comments on such plans.
1: Technical research and work.
(I broadly agree with the technical directions listed deserving priority.)
I’d want these plans to explicitly consider the effects of AI R&D acceleration, as those are significant. The speedups vary based on how constrained projects are on labor vs. compute; those that are mostly bottle-necked on labor could be massively sped up. (For instance, evaluations seem primarily labor-constrained to me.)
The lower costs of labor have other implications as well, likely including security (see also here) and technical governance (making better verification methods technically feasible).
2: The high-level strategy
If I were to now write a plan for two-to-three-year timelines, the high-level strategy I’d choose is:
Don’t build generally vastly superhuman AIs. Use whatever technical methods we have now to control and align AIs which are less capable than that. Drastically speed up (technical) governance work with the AIs we have.[1] Push for governments and companies to enforce the no-vastly-superhuman-AIs rule.
Others might have different strategies; I’d like these plans to discuss what the high-level strategy or aims are.
3: Organizational competence
Reasoning transparency and safety first culture are mentioned in the post (in Layer 2), but I’d further prioritize and plan organizational aspects, even when aiming for “the bare minimum”. Beside the general importance of organizational competence, there are two specific reasons for this:
If and when AI R&D acceleration is very fast, delays in information propagating to outsiders are more costly. That is: insofar as you want to keep external actors “in the loop” and contribute, you need to put more effort into communicating what is happening internally.
Organizational competence and technical work are not fully at odds, as there are employees specialized in different things anyways.
(I think the responses to Evan Hubinger’s request for takes on what Anthropic should do differently has useful ideas for planning here.)
Thank you for this post. I agree this is important, and I’d like to see improved plans.
Three comments on such plans.
1: Technical research and work.
(I broadly agree with the technical directions listed deserving priority.)
I’d want these plans to explicitly consider the effects of AI R&D acceleration, as those are significant. The speedups vary based on how constrained projects are on labor vs. compute; those that are mostly bottle-necked on labor could be massively sped up. (For instance, evaluations seem primarily labor-constrained to me.)
The lower costs of labor have other implications as well, likely including security (see also here) and technical governance (making better verification methods technically feasible).
2: The high-level strategy
If I were to now write a plan for two-to-three-year timelines, the high-level strategy I’d choose is:
Don’t build generally vastly superhuman AIs. Use whatever technical methods we have now to control and align AIs which are less capable than that. Drastically speed up (technical) governance work with the AIs we have.[1] Push for governments and companies to enforce the no-vastly-superhuman-AIs rule.
Others might have different strategies; I’d like these plans to discuss what the high-level strategy or aims are.
3: Organizational competence
Reasoning transparency and safety first culture are mentioned in the post (in Layer 2), but I’d further prioritize and plan organizational aspects, even when aiming for “the bare minimum”. Beside the general importance of organizational competence, there are two specific reasons for this:
If and when AI R&D acceleration is very fast, delays in information propagating to outsiders are more costly. That is: insofar as you want to keep external actors “in the loop” and contribute, you need to put more effort into communicating what is happening internally.
Organizational competence and technical work are not fully at odds, as there are employees specialized in different things anyways.
(I think the responses to Evan Hubinger’s request for takes on what Anthropic should do differently has useful ideas for planning here.)
Note: I’m not technically knowledgeable on the field.