Suggested reframing for judging AGI lab leaders: think less about what terminal values they pursue and more about how they trade off power and instrumental goals against other values.
Claim 1: The end values of AGI lab leaders matter mostly if they win the AGI race and have crushed the competition, but much less for all the decisions leading up to that point (i.e. from now to the post-AGI world).
Claim 1bis: Additionally, in the event that they have no competition and are ruling the world, even someone like Sam Altman seems to have mostly good values (e.g. see all his endeavours around fusion, universal basic income etc.).
Claim 2: What matters the most during the AGI race (and before any decisive strategic advantage, or DSA) is the propensity of an AGI lab leader to forgo an opportunity to grab more power/resources in favor of other valuable things (e.g. safety, benefit-sharing etc.). The main reason is that at all points during the AGI race, and in particular in the late game, you can systematically get (a lot!) more expected power if you trade off safety, governance or other valuable things. This is the main dynamic at play, and it predicts both AGI labs obsessing over being first to develop AI R&D and Sama’s various moves detrimental to safety.
Corollary 2a: Many leaders sympathetic to safety (including sama) frequently pursue Pareto-pushing safety interventions (i.e. interventions that don’t reduce their power), such as good safety research. The main difficulties arise whenever safety trades off against capabilities development & power (which is unfortunately frequent).
IMO, reasonableness and epistemic competence are also key factors. This includes stuff like how effectively they update on evidence, how much they are swayed by motivated reasoning, and how good they are at futurism and thinking about what will happen. I’d also include “general competence”.
Agreed that those are complementary. I didn’t mean to say that the factor I flagged is the only important one.