It would be interesting if you gave OpenAI, Google DeepMind, Anthropic, xAI and DeepSeek scores based on how well they fit your checklist.
My guess is it wouldn’t be that interesting, because they all cleanly fail in every category except Anthropic, which could argue for partial credit in a few categories (but not full credit in any of them, and Eli seems to have picked his wording such that partial credit is close to zero credit on each question).
I’m trying to outline the minimum criteria that make it deontologically ok to be a scaling lab. Of course, you can do better or worse without hitting that bar, but you don’t get partial deontology points by getting closer.
Stealing a billion dollars is worse than stealing a million dollars, but either way, you’re a criminal.
The only one of these criteria that any of the labs are plausibly meeting is:

The company has a broadly good track record of deploying current AIs safely and responsibly, including owning up to and correcting mistakes.
Not having followed super closely, I think Anthropic and GDM reasonably meet this standard. All of xAI, OpenAI, and Meta definitively do not.
I guess that none of them have operational security adequate to protect against nation-state actors, but I don’t know enough to say one way or the other. (Also, some of them might have realistic plans and policies in place to scale up their operational security as their capabilities increase, which I would count, if security professionals agreed that their plans and policies were sound.)