“Meta is controlled purely by Zuckerberg and xAI follows the whims of Musk.”
Isn’t this actually a comparatively good situation? As far as I know, neither of these people wants to die, so if it comes to an existential crunch, they might make decisions that avoid dying. Compare that with amorphous control by corporate bureaucracy, in which no individual human can manage to shift the decision...
If you think that an AI developer can do more harm than good on the margin, e.g., because you can unilaterally push the frontier by deploying a model but you cannot unilaterally pause, and other similar asymmetries, then you may favour lower variance in the policies of AI developers. It seems likely to me that individual control increases policy variance, so that is a reason to favour distributed/diffused control of AI developers.
It also seems empirically that individually-controlled AI developers (Meta, xAI, DeepSeek) are worse on safety than more diffusely controlled ones (OpenAI, Anthropic, Google DeepMind), which may suggest there are selection effects that make this the case more generally. For example, maybe those individuals tend to be especially risk-taking, or optimistic about safety, etc.
I agree that individual control increases policy variance, which was sort of my point. Whether that’s good or not seems to me to depend on what the default course of events is. If you think things are headed in a good direction, then low variance is good. But if the default course is likely to be disastrous, high variance at least provides a chance.
I don’t understand your point about asymmetry. Doesn’t that tend to make the default course bad?
What I meant was, imagine two worlds:
- Individual Control, where AI developers vary wildly in their approach to risk, safety, and deployment
- Diffused Control, where AI developers tend to take similar approaches to risk, safety, and deployment
If risk-reducing actions reduce risk by as much as risk-increasing actions increase it (i.e., payoffs are symmetrical), then these two worlds carry identical risk. But if payoffs are asymmetrical (i.e., these companies are more able to increase risk than to decrease it), then the Diffused Control world has lower overall risk: a single reckless outlier can dominate the outcome, and reckless outliers are more likely in the Individual Control world (see the sketch below).
Does that make the default course bad? I guess so. But if it is true, it implies that having AI developers controlled by individuals is worse than having them run by committee.
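To make the variance point concrete, here is a minimal Monte Carlo sketch of the argument. Everything in it is an assumption for illustration: five developers, normally distributed "recklessness" with the same mean, and two toy payoff models, a symmetric one where overall risk tracks the average recklessness and an asymmetric one where the single most reckless developer dominates.

```python
# Illustrative sketch (assumptions, not from the original discussion): compare a
# high-variance "Individual Control" world with a low-variance "Diffused Control"
# world of AI developers under symmetric vs asymmetric payoff models.
import numpy as np

rng = np.random.default_rng(0)
n_developers = 5
n_trials = 100_000

def sample_recklessness(policy_std):
    """Recklessness of each developer: same mean across worlds, different spread."""
    x = rng.normal(loc=0.5, scale=policy_std, size=(n_trials, n_developers))
    return np.clip(x, 0.0, 1.0)

individual = sample_recklessness(policy_std=0.3)   # high policy variance
diffused = sample_recklessness(policy_std=0.05)    # low policy variance

# Symmetric payoffs: risk-increasing and risk-reducing actions offset each other,
# so overall risk is the average recklessness across developers.
print("symmetric :", individual.mean(axis=1).mean(), diffused.mean(axis=1).mean())

# Asymmetric payoffs: a single reckless outlier dominates (e.g. one unilateral
# frontier deployment), so overall risk tracks the *most* reckless developer.
print("asymmetric:", individual.max(axis=1).mean(), diffused.max(axis=1).mean())
```

Under the symmetric model the two worlds come out with the same expected risk; under the asymmetric model the high-variance world is clearly worse, which is the outlier effect described above.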
We would also need to account for the possibility that an AI researcher at Meta or xAI prompts an actual leader to race harder (think of DeepCent’s role in the AI-2027 forecast), or comes up with a breakthrough, initiates the intelligence explosion, and ends up with a misaligned Agent-4 that Agent-3 fails to catch because xAI’s safety team doesn’t have a single human competent enough. If this happens, then the company is never put under oversight, races as hard as it can, and dooms mankind.
However, if Agent-4 is caught, but P(OC member votes for slowdown) is smaller than 0.5 because the evidence is inconclusive, then the more members the OC has, the bigger p(doom) gets. On the other hand, this problem may arguably be solved by adopting a liberum veto on trusting any model...
So a big safety team is good for catching Agent-4, but may be bad for deciding whether it is guilty.
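The committee-size claim can be checked with a short binomial calculation. This is a sketch under assumptions: each OC member is taken to vote for slowdown independently with an assumed probability p = 0.4, "majority" means a strict majority, and the liberum veto rule triggers a slowdown if any single member refuses to trust the model.

```python
# Illustrative sketch (assumed parameters): how Oversight Committee (OC) size
# affects the chance of a slowdown when each member votes for it with p < 0.5.
from math import comb

def p_majority_slowdown(n, p):
    """Probability that a strict majority of n members votes for slowdown."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

def p_veto_slowdown(n, p):
    """Liberum veto: a slowdown happens if at least one member refuses to trust the model."""
    return 1 - (1 - p)**n

p = 0.4  # assumed per-member probability of voting for slowdown, given inconclusive evidence
for n in (1, 3, 5, 15, 31):
    print(f"n={n:2d}  majority: {p_majority_slowdown(n, p):.3f}  veto: {p_veto_slowdown(n, p):.3f}")
```

With p below 0.5, the majority-rule probability shrinks as the committee grows (the Condorcet jury theorem run in reverse), while the veto rule's probability grows toward 1 with committee size.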