I agree with your argument completely. This subject has been sitting with me for a while, and I’ve been hesitant to raise it for fear of losing newbie karma — but I genuinely can’t understand why the prevailing sentiment toward OAI management (and Altman specifically) across LessWrong seems capped at a kind of tepid disdain rather than outrage.
Consider the conjunction:
The general consensus here has been, and continues to be, that ASI poses existential risk.
We are now living through an arms race toward its development, catalyzed in large part by OAI’s role as first mover.
The decision to initiate this race was a direct consequence of choices Altman made with full visibility into the consequences — in a role he obtained through an explicit promise to steward AI development against exactly the kind of competitive social opportunism he then executed, to near-maximal degree.
So: why don’t we hate this guy?
I’m not sure what additional evidence could more plausibly be obtained to support the claim that he is doing everything conceivable to accelerate the existential risk timeline — all while continuing to serve as the public authority and representative for the very category of expertise and preferences exchanged in this community, and taking every action available to contravene its influence.
I think FDT would literally prescribe more vitriol and “cancelling” in response to defections of this magnitude than I currently see. What am I missing?
I don’t disagree—but none of this is related to the argument made here about charitable giving by the foundation, which is conditional on allowing that the planned surrender of control of the for-profit was an OK decision and accepting their other stated views about charitable mission, AI timelines, etc.
I don’t consider that to be factually the case. It may be true of how you intended the article to be received, but the post ends with: “I argue that given their values and expectations, everyone should agree that this is far too little, moving far too slowly.” My reply was, in amended terms: “Yes, I agree with that, everyone should as well, and I am surprised that everyone is not outraged by how little has been done to remain aligned to their initial charter and mission thus far.”
I don’t see how this could be interpreted, by any stretch, as misaligned with or off topic to the document itself. Is it prohibited to react to a piece of news as news, or to respond to a claim about how everyone should feel with a question about why they don’t feel that way already?
Beyond that, the headlining statements of the document were news to me and may have been to many others. The implied expectation that I should respond to only one aspect of the news (one that is, generally, much less significant than the facts you laid out about it being the largest heist in modern history) is not one I consider fair for two-way discourse.
You were responding to a post that said specific things, not entering a discussion where you present related ideas which you want to talk about. If you want to talk about the news the post was responding to, or to talk about their general lack of mission alignment, you can do that in a separate post, or at the very least say that you have a separate related point from the topic of the article.
I see the problem now. I failed to notice that this was a link post because I skimmed over the words ‘in the full post’. I then failed to notice that you were presenting a link to something you wrote rather than additional news articles, and so I reacted to your summary of the more detailed argument as if it were the entire content.
And then, instead of carefully re-examining why I missed the point, I doubled down on a justification built on the same misaligned epistemics. You are right: given that observation, it was not the appropriate place for the comment, and the pushback was even worse. I am sorry for wasting your time.
Socratic question: what does being publicly outraged at Altman (especially within the in-group community) achieve?
Can follow up with more thoughts later, but I am interested in your views as to the utility of various group postures (we can, for the sake of doing the interesting part of the discussion and not the boring part, assume we don’t need more evidence and that your claim is true).
Thanks for engaging with the comment!
I consider the failures of alignment to date almost all a consequence of social and narrative capture and control versus any paper solution worked out by this group. CEV is probably generally correct. FDT is a very good formalization for a hard problem.
When OpenAI was created it was structured, institutionally, to be a bastion of the exact theory, preferences, and discourse long championed by this forum and MIRI. The challenges in getting LW ‘preferences’ actualized have never been its formalizations or the magnitude and precision of its claims (to date); they have been the failure of the risk model well understood by this group to propagate outside of it, to the operators of institutions and the social forces that orient them.
The failure to sufficiently mitigate ASI existential risk while knowing it was coming is a problem of influence, not ‘right-ness’. Rather than adopting a more aggressive strategy toward the forces which continue to prevent that mitigation, it seems that dispassionate resignation or further detached analysis of failure modes becomes the exception handler, as opposed to speaking truth to power in such a way that new norms are established memetically and spread by osmosis to the broader public narrative, which operates much more commonly on arguments from authority.
It is incredibly common that policy is formulated on sentiment, and sentiment is memetically contagious. If we genuinely believe that what Altman is doing is bad, and it is a necessary operationalization of our beliefs to continuously and relentlessly undermine that influence through arguments with emotive uptake, then why are the avenues that may make such an exercise effective not actualized rather than merely studied?
One of the primary motivators in human cognition is status. The activation of status-reducing social mechanisms (ostracism, reputational cost, public moral censure) against those who defect on alignment commitments is an ancient enforcement strategy, and one that remains psychologically powerful precisely where legal frameworks offer no protection. I don’t think any of these claims are at all structurally novel. A direct and labelled antagonist to the preferences of a group has been one of the strongest forces for public assembly in history.
Though it is a morally fraught strategy in general, I would argue this type of operationalization of narrative control is exactly what Altman has mastered, and unless this game is played, the formalizations discussed here will forever remain beautiful in principle but blocked in practice the second that the next ‘superalignment’ team gets sidelined from implementing them.
Imagine you solved alignment tomorrow where SI could be built in a way to benefit all beings equally and democratically according to CEV—but the only way to implement it was by getting Altman to sign off on not profiting from its deployment. What are our odds of success and operational tools now? That problem continues to exist so long as people aren’t emotively loaded enough from both reason and psychological friction to go beyond an action threshold to attempt re-arrangement of either Altman’s values or sphere of control directly on alignment’s behalf.
The same argument can be made about whether it is rational to ‘beat someone up’ who defected, even if there is no consequential reward for the retribution. I am not advocating for violence, but I am saying that we have rational reason to seek a reputational re-balancing of public opinion: being the ‘type’ of agent against whom defection is costly is, on an FDT analysis, what structurally prevents opportunistic misalignment.
The operationalization of aligned or unaligned SI is being determined in a political, institutional, and narrative knife fight by people who all understand this. Is it the preference of this group to be proven right about the risks or to actually prevent them? Because that is a human alignment problem, which is a value-loading problem, where the only available syntax in which to write solutions is meme.