Honestly, this is such a bad reply by Scott that I… don’t quite know whether I want to work on all of this anymore.
If this is how this ecosystem wants to treat people trying their hardest to communicate openly about the risks, and who are trying to somehow make sense of the real adversarial pressures they are facing, then I don’t think I want anything to do with it.
I have issues with Rob’s top-level tweet. I think it gets some things wrong, but it points at a real dynamic. It’s kind of strawman-y about things, and this makes some of Scott’s reaction more understandable, but his response overall seems enormously disproportionate.
Scott’s response is extremely emblematic of what I’ve experienced in the space. Simultaneous extreme insults and obviously bad faith arguments (“actually, it’s your fault that Deepmind was founded because you weren’t careful enough with your comms”), and then gaslighting that no one faces any censure for being open about these things (despite the very thing you are reading being extremely aggro about the lack of strategic communication), and actually we should be happy that Ilya started another ASI lab, and that Jan Leike has some compute budget.
The whole “no you are actually responsible for Deepmind” thing, in a tweet defending that it’s great that all of our resources are going into Anthropic, is just totally absurd. I don’t know what is going on with Scott here, but this is clearly not a high-quality response.
Copying my replies from Twitter, but I am also seriously considering making this my last day. It’s not the kind of decision to be made at 5 AM, so who knows, but seriously, fuck this.
IMO this doesn’t seem like the kind of response you will endorse in a few days, especially the “You are responsible for Deepmind/OpenAI” part.
You were also talking about AI close to the same time, and you’ve historically been pretty principled about this kind of stance.
You could argue that you’re not against strategicness in general, and are just talking about this one issue of saying cleanly that AI is very dangerous.
Robby, at least, has been very consistent on this: he is against most forms of strategic communication in general.
I also think you are against many forms of strategic communication in general? Your writing explores many of the relevant considerations in a lot of depth, and you certainly have not shied away from sharing your opinion on controversial issues, even when it wasn’t super clear how that was going to help things.
I think you are just arguing the wrong side of this specific argument branch. My models of Eliezer, Nate and Robby have all been pretty consistent that being overly strategic in conversation usually backfires. Of course you shouldn’t have no strategy, and my model of Eliezer in particular has in the past been too strategic for my tastes and so might disagree with this, but I am pretty confident Robby himself is just pretty solidly on the side of “it’s good to blurt out what you believe, *especially* if you don’t have any good confident inside-view model about how to make things better”.
In exchange, we would have won a couple more years of timeline, which would have been pointless, because timeline isn’t measured in distance from the year 1 AD, it’s measured in distance between some level of woken-up-ness and some point of danger, and the woken-up-ness would be pushed forward at the same rate the danger was.
I feel like we both know this is a strawman. The key thing at least in recent years that Rob, Eliezer and Nate have been arguing for is the political machinery necessary to actually control how fast you are building ASI, and the ability to stop for many years at a time, and to only proceed when risks actually seem handled.
If anything, Eliezer, Nate and Robby have been actively trying to move political will from “a pause right now” to “the machinery for a genuine stop”.
This makes this comparison just weird. Yes, according to everyone’s models the only time you might have the political will to stop will be in the future. I have never seen Nate or Eliezer or Robby say that they expect to get a stop tomorrow. But they of course also know that getting in a position to stop takes a long time, and the right time to get started on that work was yesterday.
So if they had their way (with their present selves teleported back in time), we would have more draft treaties and more negotiation between the U.S. and China. More materials ready to hand to congresspeople who are trying to grapple with all of this stuff. Essays and books and movies and videos explaining the AI existential risk case straightforwardly to every audience imaginable.
That is what you could do if you took the 200+ risk-concerned people who ended up instead going to work at Anthropic, or ended up trying to play various inside-game politics things at OpenAI.
And man, I don’t know, but that just seems like a much better world. Maybe you disagree, which is fine, but please don’t create a strawman where Robby or Nate or Eliezer were ever really centrally angling for a short-term pause that would have already passed by then.
And then even beyond that, if you don’t know how to solve a problem, I think it is generally the virtuous thing to help other people get more surface area on solving it. Buying more time is the best way to do that, especially buying time now when the risks are pretty intuitive. I think you believe this too, and I don’t really know what’s going on with your reaction here.
But my impression is that the rest of the field is executing this portfolio plan admirably, but MIRI and a few other PauseAI people are trying to sabotage every other strategy in the portfolio in the hope of forcing people into theirs.
Come on man, a huge number of people we both respect have recently updated that the kind of direct advocacy that MIRI has been doing has been massively under-invested in. I do not think that “other people are executing this portfolio plan admirably”, and this is just such a huge mischaracterization of the dynamics of this situation that I don’t know where to start.
“If Anyone Builds It, Everyone Dies” is a straightforward book. It doesn’t try to sabotage every other strategy in the portfolio, and I have no idea how you could characterize really any of the media appearances of Nate this way.
This is of course in contrast to Open Phil defunding almost everyone who has been pursuing this strategy and making mine and tons of other people’s lives hell, and all kinds of complicated adversarial shit that I’ve been having to deal with for years, where absolutely there have been tons of attempts to sabotage people trying to pursue strategies like this.
Like man, we can maybe argue about the magnitude of the errors here, and the sabotage or whatever, but trying to characterize this as some kind of “Nate, Eliezer, Robby are defecting on other people trying to be purely cooperative” seems absurd to me. I am really confused what is going on here.
We wouldn’t have the head of the leading AI lab writing letters to policymakers begging them to “jolt awake”, we wouldn’t have a substantial fraction of world compute going to Jan Leike’s alignment efforts, we wouldn’t have Ilya sitting on $50 billion for some super-secret alignment project
I am sympathetic to the first of these (but disagree that you are characterizing Dario correctly here).
But come on, clearly Ilya sitting on $50 billion for starting another ASI company is not good news for the world. I don’t think you believe that this is actually a real ray of hope.
(And then I also don’t think that Jan Leike having marginally more compute is going to help, but maybe there is a more real disagreement here)
Overall, I am so so so tired of the gaslighting here.
Please don’t quit, Oliver.
Unless you mean “making this my last day [on twitter]”, which might or might not be a good idea.
If this is how this ecosystem wants to treat people trying their hardest to communicate openly about the risks, and who are trying to somehow make sense of the real adversarial pressures they are facing, then I don’t think I want anything to do with it.
I don’t think Scott speaks for the ecosystem. He’s just a guy in it, and one who isn’t even that closely connected to Anthropic or Coefficient Giving people. (E.g. you spend >10x as much time talking to people from those orgs as he does.) I think that the people in the ecosystem you’re criticizing would not approve of Scott’s post.
This is of course in contrast to Open Phil defunding almost everyone who has been pursuing this strategy and making mine and tons of other people’s lives hell, and all kinds of complicated adversarial shit that I’ve been having to deal with for years, where absolutely there have been tons of attempts to sabotage people trying to pursue strategies like this.
I think this is not a good summary of what Coefficient Giving has done. (I do think it really sucks that they defunded Lightcone.)
I think that the people in the ecosystem you’re criticizing would not approve of Scott’s post.
I think this is false. I expect Scott’s post to be heavily upvoted, to have an enormously positive agree/disagree ratio if it were posted to the EA Forum, and in general for people to believe something pretty close to it.
There are a few exceptions (somewhat ironically, a good chunk of the cG AI-risk people), but they would be relatively sparse. I think this is roughly how someone who is smart, but doesn’t have a strong inside-view take on what they should do about AI risk, believes they should act if they want to be a good member of the EA community. My guess is it’s also pretty close to what leadership at cG, CEA and Anthropic believe, plus it would poll pretty well at a thing like SES.
He’s just a guy in it, and one who isn’t even that closely connected to Anthropic or Coefficient Giving people.
The issue is of course not whether Scott is right or wrong about what Anthropic or cG people believe. The issue is that he seems to be taking a view where you should be super strategic in your communications, sneer at anyone who is open about things, and measure your success by how many of your friends are now at the levers of power.
I think this is not a good summary of what Coefficient Giving has done.
I think cG’s funding decisions were really very centrally about trying to punish people who weren’t being strategic in their communications in the way that Dustin wanted them to be.
I think other “all kinds of complicated adversarial shit” has also happened, though it’s harder to point to. At a minimum I will point to the fact that invitation decisions to things like SES have followed similar adversarial “you aren’t cooperating with our strategic communications” principles.
I think this is false. I expect Scott’s post to be heavily upvoted, to have an enormously positive agree/disagree ratio if it were posted to the EA Forum, and in general for people to believe something pretty close to it.
The EA Forum is a trash fire, so who knows what would happen if this was published there.
My read of the social dynamics is that in places where people are inclined to defer to me or people like me, they might initially approve of the Scott thing for bad tribal reasons, but change their mind when they read criticism of it from me or someone like me (which is ofc part of why I sometimes bother commenting on things like this).
My guess is it’s also pretty close to what leadership at cG, CEA and Anthropic believe, plus it would poll pretty well at a thing like SES.
I think that Scott’s post would not overall be received positively by those people. Maybe you’re saying that one of the directions argued for by Scott’s post is approved of by those people? I agree with that more.
My read of the social dynamics is that in places where people are inclined to defer to me or people like me, they might initially approve of the Scott thing for bad tribal reasons, but change their mind when they read criticism of it from me or someone like me
Well, I mean, that is a hard conditional to be false, since if people were to not change their mind, this would largely invalidate the premise that they are inclined to defer to you. Unfortunately, I think the vast majority of places in EA do not defer to you or people like you, and furthermore, I think you are pretty importantly wrong about your criticisms, so I don’t quite know how to feel about this.
I do think it helps, and I am marginally happy about your cultural influence here (though it’s tricky; I also think a bunch of your takes here are quite dumb). I think the vast majority of the cultural influence here is downstream of not quite anyone in particular, but more Anthropic than anywhere else, and neither you nor I can change that very much.
I think that Scott’s post would not overall be received positively by those people.
Yeah, I expect it to be straightforwardly positively received. I think people will be like “some parts of this seem dumb, the Ilya thing in-particular, but yeah, fuck those rationalists and MIRI people, I am with Scott on that”.
To be clear, I am not expecting consensus here; I think this will be what 75% of people who have any opinion at all on anything adjacent to this believe, but I expect people would broadly think it’s a good contribution that properly establishes norms and reflects how they think about things.
I also think it’s plausible people would be like “wow, what an uncouth way for both of these people to be interfacing with each other, please get away from each other, children”, but then if you actually talked to them afterwards, they would be like “yeah, I mean, that was a bit of a shitshow, but I do think Scott was basically right here (minus 1-2 minor things)”.
I am not enormously confident on this, but it matches my experiences of the space.
In case it matters to either of you, my guesses:
I agree with Habryka that absent criticism Scott’s post would be well received by an important group of people reasonably characterized as EA-ish AI safety people.
Imo absent criticism Rob’s post would be well received by a different group of people reasonably characterized as doomers. (Literally right before seeing this thread I saw another post on LW that is directionally correct but is mostly wrong or exaggerated in its details, and that was very well received.)
Both posts are broadly wrong about lots of things, about equally so, such that most people would be better off having never encountered either of them.
Tbc, my first-order intuitive impression is that Scott’s post is much more directionally accurate. But I expect that is because I constantly experience people knifing me, pushing me to take strategies that systematically destroy my ability to do anything while gaining approximately no safety benefit, or making claims about members of groups that include me that are false of me, whereas I don’t really experience any of the stuff that Rob gestures at, even though I expect it exists. Though Rob’s post doesn’t actually inform me of it, because his actual claims are false, and I cannot infer the underlying experiences that led him to make them. Another example of trapped priors if you don’t have second order corrections. (Tbc his follow-up post makes this substantially clearer.)
You probably already know I think this, but imo you should both quit on making public discourse in the AI safety community non-insane, and do other things that have a shot at working. (Since I know this will be misinterpreted by other readers, let me be clear that there are plenty of other kinds of public writing that do not fall in that bucket which I do think are worth doing.)
I endorse you taking the space to figure out how you want to relate to this and doing what’s right for you. I’ve increasingly updated toward thinking that people doing things they’re not wholeheartedly behind tends to be net bad in all sorts of sideways ways, but the effort would be weaker for your loss. Wherever you end up, I appreciate you having taken the strategy of speaking in public about things that usually aren’t spoken about, in a way that has helped clarify the strategic situation for me many times.
(also, it’s scary to see three of the people I’d put in the upper tiers of good communication and of technical understanding of where we’re at with AI get into this intense conflict. I’m going to be thinking on this some and seeing if anything crystallizes which might help specifically, but in the meantime a few more general-purpose posts that might be useful memes for minimizing unhelpful conflict are A Principled Cartoon Guide to NVC, NVC as Variable Scoping, and Why Control Creates Conflict, and When to Open Instead)
trying to characterize this as some kind of “Nate, Eliezer, Robby are defecting on other people trying to be purely cooperative” seems absurd to me. I am really confused what is going on here.
Everything makes sense when you meditate on how the line between “cooperation” and “defection” isn’t in the territory; it’s a computed concept that agents in a variable-sum game have every incentive to “disagree” (actually, fight) about.
Consider the Nash demand game. Two players name a number between 0 and 100. If the sum is less than or equal to 100, you get the number you named as a percentage of the pie; if the sum exceeds 100, the pie is destroyed. There’s no unique Nash equilibrium. It’s stable if Player 1 says 50 and Player 2 says 50, but it’s also stable if Player 1 says 35 and Player 2 says 65 (or generally n and 100 − n, respectively).
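(A minimal sketch of that game in Python, in case a concrete check makes the multiplicity of equilibria easier to see; the function names are mine, not from the original comment. It verifies that any (n, 100 − n) demand pair is a Nash equilibrium, i.e. that neither player can gain by unilaterally changing their demand.)

```python
# Minimal sketch of the Nash demand game described above
# (illustrative helper names; assumes integer demands 0..100).

def payoffs(d1: int, d2: int) -> tuple[int, int]:
    """Each player gets their demand if the demands are compatible;
    if the demands sum to more than 100, the pie is destroyed."""
    return (d1, d2) if d1 + d2 <= 100 else (0, 0)

def is_nash_equilibrium(d1: int, d2: int) -> bool:
    """True if neither player can do better by unilaterally deviating."""
    p1, p2 = payoffs(d1, d2)
    best_for_1 = max(payoffs(alt, d2)[0] for alt in range(101))
    best_for_2 = max(payoffs(d1, alt)[1] for alt in range(101))
    return p1 == best_for_1 and p2 == best_for_2

print(is_nash_equilibrium(50, 50))  # True: the "fair" split is stable
print(is_nash_equilibrium(35, 65))  # True: so is this lopsided one
print(is_nash_equilibrium(30, 65))  # False: Player 1 could demand 35
```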
The secret is that there are no natural units of pie (or, equivalently, how much pie everyone “deserves”). Everyone thinks that they’re being “cooperative” and that their partners are “defecting”, because they’re counting the pie differently: Player 1 thinks their slice is 35%, but Player 2 thinks the same physical slice is 65%.
If you don’t think your partner is treating you fairly, your leverage is to threaten to destroy surplus unless they treat you better. That’s what Alexander is doing when he says, “I would like to support it with praxis, but right now I feel very conflicted about this”. He’s saying, “You’d better give me a bigger slice, Player 1, or I’ll destroy some of the pie.”
That’s also what your brain is doing when you say you don’t want to work on this anymore. Scott doesn’t want you to quit! (Partially because he values Lightcone’s work, and partially because it would look bad for him if you can publicly blame your burnout on him.) Crucially, your brain knows this. By threatening to quit in frustration, you can probably get Scott to apologize and give your arguments a fairer hearing, whereas in the absence of the threat, he has every incentive to keep being motivatedly dumb from your perspective.
You have a strong hand here! The only risk is if your counterparties don’t think you’d ever actually quit and start calling your bluff. In this case, we know Scott is a pushover and will almost certainly fold. But if you ever face stronger-willed counterparties, you might need to shore up the credibility of your threat: conspicuously going on vacation for a week to think it over will get taken more seriously than an “I don’t know if I want to do this anymore” comment.
(Sorry, maybe you already knew all that, but weren’t articulating it because it’s not part of the game? I don’t think I’m worsening your position that much by saying it out loud; we know that Scott knows this stuff.)
That’s also what your brain is doing when you say you don’t want to work on this anymore. Scott doesn’t want you to quit! (Partially because he values Lightcone’s work, and partially because it would look bad for him if you can publicly blame your burnout on him.) Crucially, your brain knows this.
Man, I really wish this were the case, and it’s a nonzero part of what is going on, but the vast majority of what I am expressing with my (genuine) desire to quit is the stress and frustration associated with the gaslighting, which is one level more abstract than the issue you talk about.
Like yes, there is a threat here being like “for fuck’s sake, stop gaslighting or I am genuinely going to blow up my part of the pie”, but it’s not actually about the object level, and I don’t actually have much of any genuine hope of that working in the same way one might expect from a negotiation tactic.
I am just genuinely actually very tired, and Scott changing his mind on this and going “oh yeah, actually you are right” wouldn’t do much to make me want to not quit, because it wouldn’t address the continuous gaslighting where every time anyone tries to talk about any of the adversarial dynamics, they immediately get told this is all made up and get told repeatedly “I haven’t seen EAs (other than SBF) do a lot of lying, equivocating, or even being particularly shy about their beliefs” and “everyone is being honest all the time and actually it’s just you who is lying right now and always”.
Yeah, the frustrating part is almost always on a meta level. I think Zack’s point about “no natural units of pie” applies to the gaslighting issue as well, though. Asserting one’s viewpoint means asserting it as truth, which invalidates differing perspectives. “I disagree, you contradict, he gaslights”.
It’s difficult because sometimes the gas lights really don’t seem to be dimming, and sometimes that perception is downstream of some motivated thinking, because I really don’t want to believe we’re running out of oil already, dammit. So the result is simultaneously kind of an honest statement of perspective (at least, as honest as these tend to get) while also being a (not-necessarily-consciously) motivated action pushing people to disregard their own senses. And then we have to decide how to judge this mess of bias and honesty, and if we don’t judge it such that, after a round trip of perceiving cooperation/defection and responding accordingly, we get more cooperation than last time… shit’s fucked. And there are no objective units of pie that people can agree on when judging who was in the wrong.
So like… am I trying to gaslight people into questioning their own sanity so they accept what I want them to accept, or am I just flinching away from what scares me, like we all do? Both, and the question of whether I deserve the leniency and empathy is a difficult one, because what are the units of this pie and where’s the objective cutoff? And because our tolerance for further bullshit tends to diminish after accumulating bullshit, so it gets even more difficult to get back to the other side of criticality.
I really don’t think Scott is gaslighting you. I think Scott is being honest here, but you should model him as having somewhat snapped. Pause AI and MIRI-adjacent people on X have been extremely adversarial and have been contributing to very bad discourse (even arguments-wise). I think Scott saw Rob’s post as very strawmannish and needlessly adversarial, and he more or less correctly lumped it in with this rising tide of terribleness, even if MIRI itself is definitely not as guilty. I might well be wrong about the specifics, but Scott Alexander isn’t the kind of person who tends to gaslight.
I think you need to be a lot more deflationary about the g-word. If you think, “But ‘gaslighting’ is something Bad people do; Scott Alexander isn’t Bad, so he would never do that”, well, that might be true depending on what you mean by the g-word. But if the behavior Habryka is trying to point to with the word is more like, “Scott is adopting a self-serving narrative that minimizes wrongdoing by his allies and inflates wrongdoing by his rivals” (which is something someone might do without being Bad, due to having “somewhat snapped”), well, why wouldn’t the rivals reach for the g-word in their defense? What is the difference, from their perspective?
“Gaslighting” should probably be avoided because it is anywhere between meaningless and a fighting word depending on who says it and how.
The g-word is a very nasty accusation. It gets thrown around and means a bunch of stuff down to just “saying stuff I disagree with”, but it shouldn’t.
Originally, it meant a conscious, malicious attempt to drive someone insane by strategically lying to them.
On the substance, people are honest but wrong an awful lot, and honest but massively overstating their case even more often. Assuming your rivals are malicious or dishonest when they’re just wrong or overstating is a huge source of conflict and thereby confusion.
It’s a really useful pointer towards a tactic that is relatively widespread and has no better word. I am personally happy to use other words, but I have the sense that sentences like “I am so very very tired of the ambiguous but ultimately strategic enough attempts at undermining my ability to orient in this situation by denying pretty clearly true parts of reality combined with intense implicit threats of consequences if I indicate I believe the wrong thing that might or might not be conscious optimizations happening in my interlocutors but have enough long-term coherence to be extremely unlikely to be the cause of random misunderstandings” would work that well.
Yeah I would call that “gaslighting”. It looks like my initial interpretation of what you meant by it is closer than Zack’s. I think Scott isn’t doing that. I’m inclined to believe you when you say other people have behaved this way.
“It is not the critic who counts: not the man who points out how the strong man stumbles or where the doer of deeds could have done better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood, who strives valiantly, who errs and comes up short again and again, because there is no effort without error or shortcoming, but who knows the great enthusiasms, the great devotions, who spends himself for a worthy cause; who, at the best, knows, in the end, the triumph of high achievement, and who, at the worst, if he fails, at least he fails while daring greatly, so that his place shall never be with those cold and timid souls who knew neither victory nor defeat.”
Theodore Roosevelt, “Citizenship in a Republic,” speech at the Sorbonne, Paris, April 23, 1910
Locally trying to clear up one misunderstanding.
In exchange, we would have won a couple more years of timeline, which would have been pointless, because timeline isn’t measured in distance from the year 1 AD, it’s measured in distance between some level of woken-up-ness and some point of danger, and the woken-up-ness would be pushed forward at the same rate the danger was.
I feel like we both know this is a strawman. The key thing at least in recent years that Rob, Eliezer and Nate have been arguing for is the political machinery necessary to actually control how fast you are building ASI, and the ability to stop for many years at a time, and to only proceed when risks actually seem handled.
If anything, Eliezer, Nate and Robby have been actively trying to move political will from “a pause right now” to “the machinery for a genuine stop”.
I think Scott’s “couple more years” wasn’t referring to a belief that EA could have successfully advocated for a couple-year pause, but rather referring to the change in timeline you’d have gotten if safety-sympathetic people refused to work on stuff that increases the pace of capabilities progress.