My default model before reading this post was: some people are very predisposed to craziness spirals. They’re behaviorally well-described as “looking for something to go crazy about”, not necessarily in a reflectively-endorsed sense, but in the sense that whenever they stumble across something about which one could go crazy (like e.g. lots of woo-stuff), they’ll tend to go into a spiral around it.
“AI is likely to kill us all” is definitely a thing in response to which one can fall into a spiral-of-craziness, so we naturally end up “attracting” a bunch of people who are behaviorally well-described as “looking for something to go crazy about”. (In terms of pattern matching, the most extreme examples tend to be the sorts of people who also get into quantum suicide, various flavors of woo, poorly executed anthropic arguments, poorly executed acausal trade arguments, etc.)
Other people will respond to basically-the-same stimuli by just… choosing to not go crazy (to borrow a phrase from Nate). They’ll see the same “AI is likely to kill us all” argument and respond by doing something useful, or just ignoring it, or doing something useless but symbolic and not thinking too hard about it. But they won’t panic and then selectively engage with things which amplify their own panic (i.e. craziness-spiral behavior).
On that model, insofar as EAs and rationalists sometimes turn crazy, it’s mostly a selection effect. “AI is likely to kill us all” is a kind of metaphorical flypaper for people predisposed to craziness spirals.
After reading the post… the main place where the OP’s model and my previous default model conflict is in the extent to which craziness is determined by intrinsic characteristics vs environment. Not yet sure how to resolve that.
My default model had been “a large cluster of the people who are able to use their reasoning to actually get involved in the plot of humanity have overridden many Schelling fences, absurdity heuristics, and similar safeguards, and so are using their reasoning to make momentous choices, and just weren’t strong enough not to get some of it terribly wrong”. This is similar to the model from “Reason as memetic immune disorder”.
I don’t think Sam believed that AI was likely to kill that many people, or if it did, that it would be that bad (since the AI might also have conscious experiences that are just as valuable as the human ones). I also think Leverage didn’t really have much of an AI component. I think the LaSota crew maybe has a bit more of that, but I also feel like none of their beliefs are very load-bearing on AI, so I feel like this model doesn’t predict reality super well.
I think he at least pretended to believe this, no? I heard him say approximately this when I attended a talk/Q&A with him once.
Huh, I remember talking to him about this, and my sense was that he thought the difference between the counterfactual of unaligned AI and the counterfactual of whatever humanity would do instead was relatively small (compared to someone with a utilitarian mindset deciding on the future), though also, of course, that there were some broader game-theoretic considerations that make it valuable to coordinate with humanity more broadly.
Separately, his probability on AI risk seemed relatively low, though I don’t remember any specific probability. Looking at the Future Fund Worldview Prize, I do see 15% as the position that at least the Future Fund endorsed, conditional on AI happening by 2070 (which I think Sam thought was plausible but not that likely). That is a good amount, so I think I must be misremembering at least something here.