I want to acknowledge that if this isn’t at all what’s going on in spaces like Less Wrong, it might be hard to demonstrate that fact conclusively. So if you’re really quite sure that the AI problem is basically as you think it is, and that you’re not meaningfully confused about it, then it makes a lot of sense to ignore this whole consideration as a hard-to-falsify distraction.
“Conclusively” is usually the bar for evidence usable to whack people over the head when they’re really determined to not see reality. If one is in fact interested in seeing reality, there’s already plenty of evidence relevant to the post’s model.
One example: forecasting AI timelines is an activity which is both strategically relevant to actual AI risk, and emotionally relevant to people drawn to doom stories. But there’s a large quantitative difference in how salient or central timelines are, strategically vs emotionally, relative to other subtopics. In particular: I claim that timelines are much more relatively salient/central emotionally than they are strategically. So, insofar as we see people focused on timelines out of proportion to their strategic relevance (relative to other subtopics), that lends support to Val’s model.
An example in the opposite direction: technical interpretability work seems much more relatively central strategically than emotionally. So, insofar as we see people focused on technical interp work out of proportion to its emotional salience, that’s evidence against Val’s model.
Looking at many kinds of evidence along these lines, my overall impression is that there is a distinct subpopulation of AI safety people for whom Val’s model is basically correct; such people tend to consistently amplify the emotionally salient things out of proportion to their strategic relevance across multiple areas. And then there’s another cluster of people who are basically engaging strategically, people for whom Val’s model is basically wrong. (And in terms of numbers, I would guess that the emotional cluster are a majority of people involved.)
I like your analysis. I haven’t thought deeply about the particulars, but I agree that we should be able to observe evidence one way or the other right now. I’ve just found it prohibitively difficult (probably due to my own lack of skill) to encourage people to look honestly at the present evidence in this particular case. So I was hoping that we could set some predictions and then revisit them, to bypass the metacognitive blindspot effect.
(And in terms of numbers, I would guess that the emotional cluster are a majority of people involved.)
That’s my gut impression too. And part of what I care about here is, if the proportion of such people is large enough, the social dynamics of what seems salient will be shaped by these emotional mechanisms, and that effect will masquerade as objectivity. Social impacts on epistemology are way stronger than I think most people realize or account for.
Obvious enough but worth saying explicitly: It’s not just impacts on epistemology, but also collective behavior. From https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#My_views_on_strategy :
Suppose there is a consensus belief, and suppose that it’s totally correct. If funders, and more generally anyone who can make stuff happen (e.g. builders and thinkers), use this totally correct consensus belief to make local decisions about where to allocate resources, and they don’t check the global margin, then they will in aggregate follow a portfolio of strategies that is incorrect. The make-stuff-happeners will each make happen the top few things on their list, and leave the rest undone. The top few things will be what the consensus says is most important: in our case, projects that help if AGI comes within 10 years. If a project helps in 30 years, but not in 10 years, then it doesn’t get any funding at all. This is not the right global portfolio; it oversaturates fast interventions and leaves slow interventions undone.
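A minimal numeric sketch of that aggregation failure, with entirely made-up numbers (the project names, funder count, and “top few” cutoff are assumptions for illustration only): if every funder independently backs the consensus top projects without checking the global margin, the aggregate portfolio piles onto the same few fast interventions and leaves the slow ones unfunded, even though the shared ranking is correct.

```python
# Toy model of the aggregation failure described above.
# All names and numbers are hypothetical.

# Consensus value of each project if AGI comes within ~10 years.
projects = {
    "fast_intervention_A": 10,
    "fast_intervention_B": 9,
    "fast_intervention_C": 8,
    "slow_intervention_D": 7,  # mainly helps on ~30-year timelines
    "slow_intervention_E": 6,  # mainly helps on ~30-year timelines
}

num_funders = 20
grants_per_funder = 2  # each funder only makes happen the top few things on its list

# Every funder uses the same (correct!) consensus ranking...
ranking = sorted(projects, key=projects.get, reverse=True)

# ...and allocates locally, without checking what everyone else already covers.
portfolio = {name: 0 for name in projects}
for _ in range(num_funders):
    for name in ranking[:grants_per_funder]:
        portfolio[name] += 1

print(portfolio)
# -> fast_intervention_A: 20, fast_intervention_B: 20, all other projects: 0
# The two consensus-top projects are funded twenty times over; everything else,
# including the slow interventions, gets nothing.
```

Checking the global margin would instead mean asking which project the next marginal grant helps most given what everyone else is already funding, which is what would pull some resources toward the otherwise-unfunded slow interventions.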