I claim it is a lot more reasonable to use the reference class of “people claiming the end of the world” than “more powerful intelligences emerging and competing with less intelligent beings” when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class—this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.
this raises the question: how should you pick which reference class to use, in general? how do you prevent reference class tennis, where you argue back and forth about what is the right reference class to use? I claim the solution is you want to use reference classes that have consistently made good decisions irl. the point of reference classes is to provide a heuristic to quickly apply judgement to large swathes of situations that you don’t have time to carefully examine. this is important because otherwise it’s easy to get tied up by bad actors who avoid being refuted by making their beliefs very complex and therefore hard to argue against.
the big problem with the latter reference class is it’s not like anyone has had many experiences using it to make decisions ex ante, and if you squint really hard to find day to day examples, they don’t all work out the same way. smarter humans do mostly tend to win over less smart humans. but if you work at a zoo, you will almost always be more worried about physical strength and aggressiveness when putting different species in the same enclosure. if you run a farm (or live in Australia), you’re very worried about relatively dumb invasive animals like locusts and rabbits.
on the other hand, everyone has personally experienced a dozen different doomsday predictions. whether that’s your local church or faraway cult warning about Armageddon, or Y2K, or global financial collapse in 2008, or the maximally alarmist climate people, or nuclear winter, or peak oil. for basically all of them, the right action empirically in retrospect was to not think too much about it. there are many concrete instances of people saying “but this is different” and then getting burned.
and if you allow any reference class to be on as strong a footing as very well established reference classes, then you open yourself up to getting pwned ideologically. “all complex intricate objects we have seen created have been created by something intelligent, therefore the universe must also have an intelligent creator.” it’s a very important memetic defense mechanism.
(to be clear this doesn’t mean you can only believe things others believe, or that humans taking over earth is not important evidence, or that doomsday is impossible!! I personally think AGI will probably kill everyone. but this is a big claim and should be treated as such. if we don’t accept this, then we will forever fail to communicate with people who don’t already agree with us on AGI x-risk.)
This all seems wrongheaded to me.
I endeavor to look at how things work and describe them accurately. Similarly to how I try to describe how a piece of code works, or how to build a shed, I will try to accurately describe the consequences of large machine learning runs, which can include human extinction.
I personally think AGI will probably kill everyone. but this is a big claim and should be treated as such.
This isn’t how I think about things. Reality is what exists, and if a claim accurately describes reality, then I should not want to hold it to higher standards than claims that do not describe reality. I don’t think it’s a good epistemology to rank claims by “bigness” and then say that the big ones are less likely and need more evidence. On the contrary, I think it’s worth investing more in finding out if they’re right, and generally worth bringing them up to consideration with less evidence than for “small” claims.
on the other hand, everyone has personally experienced a dozen different doomsday predictions. whether that’s your local church or faraway cult warning about Armageddon, or Y2K, or global financial collapse in 2008, or the maximally alarmist climate people, or nuclear winter, or peak oil. for basically all of them, the right action empirically in retrospect was to not think too much about it.
I don’t have the experiences you’re describing. I don’t go to churches, I don’t visit cults, I was 3 years old in the year 2000, and I was 11 for the ’08 financial crash; having read about it as an adult, I don’t recall extinction being a topic of discussion. I think I have heard of climate people saying that via alarmist news headlines, but I have not had anyone personally try to convince me of this or even say that they believe it. I have heard it discussed for nuclear winter, yes, and I think nukes are quite scary and it was reasonable to consider; I did not dismiss it out of hand and wouldn’t use that heuristic. I don’t know what the oil thing is.
In other words, I don’t recall anyone seriously trying to convince me that the world was ending except in cases where they had good reason to believe it. In my life, when people try to warn me about big things, especially if they’ve given it serious thought, usually I’ve found it’s been worthwhile for me to consider it. (I like to think I am good at steering clear of scammers and cranks, so that I can trust the people in my life when they tell me things.)
The sense I get from this post is that, in it, you’re assuming everyone else in the world is constantly being assaulted with claims meant to scare and control them rather than people attempting to describe the world accurately. I agree there are forces doing that, but I think this post gives up all too quickly on there being other forces in the world that aren’t doing that, which people can recognize and trust.
i am also trying to accurately describe reality. what i’m saying is, even from the perspective of someone smart and truth-seeking but who doesn’t know much about the object-level, it is very reasonable to use bigness of claim as a heuristic for how much evidence you need before you’re satisfied, and that if you don’t do this, you will be worse at finding the truth in practice. my guess is this applies even more so to the average person.
i think this is very analogous to occam’s razor / trust region optimization. clearly, we need to discount theories based on complexity because there are exponentially more complex theories compared to simple ones, many of which have no easily observable difference to the simpler ones, opening you up to being pwned. and empirically it seems a good heuristic to live life by. complex theories can still be true! but given two theories that both accurately describe reality, you want the simpler one. similarly, given two equally complex claims that accurately describe the evidence, you want the one that is less far fetched from your current understanding of the world / requires changing less of your worldview.
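for concreteness, a minimal sketch of the counting argument (a toy construction; the particular description-length prior is just an illustrative choice, nothing load-bearing):

```python
# there are 2**n distinct binary-string "theories" of length n, so any
# normalized prior has to spread exponentially thinner mass over longer ones.
# one simple normalized choice: give each length-n hypothesis mass 2**-(2*n);
# the 2**n hypotheses of length n then share total mass 2**-n, and the whole
# prior sums to sum over n >= 1 of 2**-n = 1.

def length_prior(n_bits: int) -> float:
    """prior mass of any single hypothesis whose description is n_bits long."""
    return 2.0 ** (-2 * n_bits)

for n in (1, 5, 10, 20):
    print(f"{n:>2}-bit theory -> prior mass {length_prior(n):.3g}")
# every extra bit of complexity costs a constant factor of prior probability,
# which is the quantitative content of "prefer the simpler theory, all else equal".
```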
also, it doesn’t have to be something you literally personally experienced. it’s totally valid to read the wikipedia page on the branch davidians or whatever and feel slightly less inclined to take things that have similar vibes seriously, or even to absorb the vibe from your environs (your aversion to scammers and cranks surely did not come ex nihilo, right?)
for most of the examples i raised, i didn’t necessarily mean the claim was literally 100% human extinction, and i don’t think it matters that it wasn’t. first, because the important thing is the vibe of the claim (catastrophic) - since we’re talking about heuristics on how seriously to take things that you don’t have time to deep dive on, the rule has to be relatively cheap to implement. i think most people, even quite smart people, genuinely don’t feel much of an emotional difference between literal human extinction vs collapse of society vs half of people dying painfully, unless they first spend a half hour carefully thinking about the implications of extinction. (and even then depending on their values they may still not feel a huge difference)
also, it would be really bad if you could weasel your way out of a reference class that easily; it would be ripe for abuse by bad actors—“see, our weird sect of christianity claims that after armageddon, not only will all actual sinners’ souls be tortured forever, but that the devil will create every possible sinner’s soul to torture forever! this is actually fundamentally different from all existing christian theories, and it would be unfathomably worse, so it really shouldn’t be thought of as the same kind of claim”
even if most people are trying to describe the world accurately (which i think is not true and we only get this impression because we live in a strange bubble of very truth seeking people + are above-average capable at understanding things object level and therefore quickly detecting scams), ideas are still selected for memeticness. i’m sure that 90% of conspiracy theorists genuinely believe that humanity is controlled by lizards and are trying their best to spread what they believe to be true. many (not all) of the worst atrocities in history have been committed by people who genuinely thought they were on the side of truth and good.
(actually, i think people do get pwned all the time, even in our circles. rationalists are probably more likely than average (controlling for intelligence) to get sucked into obviously culty things (e.g. zizians), largely because they don’t have the memetic antibodies needed to not get pwned, for one reason or another. so probably many rationalists would benefit from evaluating things a little bit more on vibes/bigness and a little bit less on object level)
Your points about Occam’s razor have got nothing to do with this subject[1]. The heuristic “be more skeptical of claims that would have big implications if true” makes sense only when you suspect a claim may have been adversarially optimized for memetic fitness; it is not otherwise true that “a claim that something really bad is going to happen is fundamentally less likely to be true than other claims”.
I’m having a little trouble connecting your various points back to your opening paragraph, which is the primary thing that I am trying to push back on.[2]
I claim it is a lot more reasonable to use the reference class of “people claiming the end of the world” than “more powerful intelligences emerging and competing with less intelligent beings” when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class—this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.
To restate the message I’m reading here: “Give up on having a conversation where you evaluate the evidence alongside your interlocutors. Instead frame yourself as trying to convince them of something, and assume that they are correct to treat your communications as though you are adversarially optimizing for them believing whatever you want them to believe.” This assumption seems to give up a lot of my ability to communicate with people (almost ~all of it), and I refuse to simply do it because some amount of communication in the world is adversarially optimized, and I’m definitely not going to do it because of a spurious argument that Occam’s razor implies that “claims about things being really bad or claims that imply you need to take action are fundamentally less likely to be true”.
You are often in an environment where people are trying to use language to describe reality, and in that situation the primary thing to evaluate is not the “bigness” of a claim, but the evidence for and against it. I recommend instead to act in such a way as to increase the size and occurrence of that environment, more so than to “act as though it’s correct to expect maximum adversarial optimization in communications”.
(Meta: The only literal quotes of Leo’s in this comment are the big one in the quote block; my use of “” is to hold sentences as objects, and they are not things Leo wrote.)
I agree that the more strongly a claim implies that you should take action, the more you should consider that it is being optimized adversarially for you to take action. For what it’s worth, I think that heuristic applies more so to claims that you should personally take action. Most people have little they can personally do to directly prevent the end of the world from AI; this is a heuristic more naturally applied to claims that you need to pay fines (which are often scams/spam). But mostly, when people give me claims that imply action, they are honestly meant claims and I do the action. This is the vast majority of my experience.
Aside to Leo: Rather than reply point-by-point to each of the paragraphs in the second comment, I will try restating and responding to the core message I got from the opening paragraph of the first comment. I’m doing this because the paragraphs in the second comment seemed somewhat distantly related / I couldn’t tell whether the points were actually cruxy. They were responding to many different things, and I hope restating the core thing will better respond to your core point. However, I don’t mean to avoid key arguments; if you think I have done so, feel free to tell me one or two paragraphs you would especially like me to engage with and I will do so in any future reply.
in practice many of the claims you hear will be optimized for memetic fitness, even if the people making the claims are genuine. well intentioned people can still be naive, or have blind spots, or be ideologically captured.
also, presumably the people you are trying to convince are on average less surrounded by truth seeking people than you are (because being in the alignment community is strongly correlated with caring about seeking truth).
i don’t think this gives up your ability to communicate with people. you simply have to signal in some credible way that you are not only well intentioned but also not merely the carrier of some very memetic idea that slipped past your antibodies. there are many ways to accomplish this. for example, you can build up a reputation of being very scrupulous and unmindkilled. this lets you convey ideas freely to other people in your circles that are also very scrupulous and unmindkilled. when interacting with people outside this circle, for whom this form of reputation is illegible, you need to find something else. depending on who you’re talking to and what kinds of things they take seriously, this could be leaning on the credibility of someone like geoff hinton, or of sam/demis/dario, or the UK government, or whatever.
this might already be what you’re doing, in which case there’s no disagreement between us.
You’re writing lots of things here but as far as I can tell you aren’t defending your opening statement, which I believe is mistaken.
I claim it is a lot more reasonable to use the reference class of “people claiming the end of the world” than “more powerful intelligences emerging and competing with less intelligent beings” when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class—this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.
Firstly, it’s just not more reasonable. When you ask yourself “Is a machine learning run going to lead to human extinction?” you should not first say “How trustworthy are people who have historically claimed the world is ending?”, you should of course primarily bring your attention to questions about what sort of machine is being built, what sort of thinking capacities it has, what sorts of actions it can take in the world, what sorts of optimization it runs, how it would behave around humans if it were more powerful than them, and so on. We can go back to discussing epistemology 101 if need be (e.g. “Hug the Query!”).
Secondly, insofar as someone believes you are a huckster or a crackpot, you should leave the conversation, communication here has broken down and you should look for other communication opportunities. However, insofar as someone is only evaluating this tentatively as one of many possible hypotheses about you then you should open yourself up to auditing / questioning by them about why you believe what you believe and your past history and your memetic influences. Being frank is the only way through this! But you shouldn’t say to them “Actually, I think you should treat me like a huckster/scammer/serf-of-a-corrupt-empire.” This feels analogous to a man on a date with a woman saying “Actually I think you should strongly privilege the hypothesis that I am willing to rape you, and now I’ll try to provide evidence for you that this is not true.” It would be genuinely a bad sign about a man that he thinks that about himself, and also he has moved the situation into a much more adversarial frame.
I suspect you could write some more narrow quick-take such as “Here is some communication advice I find helpful when talking with friends and colleagues about how AI can lead to human extinction”, but in generalizing it all the way to making dictates about basic epistemology you are making basic mistakes and getting it wrong.
Please either (1) defend and/or clarify the original statement, or (2) concede that it was mistaken, rather than writing more semi-related paragraphs about memetic immune systems.
I am confused why you think my claims are only semi-related. to me my claim is very straightforward, and the things i’m saying are straightforwardly conveying a world model that seems to me to explain why i believe my claim. i’m trying to explain in good faith, not trying to say random things. i’m claiming a theory of how people parse information, to justify my opening statement, which i can clarify as:
sometimes, people use the rhetorical move of saying something like “people think 95% doom is overconfident, yet 5% isn’t. but that’s also being 95% confident in not-doom, and yet they don’t consider that overconfident. curious.” followed by “well actually, it’s only a big claim under your reference class. under mine, i.e. the set of all instances of a more intelligent thing emerging, actually, 95% doom is less overconfident than 5% doom”. this post was inspired by seeing one such tweet, but i see claims like this every once in a while that play reference class tennis.
i think this kind of argument is really bad at persuading people who don’t already agree (from empirical observation). my opening statement is saying “please stop doing this, if you do it, and thank you for not doing this, if you don’t already do it”. the rest of my paragraphs provide an explanation of my theory for why this is bad for changing people’s minds. this seems pretty obviously relevant for justifying why we should stop doing the thing. i sometimes see people out there talk like this (including my past self at some point), and then fail to convince people, and then feel very confused about why people don’t see the error of their ways when presented with an alternative reference class. if my theory is correct (maybe it isn’t, this isn’t a super well thought out take, it’s more a shower thought), then it would explain this, and people who are failing to convince people would probably want to know why they’re failing. i did not spell this out in my opening statement because i thought it was clear, but in retrospect it was not.
i don’t think the root cause is people being irrational epistemically. i think there is a fundamental reason why people do this that is very reasonable. i think you disagree with this on the object level and many of my paragraphs are attempting to respond to what i view as the reason you disagree. this does not explicitly show up in the opening statement, but since you disagree with this, i thought it would make sense to respond to that too
i am not saying you should explicitly say “yeah i think you should treat me as a scammer until i prove otherwise”! i am also not saying you should try to argue with people who have already stopped listening to you because they think you’re a scammer! i am merely saying we should be aware that people might be entertaining that as a hypothesis, and if you try to argue by using this particular class of rhetorical move, you will only trigger their defenses further, and that you should instead just directly provide the evidence for why you should be taken seriously, in a socially appropriate manner. if i understand correctly, i think the thing you are saying one should do is the same as the thing i’m saying one should do, but phrased in a different way; i’m saying not to do a thing that you seem to already not be doing.
i think i have not communicated myself well in this conversation, and my mental model is that we aren’t really making progress, and therefore this conversation has not brought value and joy into the world in the way i intended. so this will probably be my last reply, unless you think doing so would be a grave error.
I am confused why you think my claims are only semi-related. to me my claim is very straightforward, and the things i’m saying are straightforwardly conveying a world model that seems to me to explain why i believe my claim. i’m trying to explain in good faith, not trying to say random things. i’m claiming a theory of how people parse information, to justify my opening statement,
Thank you for all this. I still think your quick take is wrong on the matter of epistemology.
I acknowledge that you make a fine point about persuasion, that someone who is primarily running the heuristic that “claims about the end of the world probably come from crackpots or scammers” will not be persuaded by someone arguing that actually 20:1 against and 20:1 in favor of a claim are equally extreme beliefs.
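(For concreteness, a minimal sketch of that arithmetic; nothing here is load-bearing, it just shows that 95% and 5% sit at mirror-image odds, whose round-number version is the “20:1” above:)

```python
import math

# P(doom) = 0.95 is 19:1 in favor and P(doom) = 0.05 is 19:1 against, i.e. the
# same distance from even odds in opposite directions (log-odds +2.94 vs -2.94).

def log_odds(p: float) -> float:
    return math.log(p / (1 - p))

for p in (0.95, 0.05):
    print(f"P(doom)={p:.2f}  odds ratio={p / (1 - p):.2f}  log-odds={log_odds(p):+.2f}")
```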
A version of the quick take that I would’ve felt was just fine would read:
Some people have basically only heard claims of human extinction coming from crackpots and scammers, and will not have thought much about the AI extinction idea on the object level. To them, this sort of argument I’ve discussed is unpersuasive at moving beyond the “is this a crackpot/scam” part of the dialogue. In this quick take I’ll outline my model of how they’re thinking about it, and give recommendations for how you should argue instead.
But your quick take doesn’t confine itself to discussing those people in those situations. It flatly says it’s true as a matter of epistemology that you should “use bigness of claim as a heuristic for how much evidence you need before you’re satisfied”, that you should “use reference classes that have consistently made good decisions irl” and that the crackpots/scammers one is the correct one to use here otherwise you’ll risk “getting pwned ideologically”.
These aren’t always the right heuristics (e.g. on this issue they are not for you and for me) and you shouldn’t say that they are just so that some people on Twitter will stop using rhetoric that isn’t working.
I believe you’re trying to do your best to empathize with people who are unpersuaded by an unsuccessful rhetorical move, a move that people who believe your position are making in public discourse. That is commendable. I think you are attempting to cause other people who hold your position to stop using that rhetorical move, by telling them off for using it, but to achieve this aim you are repeatedly saying the people who do not hold your position are doing normatively correct epistemology, and you’re justifying it with Occam’s razor and reference class forecasting, and this is all wrong. In some situations for some people it is reasonable to primarily use these heuristics, and in other situations for other people it is not. I’m not arguing that the people unpersuaded are being unreasonable, but (for example) your opening sentence makes fully-general statements about how to reason about this issue that I believe are false. Rule number one of good discourse: don’t make false statements about epistemology in order to win an object level point.
Yep, seems fine to drop this here; I make no bid of you to reply further.
(I would never make knowingly false statements about epistemology to try to win an object level point; I still disagree with your claims about epistemology and believe that my epistemology arguments are in good faith and capture truth in some way. This disagreement might be because I’ve not communicated myself well. I originally wasn’t going to reply but I felt the need to say this because your comment can be viewed as accusing me of intellectual/epistemic dishonesty, even if that wasn’t your intention.)
Firstly, it’s just not more reasonable. When you ask yourself “Is a machine learning run going to lead to human extinction?” you should not first say “How trustworthy are people who have historically claimed the world is ending?”
But you should absolutely ask “does it look like I’m making the same mistakes they did, and how would I notice if it were so?” Sometimes you are indeed in a cult with your methods of reason subverted, or having a psychotic break, or captured by a content filter that hides the counterevidence, or subject to many of the more mundane and pervasive failures of the same kind.
But not in full generality! This is a fine question to raise in this context, but the correct thing to do in basically all situations is to consider the object level, and then also let yourself notice if people are unusually insane around a subject, or insane for a particular reason. Sometimes that is the decisive factor, but for all questions, the best first pass is to think about how that part of the world works, rather than to think about the other monkeys who have talked about it in the past.
The heuristic “be more skeptical of claims that would have big implications if true” makes sense only when you suspect a claim may have been adversarially optimized for memetic fitness; it is not otherwise true that “a claim that something really bad is going to happen is fundamentally less likely to be true than other claims”.
This seems wrong to me.
a. More smaller things happen and there are fewer kinds of smaller things that happen.
b. I bet people genuinely have more evidence for small claims they state than big ones on average.
c. The skepticism you should have because particular claims are frequently adversarially generated shouldn’t first depend on deciding to be skeptical about it.
If you’ll forgive the lack of charity, ISTM that leogao is making IMO largely true points about the reference class and then doing the wrong thing with those points, and you’re reacting to the thing being done wrong at the end, but trying to do this in part by disagreeing with the points being made about the reference class. leogao is right that people are reasonable in being skeptical of this class of claims on priors, and right that when communicating with someone it’s often best to start within their framing. You are right that regardless it’s still correct to evaluate the sum of evidence for and against a proposition, and that other people failing to communicate honestly in this reference class doesn’t mean we ought to throw out or stop contributing to the good faith conversations available to us.
i’m not even saying people should not evaluate evidence for and against a proposition in general! it’s just that this is expensive, and so it is perfectly reasonable to have heuristics to decide which things to evaluate, and so you should first prove with costly signals that you are not pwning them, and then they can weigh the evidence. until you can provide enough evidence that you’re not pwning them for it to be worth their time to evaluate your claims in detail, it should not be surprising that many people won’t listen to the evidence. and even if they do listen, if there is still lingering suspicion that they are being pwned, you need to provide the kind of evidence that could persuade someone that they aren’t getting pwned (for which being credibly very honest and truth seeking is necessary but not sufficient), which is sometimes different from mere compellingness of argument.
I think the framing that sits better to me is ‘You should meet people where they’re at.’ If they seem like they need confidence that you’re arguing from a place of reason, that’s probably indeed the place to start.
a. I expect there is a slightly more complicated relationship between my value-function and the likely configuration states of the universe than literally zero correlation, but most configuration states do not support life and have us all dead, so in one sense a claim that in the future something very big and bad will happen is far more likely on priors. One might counter that we live in a highly optimized society where things being functional and maintained is an equilibrium state and it’s unlikely for systems to get out of whack enough for bad things to happen. But taking this straightforwardly is extremely naive; tons of bad things happen all the time to people. I’m not sure whether to focus on ‘big’ or ‘bad’, but either way, the human sense of these is not what the physical universe is made out of or cares about, and so this looks like an unproductive heuristic to me.
b. On the other hand, I suspect the bigger claims are more worth investing time to find out if they’re true! All of this seems too coarse-grained to produce a strong baseline belief about big claims or small claims.
c. I don’t get this one. I’m pretty sure I said that if you believe that you’re in a highly adversarial epistemic environment, then you should become more distrusting of evidence about memetically fit claims.
I don’t know what true points you think Leo is making about “the reference class”, nor which points you think I’m inaccurately pushing back on that are true about “the reference class” but not true of me. Going with the standard rationalist advice, I encourage everyone to taboo “reference class” and replace it with a specific heuristic. It seems to me that “reference class” is pretending that these groupings are more well-defined than they are.
c. I don’t get this one. I’m pretty sure I said that if you believe that you’re in a highly adversarial epistemic environment, then you should become more distrusting of evidence about memetically fit claims.
Well, sure, it’s just that you seemed to frame this as a binary on/off thing (sometimes you’re exposed and need to count it, sometimes you’re not), whereas to me it’s basically never implausible that a belief has been exposed to selection pressures, and the question is one of probabilities and degrees.
I think you’re correct. There’s a synergistic feedback loop between alarmism and social interaction that filters out pragmatic perspectives, creating the illusion that the doom surrounding any given topic is more prevalent than it really is, or even that it’s near universal.
Even before the rise of digital information, the feedback phenomenon could be observed in any insular group. In today’s environment, where a lot of effort goes into exploiting that feedback loop, it requires a conscious effort to maintain perspective, or even to remain aware that there are other perspectives.
I think the group of people “claiming the end of the world” in the case of AI x-risk is importantly more credentialed and reasonable-looking than the people behind most prior claims about the end of the world. From the reference class and general heuristics perspective that you’re talking about[1], I think how credible-looking the people are is pretty important.
So, I think the reference class is more like claims of nuclear armageddon than cults. (Plausibly near maximally alarmist climate people are in a similar reference class.)
I agree this reference class is better, and implies a higher prior, but I think it’s reasonable for the prior over “arbitrary credentialed people warning about something” to still be relatively low in an absolute sense: lots of people have impressive sounding credentials that are not actually good evidence of competence (consider: it’s basically a meme at this point that whenever you see a book where the author puts “PhD” after their name, they probably are a grifter / their phd was probably kinda bs), and also there is a real negativity bias where fearmongering is amplified by both legacy and social media. Also, for the purposes of understanding normal people, it’s useful to keep in mind that trust in credentials and institutions is not very high right now in the US among genpop.
This is kind of missing the point of Bayes. One shouldn’t “choose” a reference class to update on. One should update to the best of your ability on the whole distribution of hypotheses available to describe the situation. Neither is a ‘right’ or ‘wrong’ reference class to use, they’re both just valid pieces of evidence about base rates, and you should probably be using both of them.
It seems you have in mind something like inference to the best explanation here. Bayesian updating, on the other hand, does need a prior distribution, and the question of which prior distribution to use cannot be waved away when there is a disagreement on how to update. In fact, that’s one of the main problems of Bayesian updating, and the reason why it is often not used in arguments.
I’m not really sure what that has to do with my comment. My point is the original post seemed to be operating as if you look for the argmax reference class, you start there, and then you allow arguments. My point isn’t that their prior is wrong, it’s that this whole operation is wrong.
I think also you’re maybe assuming I’m saying the prior looks something like {reference class A, reference class B} and arguing about the relative probability of each, but it doesn’t; a prior should be over all valid explanations of the prior evidence. Reference classes come in because they’re evidence about base rates of particular causal structures; you can say ‘given the propensity for the world to look this way, how should I be correcting the probability of the hypotheses under consideration? Which new hypotheses should I be explicitly tracking?’
I can see where the original post might have gone astray. People have limits on what they can think about and it’s normal to narrow one’s consideration to the top most likely hypothesis. But it’s important to be aware of what you’re approximating here, else you get into a confusion where you have two valid reference classes and you start telling people that there’s a correct one to start arguing from.
I agree this is an interesting philosophical question but again I’m not sure why you’re bringing it up.
Given your link maybe you think me mentioning Bayes was referring to some method of selecting a single final hypothesis? I’m not, I’m using it to refer to the Bayesian update rule.
It seems the updating rule doesn’t tell you anything about the original argument even when you view information about reference classes as evidence rather than as a method of assigning prior probabilities to hypotheses. Or does it? Can you rephrase the argument in a proper Bayesian way such that it becomes clearer? Note that how strongly some evidence confirms or disconfirms a hypothesis also depends on a prior.
What argument are you referring to when you say “doesn’t tell you anything about the original argument”?
My framing is basically this: you generally don’t start a conversation with someone as a blank pre-priors slate that you get to inject your priors into. The prior is what you get handed, and then the question is how people should respond to the evidence and arguments available. Well, you should use (read: approximate) the basic Bayesian update rule: hypotheses where an observation is unlikely are that much less probable.
I think you’re underestimating the inferential gap here. I’m not sure why you’d think the Bayes updating rule is meant to “tell you anything about” the original post. My claim was that the whole proposal about selecting reference classes was framed badly and you should just do (approximate) Bayes instead.
You’re having a conversation with someone. They believe certain things are more probable than other things. They mention a reference class: if you look at this grouping of claims, most of them are wrong. Then you consider the set of hypotheses: under each of them, how plausible is it given the noted tendency for this grouping of claims to be wrong? Some of them pass easily, e.g. the hypothesis that this is just another such claim. Some pass less easily; they are either a modal part of this group and uncommon on base rate, or else nonmodal or not part of the group at all. You continue, with maybe a different reference class, or an observation about the scenario.
Hopefully this illustrates the point. Reference classes are just evidence about the world. There’s no special operation needed for them.
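To make that concrete, here is a minimal sketch; the hypothesis labels, the bayes_update helper, and all the numbers are invented purely for illustration:

```python
# reference classes enter as ordinary evidence: each observation ("most past
# claims in this grouping were wrong", "credentialed researchers are warning
# in public") has some likelihood under each hypothesis, and you multiply it
# in and renormalize. no special operation for "choosing" a reference class.
# hypothesis labels and all numbers below are made up for illustration.

def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Posterior proportional to prior * likelihood, renormalized."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnorm.values())
    return {h: mass / total for h, mass in unnorm.items()}

beliefs = {"just another doomsday claim": 0.90, "genuine risk": 0.10}

# evidence: the claim belongs to a grouping whose past members were mostly wrong
beliefs = bayes_update(beliefs, {"just another doomsday claim": 0.95, "genuine risk": 0.40})

# evidence: a second grouping, e.g. credentialed researchers publicly warning about it
beliefs = bayes_update(beliefs, {"just another doomsday claim": 0.10, "genuine risk": 0.60})

print(beliefs)  # both groupings moved the posterior; neither had to be "chosen"
```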
for basically all of them, the right action empirically in retrospect was to not think too much about it.
False?
Climate change tail scenarios are worth studying and averting. Nuclear winter was obviously worth studying and averting back in the Cold War, and still is today. The 2008 financial crisis was worth studying and averting.
Do you not believe average citizens can study issues like these and make moves to solve them?
The reference classes you should use work as a heuristic because there is some underlying mechanism that makes them work. So you should use reference classes in situations where their underlying mechanism is expected to work.
Maybe the underlying mechanism of doomsday predictions not working is that people predicting doom don’t make their predictions based on valid reasoning. So if someone uses that reference class to doubt AI risk, this should be judged as them making a claim that the reasoning of people predicting AI doom is similar to the reasoning of people in cults predicting Armageddon.
I claim it is a lot more reasonable to use the reference class of “people claiming the end of the world” than “more powerful intelligences emerging and competing with less intelligent beings” when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class—this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.
this raises the question: how should you pick which reference class to use, in general? how do you prevent reference class tennis, where you argue back and forth about what is the right reference class to use? I claim the solution is you want to use reference classes that have consistently made good decisions irl. the point of reference classes is to provide a heuristic to quickly apply judgement to large swathes of situations that you don’t have time to carefully examine. this is important because otherwise it’s easy to get tied up by bad actors who avoid being refuted by making their beliefs very complex and therefore hard to argue against.
the big problem with the latter reference class is it’s not like anyone has had many experiences using it to make decisions ex ante, and if you squint really hard to find day to day examples, they don’t all work out the same way. smarter humans do mostly tend to win over less smart humans. but if you work at a zoo, you will almost always be more worried about physical strength and aggressiveness when putting different species in the same enclosure. if you run a farm (or live in Australia), you’re very worried about relatively dumb invasive animals like locusts and rabbits.
on the other hand, everyone has personally experienced a dozen different doomsday predictions. whether that’s your local church or faraway cult warning about Armageddon, or Y2K, or global financial collapse in 2008, or the maximally alarmist climate people, or nuclear winter, or peak oil. for basically all of them, the right action empirically in retrospect was to not think too much about it. there are many concrete instances of people saying “but this is different” and then getting burned.
and if you allow any reference class to be on as strong a footing as very well established reference classes, then you open yourself up to getting pwned ideologically. “all complex intricate objects we have seen created have been created by something intelligent, therefore the universe must also have an intelligent creator.” it’s a very important memetic defense mechanism.
(to be clear this doesn’t mean you can only believe things others believe, or that humans taking over earth is not important evidence, or that doomsday is impossible!! I personally think AGI will probably kill everyone. but this is a big claim and should be treated as such. if we don’t accept this, then we will forever fail to communicate with people who don’t already agree with us on AGI x-risk.)
This all seems wrongheaded to me.
I endeavor to look at how things work and describe them accurately. Similarly to how I try to describe how a piece of code works, or how to to build a shed, I will try to accurately describe the consequences of large machine learning runs, which can include human extinction.
This isn’t how I think about things. Reality is what exists, and if a claim accurately describes reality, then I should not want to hold it to higher standards than claims that do not describe reality. I don’t think it’s a good epistemology to rank claims by “bigness” and then say that the big ones are less likely and need more evidence. On the contrary, I think it’s worth investing more in finding out if they’re right, and generally worth bringing them up to consideration with less evidence than for “small” claims.
I don’t have the experiences you’re describing. I don’t go to churches, I don’t visit cults, I was 3yrs old in the year 2000, I was 11 for the ’08 financial crash and having read about it as an adult I don’t recall extinction being a topic of discussion, I think I have heard of climate people saying that via alarmist news headlines but I have not had anyone personally try to convince me of this or even say that they believe it. I have heard it discussed for nuclear winter, yes, and I think nukes are quite scary and it was reasonable to consider, I did not dismiss it out of hand and wouldn’t use that heuristic. I don’t know what the oil thing is.
In other words, I don’t recall anyone seriously trying to convince me that the world was ending except in cases where they had good reason to believe it. In my life, when people try to warn me about big things, especially if they’ve given it serious thought, usually I’ve found it’s been worthwhile for me to consider it. (I like to think I am good at steering clear of scammers and cranks, so that I can trust the people in my life when they tell me things.)
The sense I get from this post is that, in it, you’re assuming everyone else in the world is constantly being assaulted with claims meant to scare and control them rather than people attempting to describe the world accurately. I agree there are forces doing that, but I think this post gives up all too quickly on there being other forces in the world that aren’t doing that that people can recognize and trust.
i am also trying to accurately describe reality. what i’m saying is, even from the perspective of someone smart and truth-seeking but who doesn’t know much about the object-level, it is very reasonable to use bigness of claim as a heuristic for how much evidence you need before you’re satisfied, and that if you don’t do this, you will be worse at finding the truth in practice. my guess is this applies even more so to the average person.
i think this is very analogous to occam’s razor / trust region optimization. clearly, we need to discount theories based on complexity because there are exponentially more complex theories compared to simple ones, many of which have no easily observable difference to the simpler ones, opening you up to being pwned. and empirically it seems a good heuristic to live life by. complex theories can still be true! but given two theories that both accurately describe reality, you want the simpler one. similarly, given two equally complex claims that accurately describe the evidence, you want the one that is less far fetched from your current understanding of the world / requires changing less of your worldview.
also, it doesn’t have to be something you literally personally experienced. it’s totally valid to read the wikipedia page on the branch davidians or whatever and feel slightly less inclined to take things that have similar vibes seriously, or even to absorb the vibe from your environs (your aversion to scammers and cranks surely did not come ex nihilo, right?)
for most of the examples i raised, i didn’t necessarily mean the claim was literally 100% human extinction, and i don’t think it matters that it wasn’t. first, because the important thing is the vibe of the claim (catastrophic) - since we’re talking about heuristics on how seriously to take things that you don’t have time to deep dive on, the rule has to be relatively cheap to implement. i think most people, even quite smart people, genuinely don’t feel much of an emotional difference between literal human extinction vs collapse of society vs half of people dying painfully, unless they first spend a half hour carefully thinking about the implications of extinction. (and even then depending on their values they may still not feel a huge difference)
also, it would be really bad if you could weasel your way out of a reference class that easily; it would be rife for abuse by bad actors—“see, our weird sect of christianity claims that after armageddon, not only will all actual sinners’ souls be tortured forever, but that the devil will create every possible sinner’s soul to torture forever! this is actually fundamentally different from all existing christian theories, and it would be unfathomably worse, so it really shouldn’t be thought of as the same kind of claim”
even if most people are trying to describe the world accurately (which i think is not true and we only get this impression because we live in a strange bubble of very truth seeking people + are above-average capable at understanding things object level and therefore quickly detecting scams), ideas are still selected for memeticness. i’m sure that 90% of conspiracy theorists genuinely believe that humanity is controlled by lizards and are trying their best to spread what they believe to be true. many (not all) of the worst atrocities in history have been committed by people who genuinely thought they were on the side of truth and good.
(actually, i think people do get pwned all the time, even in our circles. rationalists are probably more likely than average (controlling for intelligence) to get sucked into obviously culty things (e.g zizians), largely because they don’t have the memetic antibodies needed to not get pwned, for one reason or another. so probably many rationalists would benefit from evaluating things a little bit more on vibes/bigness and a little bit less on object level)
Your points about Occam’s razor have got nothing to do with this subject[1]. The heuristic “be more skeptical of claims that would have big implications if true” makes sense only when you suspect a claim may have been adversarially optimized for memetic fitness; it is not otherwise true that “a claim that something really bad is going to happen is fundamentally less likely to be true than other claims”.
I’m having a little trouble connecting your various points back to your opening paragraph, which is the primary thing that I am trying to push back on.[2]
To restate the message I’m reading here: “Give up on having a conversation where you evaluate the evidence alongside your interlocutors. Instead frame yourself as trying to convince them of something, and assume that they are correct to treat your communications as though you are adversarially optimizing for them believing whatever you want them to believe.” This assumption seems to give up a lot of my ability to communicate with people (almost ~all of it), and I refuse to simply do it because some amount of communication in the world is adversarially optimized, and I’m definitely not going to do it because of a spurious argument that Occam’s razor implies that “claims about things being really bad or claims that imply you need to take action are fundamentally less likely to be true”.
You are often in an environment where people are trying to use language to describe reality, and in that situation the primary thing to evaluate is not the “bigness” of a claim, but the evidence for and against it. I recommend instead to act in such a way as to increase the size and occurrence of that environment more-so than “act as though it’s correct to expect maximum adversarial optimization in communications”.
(Meta: The only literal quotes of Leo’s in this comment are the big one in the quote block, my use of “” is to hold a sentence as object, they are not things Leo wrote.)
I agree that the more strongly a claim implies that you should take action, then the more you should consider that it is being optimized adversarially for you to take action. For what it’s worth, I think that heuristic applies more so to claims that you should personally take action. Most people have little action to directly prevent the end of the world from AI; this is a heuristic more naturally applied to claims that you need to pay fines (which are often scams/spam). But mostly, when people give me claims that imply action, they are honestly meant claims and I do the action. This is the vast majority of my experience.
Aside to Leo: Rather than reply point-by-point to the each of the paragraphs in the second comment, I will try restating and responding to the core message I got in the opening paragraph of the first comment. I’m doing this because the paragraphs in the second-comment seemed somewhat distantly related / I couldn’t tell whether the points were actually cruxy. They were responding to many different things, and I hope restating the core thing will better respond to your core point. However I don’t mean to avoid key arguments, if you think I have done so feel free to tell me one or two paragraphs you would especially like me to engage with and I will do so in any future reply.
in practice many of the claims you hear will be optimized for memetic fitness, even if the people making the claims are genuine. well intentioned people can still be naive, or have blind spots, or be ideologically captured.
also, presumably the people you are trying to convince are on average less surrounded by truth seeking people than you are (because being in the alignment community is strongly correlated with caring about seeking truth).
i don’t think this gives up your ability to communicate with people. you simply have to signal in some credible way that you are not only well intentioned but also not merely the carrier of some very memetic idea that slipped past your antibodies. there are many ways to accomplish this. for example, you can build up a reputation of being very scrupulous and unmindkilled. this lets you convey ideas freely to other people in your circles that are also very scrupulous and unmindkilled. when interacting with people outside this circle, for whom this form of reputation is illegible, you need to find something else. depending on who you’re talking to and what kinds of things they take seriously, this could be leaning on the credibility of someone like geoff hinton, or of sam/demis/dario, or the UK government, or whatever.
this might already be what you’re doing, in which case there’s no disagreement between us.
You’re writing lots of things here but as far as I can tell you aren’t defending your opening statement, which I believe is mistaken.
Firstly, it’s just not more reasonable. When you ask yourself “Is a machine learning run going to lead to human extinction?” you should not first say “How trustworthy are people who have historically claimed the world is ending?”, you should of course primarily bring your attention to questions about what sorts of machine is being built, what sort of thinking capacities it has, what sorts of actions it can take in the world, what sorts of optimization it runs, how it would behave around humans if it were more powerful than them, and so on. We can go back to discussing epistemology 101 if need be (e.g. “Hug the Query!”).
Secondly, insofar as someone believes you are a huckster or a crackpot, you should leave the conversation, communication here has broken down and you should look for other communication opportunities. However, insofar as someone is only evaluating this tentatively as one of many possible hypotheses about you then you should open yourself up to auditing / questioning by them about why you believe what you believe and your past history and your memetic influences. Being frank is the only way through this! But you shouldn’t say to them “Actually, I think you should treat me like a huckster/scammer/serf-of-a-corrupt-empire.” This feels analogous to a man on a date with a woman saying “Actually I think you should strongly privilege the hypothesis that I am willing to rape you, and now I’ll try to provide evidence for you that this is not true.” It would be genuinely a bad sign about a man that he thinks that about himself, and also he has moved the situation into a much more adversarial frame.
I suspect you could write some more narrow quick-take such as “Here is some communication advice I find helpful when talking with friends and colleagues about how AI can lead to human extinction”, but in generalizing it all the way to making dictates about basic epistemology you are making basic mistakes and getting it wrong.
Please either (1) defend and/or clarify the original statement, or (2) concede that it was mistaken, rather than writing more semi-related paragraphs about memetic immune systems.
I am confused why you think my claims are only semi related. to me my claim is very straightforward, and the things i’m saying are straightforwardly converying a world model that seems to me to explain why i believe my claim. i’m trying to explain in good faith, not trying to say random things. i’m claiming a theory of how people parse information, to justify my opening statement, which i can clarify as:
sometimes, people use the rhetorical move of saying something like “people think 95% doom is overconfident, yet 5% isn’t. but that’s also being 95% confident in not-doom, and yet they don’t consider that overconfident. curious.” followed by “well actually, it’s only a big claim under your reference class. under mine, i.e the set of all instances of a more intelligent thing emerging, actually, 95% doom is less overconfident than 5% doom” this post was inspired by seeing one such tweet, but i see such claims like this every once in a while that play reference class tennis.
i think this kind of argument is really bad at persuading people who don’t already agree (from empirical observation). my opening statement is saying “please stop doing this, if you do it, and thank you for not doing this, if you dont already do it” the rest of my paragraphs provide an explanation of my theory for why this is bad for changing people’s minds. this seems pretty obviously relevant for justifying why we should stop doing the thing. i sometimes see people out there talk like this (including my past self at some point), and then fail to convince people, and then feel very confused about why people don’t see the error of their ways when presented with an alternative reference class. if my theory is correct (maybe it isn’t, this isn’t a super well thought out take, it’s more a shower thought), then it would explain this, and people who are failing to convince people would probably want to know why they’re failing. i did not spell this out in my opening statement because i thought it was clear but in retrospect this was not clear from the opening statement
i don’t think the root cause is people being irrational epistemically. i think there is a fundamental reason why people do this that is very reasonable. i think you disagree with this on the object level and many of my paragraphs are attempting to respond to what i view as the reason you disagree. this does not explicitly show up in the opening statement, but since you disagree with this, i thought it would make sense to respond to that too
i am not saying you should explicitly say “yeah i think you should treat me as a scammer until i prove otherwise”! i am also not saying you should try to argue with people who have already stopped listening to you because they think you’re a scammer! i am merely saying that we should be aware that people might be entertaining that as a hypothesis, that if you argue using this particular class of rhetorical move you will only trigger their defenses further, and that you should instead just directly provide the evidence for why you should be taken seriously, in a socially appropriate manner. if i understand correctly, the thing you are saying one should do is the same as the thing i’m saying one should do, just phrased differently; i’m saying not to do a thing that you seem to already not be doing.
i think i have not communicated myself well in this conversation, and my mental model is that we aren’t really making progress, and therefore this conversation has not brought value and joy into the world in the way i intended. so this will probably be my last reply, unless you think doing so would be a grave error.
Thank you for all this. I still think your quick take is wrong on the matter of epistemology.
I acknowledge that you make a fine point about persuasion, that someone who is primarily running the heuristic that “claims about the end of the world are probably crack-pots or scammers” will not be persuaded by someone arguing that actually 20:1 against and 20:1 in favor of a claim are equally extreme beliefs.
A version of the quick take that I would’ve felt was just fine would read:
But your quick take doesn’t confine itself to discussing those people in those situations. It flatly says it’s true as a matter of epistemology that you should “use bigness of claim as a heuristic for how much evidence you need before you’re satisfied”, that you should “use reference classes that have consistently made good decisions irl” and that the crackpots/scammers one is the correct one to use here otherwise you’ll risk “getting pwned ideologically”.
These aren’t always the right heuristics (e.g. on this issue they are not, for you or for me), and you shouldn’t say that they are just so that some people on Twitter will stop using rhetoric that isn’t working.
I believe you’re trying to do your best to empathize with people who are unpersuaded by an unsuccessful rhetorical move, a move that people who believe your position are making in public discourse. That is commendable. I think you are attempting to get other people who hold your position to stop using that rhetorical move, by telling them off for using it, but to achieve this aim you are repeatedly saying that the people who do not hold your position are doing normatively correct epistemology, and you’re justifying it with Occam’s razor and reference class forecasting, and this is all wrong. In some situations for some people it is reasonable to primarily use these heuristics, and in other situations for other people it is not. I’m not arguing that the people unpersuaded are being unreasonable, but (for example) your opening sentence makes fully general statements about how to reason about this issue that I believe are false. Rule number one of good discourse: don’t make false statements about epistemology in order to win an object-level point.
Yep, seems fine to drop this here; I make no bid of you to reply further.
(I would never make knowingly false statements about epistemology to try to win an object level point; I still disagree with your claims about epistemology and believe that my epistemology arguments are in good faith and capture truth in some way. This disagreement might be because I’ve not communicated myself well. I originally wasn’t going to reply but I felt the need to say this because your comment can be viewed as accusing me of intellectual/epistemic dishonesty, even if that wasn’t your intention.)
(I affirm that I don’t believe you were being knowingly dishonest or deceptive at any point in this thread.)
But you should absolutely ask “does it look like I’m making the same mistakes they did, and how would I notice if it were so?” Sometimes one is indeed in a cult with one’s methods of reason subverted, or having a psychotic break, or captured by a content filter that hides the counterevidence, or subject to one of the many more mundane and pervasive failures of this kind.
But not in full generality! This is a fine question to raise in this context, but in basically all situations the correct thing to do is to consider the object level, and then also let yourself notice if people are unusually insane around a subject, or insane for a particular reason. Sometimes that is the decisive factor, but for all questions, the best first pass is to think about how that part of the world works, rather than about the other monkeys who have talked about it in the past.
This seems wrong to me.
a. More smaller things happen than bigger ones, and there are fewer kinds of smaller thing that happen.
b. I bet people genuinely have more evidence for small claims they state than big ones on average.
c. The skepticism you should have because particular claims are frequently adversarially generated shouldn’t first depend on deciding to be skeptical of them.
If you’ll forgive the lack of charity, ISTM that leogao is making IMO largely true points about the reference class and then doing the wrong thing with those points, and you’re reacting to the thing being done wrong at the end, but trying to do so in part by disagreeing with the points being made about the reference class. leogao is right that people are reasonable in being skeptical of this class of claims on priors, and right that when communicating with someone it’s often best to start within their framing. You are right that regardless it’s still correct to evaluate the sum of evidence for and against a proposition, and that other people failing to communicate honestly in this reference class doesn’t mean we ought to throw out or stop contributing to the good-faith conversations available to us.
i’m not even saying people should not evaluate evidence for and against a proposition in general! it’s just that this is expensive, so it is perfectly reasonable to have heuristics for deciding which things to evaluate. you should first prove with costly signals that you are not pwning them, and then they can weigh the evidence. until you can provide enough evidence that you’re not pwning them for it to be worth their time to evaluate your claims in detail, it should not be surprising that many people won’t listen to the evidence. and even if they do listen, if there is still lingering suspicion that they are being pwned, you need to provide the kind of evidence that could persuade someone that they aren’t getting pwned (for which being credibly very honest and truth seeking is necessary but not sufficient), which is sometimes different from mere compellingness of argument.
I think the framing that sits better with me is ‘You should meet people where they’re at.’ If they seem like they need confidence that you’re arguing from a place of reason, that’s probably indeed the place to start.
Thanks for the comment. (Upvoted.)
a. I expect there is a slightly more complicated relationship between my value-function and the likely configuration states of the universe than literally zero correlation, but most configuration states do not support life and we are all dead in them, so in one sense a claim that something very big and bad will happen in the future is far more likely on priors. One might counter that we live in a highly optimized society where things being functional and maintained is an equilibrium state, and that it’s unlikely for systems to get out of whack enough for bad things to happen. But taking this straightforwardly is extremely naive; tons of bad things happen to people all the time. I’m not sure whether to focus on ‘big’ or ‘bad’, but either way, the human sense of these is not what the physical universe is made out of or cares about, and so this looks like an unproductive heuristic to me.
b. On the other hand, I suspect the bigger claims are more worth investing time to find out if they’re true! All of this seems too coarse-grained to produce a strong baseline belief about big claims or small claims.
c. I don’t get this one. I’m pretty sure I said that if you believe that you’re in a highly adversarial epistemic environment, then you should become more distrusting of evidence about memetically fit claims.
I don’t know what true points you think Leo is making about “the reference class”, nor which points you think I’m inaccurately pushing back on that are true about “the reference class” but not true of me. Going with the standard rationalist advice, I encourage everyone to taboo “reference class” and replace it with a specific heuristic. It seems to me that “reference class” is pretending that these groupings are more well-defined than they are.
Well, sure, it’s just that you seemed to frame this as a binary on/off thing: sometimes you’re exposed and need to count it, and sometimes you’re not. Whereas to me it’s basically never implausible that a belief has been exposed to selection pressures, and the question is one of probabilities and degrees.
I think you’re correct. There’s a synergistic feedback loop between alarmism and social interaction that filters out pragmatic perspectives, creating the illusion that the doom surrounding any given topic is more prevalent than it really is, or even that it’s near universal.
Even before the rise of digital information, the feedback phenomenon could be observed in any insular group. In today’s environment, where a lot of effort goes into exploiting that feedback loop, it takes a conscious effort to maintain perspective, or even to remain aware that there are other perspectives.
I think the group of people “claiming the end of the world” in the case of AI x-risk is importantly more credentialed and reasonable-looking than most prior claims about the end of the world. From the reference class and general heuristics perspective that you’re talking about[1], I think how credible looking the people are is pretty important.
So, I think the reference class is more like claims of nuclear armageddon than cults. (Plausibly near maximally alarmist climate people are in a similar reference class.)
IDK how I feel about this perspective overall.
I agree this reference class is better and implies a higher prior, but I think it’s reasonable for the prior over “arbitrary credentialed people warning about something” to still be relatively low in an absolute sense: lots of people have impressive-sounding credentials that are not actually good evidence of competence (consider: it’s basically a meme at this point that whenever you see a book where the author puts “PhD” after their name, they’re probably a grifter / their PhD was probably kinda bs), and there is also a real negativity bias where fearmongering is amplified by both legacy and social media. Also, for the purposes of understanding normal people, it’s useful to keep in mind that trust in credentials and institutions is not very high right now in the US among genpop.
You shouldn’t. This epistemic bath has no baby in it, and we should throw the water out.
This is kind of missing the point of Bayes. One shouldn’t “choose” a reference class to update on. One should update to the best of your ability on the whole distribution of hypotheses available to describe the situation. Neither is a ‘right’ or ‘wrong’ reference class to use, they’re both just valid pieces of evidence about base rates, and you should probably be using both of them.
It seems you have in mind something like inference to the best explanation here. Bayesian updating, on the other hand, does need a prior distribution, and the question of which prior distribution to use cannot be waved away when there is a disagreement on how to update. In fact, that’s one of the main problems of Bayesian updating, and the reason why it is often not used in arguments.
I’m not really sure what that has to do with my comment. My point is the original post seemed to be operating as if you look for the argmax reference class, you start there, and then you allow arguments. My point isn’t that their prior is wrong, it’s that this whole operation is wrong.
I think you’re also maybe assuming I’m saying the prior looks something like {reference class A, reference class B} and that we’re arguing about the relative probability of each, but it doesn’t: a prior should be over all valid explanations of the prior evidence. Reference classes come in because they’re evidence about base rates of particular causal structures; you can ask, ‘given the propensity for the world to look this way, how should I be correcting the probability of the hypotheses under consideration? Which new hypotheses should I be explicitly tracking?’
I can see where the original post might have gone astray. People have limits on what they can think about and it’s normal to narrow one’s consideration to the top most likely hypothesis. But it’s important to be aware of what you’re approximating here, else you get into a confusion where you have two valid reference classes and you start telling people that there’s a correct one to start arguing from.
… but that still leaves the problem of which prior distribution should be used.
I agree this is an interesting philosophical question but again I’m not sure why you’re bringing it up.
Given your link maybe you think me mentioning Bayes was referring to some method of selecting a single final hypothesis? I’m not, I’m using it to refer to the Bayesian update rule.
It seems the updating rule doesn’t tell you anything about the original argument even when you view information about reference classes as evidence rather than as a method of assigning prior probabilities to hypotheses. Or does it? Can you rephrase the argument in a proper Bayesian way such that it becomes clearer? Note that how strongly some evidence confirms or disconfirms a hypothesis also depends on a prior.
What argument are you referring to when you say “doesn’t tell you anything about the original argument”?
My framing is basically this: you generally don’t start a conversation with someone as a blank pre-priors slate that you get to inject your priors into. The prior is what you get handed, and then the question is how people should respond to the evidence and arguments available. Well, you should use (read: approximate) the basic Bayesian update rule: hypotheses where an observation is unlikely are that much less probable.
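To spell out the rule being gestured at here (this is just the standard formula, nothing specific to this thread): for hypotheses $H_1, \dots, H_n$ and an observation $E$,

$$P(H_i \mid E) = \frac{P(E \mid H_i)\,P(H_i)}{\sum_j P(E \mid H_j)\,P(H_j)},$$

so hypotheses under which the observation is unlikely lose probability mass to hypotheses under which it is likely.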
I meant leogao’s argument above.
I think you’re underestimating the inferential gap here. I’m not sure why you’d think the Bayes updating rule is meant to “tell you anything about” the original post. My claim was that the whole proposal about selecting reference classes was framed badly and you should just do (approximate) Bayes instead.
And what would this look like? Can you reframe the original argument accordingly?
It’s just Bayes, but I’ll give it a shot.
You’re having a conversation with someone. They believe certain things are more probable than other things. They mention a reference class: if you look at this grouping of claims, most of them are wrong. Then you consider the set of hypotheses: how plausible is each of them, given the noted tendency for this grouping of claims to be wrong? Some of them pass easily, e.g. the hypothesis that this is just another such claim. Some of them pass less easily: they are either a modal part of this group and uncommon on base rate, or else nonmodal, or not part of the group at all. You continue, with maybe a different reference class, or an observation about the scenario.
Hopefully this illustrates the point. Reference classes are just evidence about the world. There’s no special operation needed for them.
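For concreteness, here is a minimal sketch in Python of treating a reference class as one more piece of evidence in an ordinary Bayesian update, rather than as a prior you must pick first. The hypothesis names and numbers are illustrative assumptions, not anything claimed in this thread.

```python
def bayes_update(prior, likelihood):
    """One Bayesian update: reweight each hypothesis by how plausible
    the observation is under it, then renormalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Prior over hypotheses about a doom claim (made-up numbers).
prior = {
    "just another crackpot/scam claim": 0.90,
    "honest but mistaken": 0.08,
    "honest and correct": 0.02,
}

# Observation: the claim belongs to a grouping in which most past claims were wrong.
# How plausible is that observation under each hypothesis? (Also made-up.)
reference_class_likelihood = {
    "just another crackpot/scam claim": 0.95,
    "honest but mistaken": 0.60,
    "honest and correct": 0.40,
}

posterior = bayes_update(prior, reference_class_likelihood)
print(posterior)

# Further evidence (costly signals of honesty, credentials, object-level
# arguments) would be folded in the same way, one update at a time.
```

The point of the sketch is only that the reference class enters through the likelihood column like any other observation; there is no separate step where you first decide which reference class is “the correct one” to argue from.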
False?
Climate change tail scenarios are worth studying and averting. Nuclear winter was obviously worth studying and averting back in the Cold War, and still is today. The 2008 financial crisis was worth studying and averting.
Do you not believe average citizens can study issues like these and make moves to solve them?
The reference classes you should use work as heuristics because there is some underlying mechanism that makes them work. So you should use a reference class in situations where its underlying mechanism is expected to apply.
Maybe the underlying mechanism of doomsday predictions not working is that people predicting doom don’t make their predictions based on valid reasoning. So if someone uses that reference class to doubt AI risk, this should be judged as them claiming that the reasoning of people predicting AI doom is similar to that of people in cults predicting Armageddon.