# [Question] Is LessWrong dead without Cox’s theorem?

Cox’s theorem seems to be pretty important to you guys, but it’s looking kind of weak right now with Halpern’s counter-example, so I was wondering: what implications does Cox’s theorem not being true have for LessWrong? There seem to be very few discussions on LessWrong about alternative formulations for fixing probability theory as extended logic in light of Halpern’s paper. I find this quite surprising given how much you all talk about Jaynes-Cox probability theory. I asked a question about it myself, but to no avail: https://www.lesswrong.com/posts/x7NyhgenYe4zAQ4Kc/has-van-horn-fixed-cox-s-theorem

Thanks!

• The general “method of rationality” does not require any specific theorem to be true. Rationality will work so long as the universe has causality. All rationality says is that, given that the actions an agent can take have some causal effect on the outcome the universe will take, the agent can estimate the optimal outcome for the agent’s goals. And the agent should do that by definition, as this is what “winning” is.

We have many demonstrated examples of such agents today, from simple control systems to cutting-edge deep learning game-players. And we as humans should aspire to act as rationally as we can.

This is where non-mainstream actions come into play, for example cryonics is rational, taking risky drugs that may slow aging is rational, and so on. This is because the case for them is so strong that any rational approximation of the outcome of your actions says you should be doing these things. Another bit of non-mainstream thought is we don’t have to be certain of an outcome to chase it. For example, if cryonics has a 1% chance of working, mainstream thought says we should just take the 99% case of it failing as THE expected outcome, declare it “doesn’t work”, and not do it. But a 1% chance of not being dead is worth the expense for most people.
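The 1% argument above is an expected-value comparison, and a minimal sketch may make the contrast with the "most likely outcome" rule concrete. Everything numeric here is a hypothetical assumption chosen for illustration: the thread gives only the 1% figure, and the value placed on success and the cost are invented.

```python
def most_likely_outcome_rule(p_success: float) -> bool:
    """The 'mainstream' rule as described above: act only if success is the likeliest outcome."""
    return p_success > 0.5


def expected_value_rule(p_success: float, value_if_success: float, cost: float) -> bool:
    """Act if the probability-weighted payoff exceeds the cost."""
    return p_success * value_if_success > cost


# Hypothetical inputs (assumptions, not figures from the thread).
p = 0.01                      # the 1% chance mentioned above
value_if_success = 10_000_000  # assumed value the person places on being revived
cost = 50_000                  # assumed total cost

print(most_likely_outcome_rule(p))                     # the 99% failure case dominates
print(expected_value_rule(p, value_if_success, cost))  # depends entirely on the assumed value
```

The second rule's answer flips if the assumed value drops below `cost / p`, so the conclusion is only as strong as the value estimate fed into it.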

No theorems are required, only that the laws of physics allow us to rationally compute what to do. [Note that religious beliefs state the opposite of this. For example, were an invisible being pulling the strings of reality, then merely “thinking” in a way that being doesn’t like might cause that being to give you bad outcomes. Mainstream religions contain various “hostile to rationality” memes: some religions state you should stop thinking, others that you should “take it on faith” that everything your local church leader states is factual, and so on.]

• The general “method of rationality” does not require any specific theorem to be true. Rationality will work so long as the universe has causality. All rationality says is that, given that the actions an agent can take have some causal effect on the outcome the universe will take, the agent can estimate the optimal outcome for the agent’s goals. And the agent should do that by definition, as this is what “winning” is.

And the agent can learn to do that better. In a universe where intuition and practical experience beat explicit reasoning, there is no point in teaching or learning rationality. So there are actually a couple of further assumptions.

This is because the case for them is so strong that any rational approximation of the outcome of your actions says you should be doing these things. Another bit of non-mainstream thought is we don’t have to be certain of an outcome to chase it. For example, if cryonics has a 1% chance of working, mainstream thought says we should just take the 99% case of it failing as THE expected outcome, declare it “doesn’t work”, and not do it. But a 1% chance of not being dead is worth the expense for most people.

Which also requires some unstated assumptions. You are assuming that it is a win to merely be successfully revivified from cryonics, but you would also need to assume that you are not in a hell world, and, even if you are not, that you would be happy having no social ties. Also, there are trade-offs between the amount of money spent on cryonics and leaving money to your descendants. It actually takes a rather unusual person, a person who has a lot of spare money and few social connections, to value cryonics.

• Or normal people are just wrong.

This is one of the tenets of rationality. If the best information available and the best method for assessing probability clearly say something different than “mainstream” opinions, then probably the mainstream is simply wrong.

During the pandemic there were many examples of this: modeling an exponential process is easy to do with math, but mainstream decision makers often failed to follow the predictions, usually relying on linear models or incorrect “intuition and practical experience”.
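To illustrate the gap between the two kinds of model, here is a small sketch comparing a compounding forecast with a straight-line extrapolation from the same starting point. The starting count and growth rate are made-up assumptions, not pandemic data.

```python
def exponential_forecast(initial: float, daily_growth: float, days: int) -> float:
    """Compound the daily growth rate: initial * (1 + r) ** days."""
    return initial * (1 + daily_growth) ** days


def linear_forecast(initial: float, daily_growth: float, days: int) -> float:
    """Assume the first day's absolute increase simply repeats every day."""
    return initial + initial * daily_growth * days


# Hypothetical inputs: 100 cases growing 10% per day.
for days in (7, 30, 60):
    exp_cases = exponential_forecast(100, 0.10, days)
    lin_cases = linear_forecast(100, 0.10, days)
    print(f"day {days}: exponential ~{exp_cases:,.0f}, linear ~{lin_cases:,.0f}")
```

Both forecasts agree on day 0 and nearly agree in the first few days, which is part of why a linear intuition can look adequate early on.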

As a side note, there are many famous examples where this fails; usually intuition or practical experience fails when contrasted with well-collected large-scale data. I should say it technically always fails. Another element of rationality is that it’s not enough to be right; you have to have reached the right conclusion. As an example, it is incorrect to hit on 20 in blackjack: even if you win the hand, you are still wrong to do it, unless you have a way of seeing the next card.

(Or in more explicit terms, the policy you use needs to be evidence-based and the best available, with its effectiveness measured over large data sets, not local and immediate-term outcomes. This means that sometimes having the best chance of winning means you lose.)
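The blackjack point can be checked with a short calculation. This is a simplified sketch that assumes every rank is equally likely on the next draw (an infinite-deck approximation) and a hard 20, so an ace can only count as 1; real tables with finite decks will differ slightly.

```python
from fractions import Fraction

# Point values of the 13 ranks. The ace is counted as 1 because 11 would bust a hard 20.
RANK_VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def bust_probability(hand_total: int) -> Fraction:
    """Probability that drawing one more card pushes the hand over 21."""
    busting_ranks = sum(1 for value in RANK_VALUES if hand_total + value > 21)
    return Fraction(busting_ranks, len(RANK_VALUES))


print(bust_probability(20))  # 12/13
```

Under these assumptions only an ace avoids a bust, so a single winning hit says nothing about whether the policy is a good one.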

As for a “hell world”, known human history has had very few humans living in “hell” conditions for long. And someone can make new friends and family. So these objections are not rational.

• Or normal people are just wrong.

Wrong about their values, or wrong about the actions they should take to maximize their values? Is it inconceivable that someone with strong preferences for maintaining their social connections, etc., could correctly reject cryonics?

As for a “hell world”, known human history has had very few humans living in “hell” conditions for long.

But you can still have a preference for experiencing zero torture.

• Wrong about the actions they should take to maximize their values.

It’s inconceivable because it’s a failure of imagination. Someone who has many social connections now will potentially be able to make many new ones then, were they to survive cryo. Moreover, reflecting on past successes requires one to still exist to remember.

Could a human exist that should rationally say no to cryo? In theory yes but probably none have ever existed. As long as someone extracts any positive utility at all from a future day of existing then continuing to exist is better than death. And while yes certain humans live in chronic pain any technology able to rebuild a cryo patient can almost certainly fix the problem causing it.

• Waking from cryo is equivalent to exile. Exile is a punishment.

• Yes. Doesn’t matter though.

Could a human exist that should rationally say no to cryo? In theory yes but probably none have ever existed. As long as someone extracts any positive utility at all from a future day of existing then continuing to exist is better than death. And while yes certain humans live in chronic pain any technology able to rebuild a cryo patient can almost certainly fix the problem causing it.

You need to say that, out of 100 billion humans, someone lived who has a problem that can’t be fixed and who suffers more existing than not. This is a paradox, and I say none exist, as all problems are brain or body faults that can be fixed.

• As long as someone extracts any positive utility at all from a future day of existing then continuing to exist is better than death

You are assuming selfishness. A person has to trade off the cost of cryo against the benefits of leaving money to their family, or charity.

And while yes certain humans live in chronic pain any technology able to rebuild a cryo patient can almost certainly fix the problem causing it.

Now assuming benevolent motivations.

• No, Less Wrong is probably not dead without Cox’s theorem, for several reasons.

It might turn out that the way Cox’s theorem is wrong is that the requirements it imposes for a minimally-reasonable belief system need strengthening, but in ways that we would regard as reasonable. In that case there would still be a theorem along the lines of “any reasonable way of structuring your beliefs is equivalent to probability theory with Bayesian updates”.

Or it might turn out that there are non-probabilistic belief structures that are good, but that they can be approximated arbitrarily closely with probabilistic ones. In that case, again, the LW approach would be fine.

Or it might turn out probabilistic belief structures are best so long as the actual world isn’t too crazy. (Maybe there are possible worlds where some malign entity is manipulating the evidence you get to see for particular goals, and in some such worlds probabilistic belief structures are bad somehow.) In that case, we might know that either the LW approach is fine or the world is weird in a way we don’t have any good way of dealing with.

Alternatively, it might happen that Cox’s theorem is wronger than that; that there are human-compatible belief structures that are, in plausible actual worlds, genuinely substantially different from probabilities-and-Bayesian-updates. Would LW be dead then? Not necessarily.

It might turn out that all we have is an existence theorem and we have no idea what those other belief structures might be. Until such time as we figure them out, probability-and-Bayes would still be the best we know how to do. (In this case I would expect at least some LessWrongers to be working excitedly on trying to figure out what other belief structures might work well.)

It might turn out that for some reason the non-probabilistic belief structures aren’t interesting to us. (E.g., maybe there are exceptions that in some sense amount to giving up and saying “I dunno” to everything.) In that case, again, we might need to adjust our ideas a bit but I would expect most of them to survive.

Suppose none of those things is the case: Cox’s theorem is badly, badly wrong; there are other quite different ways in which beliefs can be organized and updated, that are feasible for humans to practice and don’t look at all like probabilities+Bayes, and that so far as we can see work just as well or better. That would be super-exciting news. It might require a lot of revision of ideas that have been taken for granted here. I would expect LessWrongers to be working excitedly on figuring out what things need how much revision (or discarding completely). The final result might be that LessWrong is dead, at least in the sense that the ways of thinking that have been common here all turn out to be very badly suboptimal and the right thing is to all convert to Mormonism or something. But I think a much more likely outcome in this scenario is that we find an actually-correct analogue of Cox’s theorem, which tells us different things about what sorts of thinking might be reasonable, and it still involves (for instance) quantifying our degrees of belief somehow, and updating them in the light of new evidence, and applying logical reasoning, and being aware of our own fallibility. We might need to change a lot of things, but it seems pretty likely to me that the community would survive and still be recognizably Less Wrong.

Let me put it all less precisely but more pithily: Imagine some fundamental upheaval in our understanding of mathematics and/or physics. ZF set theory is inconsistent! The ultimate structure of the physical world is quite unlike the GR-and-QM muddle we’re currently working with! This would be exciting but it wouldn’t make bridges fall down or computers stop computing, and people interested in applying mathematics to reality would go on doing so in something like the same ways as at present. Errors in Cox’s theorem are definitely no more radical than that.
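For readers less familiar with what “probabilities-and-Bayesian-updates” refers to in this comment, here is a minimal sketch of a single Bayesian update over a discrete set of hypotheses. The hypothesis names and numbers are hypothetical, and this illustrates only the update rule itself, not anything about Cox’s theorem or Halpern’s counterexample.

```python
def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Return P(h | e) for each hypothesis h, given P(h) and P(e | h)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())  # this is P(e)
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: weight / total for h, weight in unnormalized.items()}


# Hypothetical example: is a coin fair or biased towards heads?
prior = {"fair": 0.5, "biased": 0.5}
likelihood_of_heads = {"fair": 0.5, "biased": 0.9}

posterior = bayes_update(prior, likelihood_of_heads)
print(posterior)  # belief shifts towards "biased" after seeing one head
```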

• Or succinctly: to be the “least wrong” you need to be using the measured best available assessment of projected outcomes. All tools available are approximations anyway and the best tools right now are ‘black box’ deep learning methods which we do not know exactly how they arrive at their answers.

This isn’t a religion and this is what a brain or any other known form of intelligence, artificial or natural, does.

• I would expect LessWrongers to be working excitedly on figuring out what things need how much revision (or discarding completely)

I’d expect them to shoot the messenger!

• Because it’s already happening, and that’s what they are doing. I just got two downvotes for pointing it out.

• I didn’t downvote you and don’t claim any unique insight into the motives of whoever did, but I know I did think “that seems a low-effort low-quality comment”, not because I think what you say is untrue (I don’t know whether it is or not) but because you made a broad accusation and didn’t provide any evidence for it. So far as I can tell, the only evidence you’re offering now is that your comment got downvoted, which (see above) has plausible explanations other than “because LW readers will shoot the messenger”.

The obvious candidate for “the messenger” here would be Haziq Muhammad, but I just checked and every one of his posts and comments has a positive karma score. This doesn’t look like messenger-shooting to me.

What am I (in your opinion) missing here?

• It’s been going on much longer than that.

The classic is:

“Comment author: Eliezer_Yudkowsky 05 September 2013 07:30:56PM 1 point [-]

Warning: Richard Loosemore is a known permanent idiot, ponder carefully before deciding to spend much time arguing with him.”

Richard Loosemore is, in fact, a professional AI researcher.

http://www.richardloosemore.com/

• So your evidence that “LW readers will shoot the messenger” is that one time Eliezer Yudkowsky called a professional AI researcher a “known permanent idiot”?

This seems very unconvincing. (1) There is no reason why someone couldn’t be both an idiot and a professional AI researcher. (I suspect that Loosemore thinks Yudkowsky is an idiot, and Yudkowsky is also a professional AI researcher, albeit of a somewhat different sort. If either of them is right, then a professional AI researcher is an idiot.) (2) “One leading LW person once called one other person an idiot” isn’t much evidence of a general messenger-shooting tendency, even if the evaluation of that other person as an idiot was 100% wrong.

• So your evidence that “LW readers will shoot the messenger” is that one time Eliezer Yudkowsky called a professional AI researcher a “known permanent idiot”?

There is no reason why someone couldn’t be both an idiot and a professional AI researcher.

In probabilistic terms, a person who has all three of qualifications, practical experience and published work is less likely to be an idiot.

• My evidence for what?

Yes, I agree that AI researchers are less often idiots than randomly chosen people. It’s still possible to be both. For the avoidance of doubt, I’m not claiming that Loosemore is an idiot (even in the rather loose sense that I think EY meant); maybe he is, maybe he isn’t. The possibility that he isn’t is just one of the several degrees of separation between your offered evidence (EY called someone an idiot once) and the claim it seems to be intended to support (LW readers in general will shoot the messenger if someone turns up saying something that challenges their opinions).

• Your evidence for the contrary claim.

The possibility that he isn’t is just one of the several degrees of separation between your offered evidence (EY called someone an idiot once) and the claim it seems to be intended to support

That’s an objection that could be made to anything. There is still no evidence for the contrary claim that lesswrong will abandon long held beliefs quickly and willingly.

• Oh, you mean my claim that if someone comes along with an outright refutation of the idea that belief-structures ought to be probability-like then LWers would be excitedly trying to figure out what they could look like instead?

I’m not, for the avoidance of doubt, making any claims that LWers have particularly great intellectual integrity (maybe they do, maybe not) -- it’s just that this seems like the sort of question that a lot of LWers are very interested in.

I don’t understand what you mean by “That’s an objection that could be made to anything”. You made a claim and offered what purported to be support for it; it seems to me that the purported support is a long way from actually supporting the claim. That’s an objection that can be made to any case where someone claims something and offers only very weak evidence in support of it. I don’t see what’s wrong with that.

I’m not making any general claim that “lesswrong will abandon long held beliefs quickly and willingly”. I don’t think I said anything even slightly resembling that. What I think is that some particular sorts of challenge to LW traditions would likely be very interesting to a bunch of LWers and they’d likely want to investigate.

• an outright refutation

Who gets to decide what’s outright? Reality isn’t a system where objective knowledge just pops up in people’s brains; it’s a system where people exchange arguments, facts and opinions, and may or may not change their minds.

There are still holdouts against evolution, relativity, quantum, climate change, etc. As you know. And it seems to them that they are being objective and reasonable.

From the outside, they are biased towards tribal beliefs. How do you show that someone is not? Not having epistemic double standards would be a good start.

• I entirely agree that it’s possible that someone might come along with something that is in fact a refutation of the idea that a reasonable set of requirements for rational thinking implies doing something close to probability-plus-Bayesian-updating, but that some people who are attached to that idea don’t see it as a refutation.

I’m not sure whether you think that I’m denying that (and that I’m arguing that if someone comes along with something that is in fact a refutation, everyone on LW will necessarily recognize it as such), or whether you think it’s an issue that hasn’t occurred to me; neither is the case. But my guess—which is only a guess, and I’m not sure what concrete evidence one could possibly have for it—is that in most such scenarios at least some LWers would be (1) interested and (2) not dismissive.

I guess we could get some evidence by looking at how similar things have been treated here. The difficulty is that so far as I can tell there hasn’t been anything that quite matches. So e.g. there’s this business about Halpern’s counterexample to Cox; this seems to me like it’s a technical issue, to be addressed by tweaking the details of the hypotheses, and the counterexample is rather far removed from the realities we care about. The reaction here has been much more “meh” than “kill the heretic”, so far as I can tell. There’s the fact that some bits of the heuristics-and-biases stuff that e.g. the Sequences talk a lot about now seem doubtful because it turns out that psychology is hard and lots of studies are wrong (or, in some cases, outright fraudulent); but I don’t think much of importance hangs on exactly what cognitive biases humans have, and in any case this is a thing that some LW types have written about, in what doesn’t look to me at all a shoot-the-messenger sort of way.

Maybe you have a few concrete examples of messenger-shooting that are better explained as hostile reaction to evidence of being wrong rather than as hostile reaction to actual attack? (The qualification is because if you come here and say “hahaha, you’re all morons; here’s my refutation of one of your core ideas” then, indeed, you will likely get a hostile response, but that’s not messenger-shooting as I would understand it.)

I heartily agree that having epistemic double standards is very bad. I have the impression that your comment is intended as an accusation of epistemic double standards, but I don’t know whom you’re accusing of exactly what epistemic double standards. Care to be more specific?

• Maybe you have a few concrete examples of messenger-shooting that are better explained as hostile reaction to evidence of being wrong rather than as hostile reaction to actual attack?

Better explained in whose opinion? Confirmation bias will make you see neutral criticism as attack, because that gives you a reason to reject it.

• Better explained in your opinion, since I’m asking you to give some examples.

Obviously it’s possible that you’ll think something is a neutral presentation of evidence that something’s wrong, and I’ll think it’s an attack. Or that you’ll think it’s a watertight refutation of something, and I won’t. Etc. Those things could happen if I’m too favourably disposed to the LW community or its ideas, or if you’re too unfavourably disposed, or both. In that case, maybe we can look at the details of the specific case and come to some sort of agreement.

If you’ve already decided in advance that if something’s neutral then I’ll see it as an attack … well, then which of us is having trouble with confirmation bias in that scenario?

• Anyone can suffer from confirmation bias.

How can you tell you’re not?

Here’s a question: where are the errata? Why has lesswrong never officially changed its mind about anything?

• I don’t understand your first question. I can’t tell that I’m not, because (as you say) it’s possible that I am. Did I say something that looked like “I know that I am not in any way suffering from confirmation bias”? Because I’m pretty sure I didn’t mean to.

Also, not suffering from confirmation bias (in general, or on any particular point) is a difficult sort of thing to get concrete evidence of. In a world where I have no confirmation bias at all regarding some belief of mine, I don’t think I would expect to have any evidence of that that I could point to.

What official LW positions would you expect there to be errata for?

(Individual posts on LW sometimes get retracted or corrected or whatever: see e.g. “Industry Matters 2: Partial Retraction” where Sarah Constantin says that a previous post of hers was wrong about a bunch of things, or “Using the universal prior for logical uncertainty (retracted)” where cousin_it proposed something and retracted it when someone found an error. I don’t know whether Scott Alexander is LW-adjacent enough to be relevant in your mind, but he has a page of notable mistakes he’s made. But it sounds as if you’re looking more specifically for cases where the LW community has strongly committed itself to a particular position and then officially decided that that was a mistake. I don’t know of any such cases, but it’s not obvious to me why there should be any. Where are your errata in that sense? Where are (say) Richard Feynman’s? If you have in mind some concrete examples where LW should have errata, they might be interesting.)

• Where are (say) Richard Feynman’s?

Good grief… academics revise and retract things all the time. The very word “errata” comes from the world of academic publishing!

If you have in mind some concrete examples where LW should have errata, they might be interesting.)

• Yup, academics revise and retract things. So, where are Richard Feynman’s errata? Show me.

The answer, I think, is that there isn’t a single answer to that question. Presumably there are some individual mistakes which he corrected (though I don’t know of, e.g., any papers that he retracted) -- the analogues of the individual posts I listed a few of above. But I don’t know of any case where he said “whoops, I was completely wrong about something fundamental”, and if you open any of his books I don’t think you’ll find any prominent list of mistakes or anything like that.

As you say, science is noted for having very good practices around admitting and fixing mistakes. Feynman is noted for having been a very good scientist. So show me how he meets your challenge better than Less Wrong does.

No, you haven’t “already told me” concrete examples. You’ve gestured towards a bunch of things you claim have been refuted, but given no details, no links, nothing. You haven’t said what was wrong, or what would have been right instead, or who found the alleged mistakes, or how the LW community reacted, or anything.

Unless I missed it, of course. That’s always possible. Got a link?

• Scott Alexander is LW-adjacent enough to be relevant in your mind, but he has a page of notable mistakes he’s made.

I am using lesswrong exclusively of the codexes.

• What official LW positions would you expect there to be errata for?

I’m specifically referencing RAZ/the Sequences. Maybe they’re objectively perfect, and nothing of significance has happened in ten years...

As I’m forever pointing out, there are good objections to many of the postings in the Sequences from well-informed people, to be found in the comments... but no one has admitted that a single one is actually right, no one has attempted to go back and answer them, and they simply disappear from RAZ.

• OK, we have a bit of a move in the direction of actually providing some concrete information here, which is nice, but it’s still super-vague.

Also, your complaint now seems to be entirely different from your original complaint. Before, you were saying that LW should be expected to “shoot the messenger”. Now, you’re saying that LW ignores the messenger. Also bad if true, of course, but it’s an entirely different failure mode.

So, anyway, I thought I’d try a bit of an experiment. I’m picking random articles from the “Original Sequences” (as listed in the LW wiki), then starting reading the comments at a random place and finding the first substantial objection after there (wrapping round to the start if necessary). Let’s see what we find.

• “Timeless Identity”: user poke says EY is attacking a strawman when he points out that fundamental particles aren’t distinguishable, because no one ever thinks that our identity is constituted by the identity of our atoms, because everyone knows that we eat and excrete and so forth.

• In the post itself, EY quotes no less a thinker than Derek Parfit arguing that maybe the difference between “numerical identity” and “qualitative identity” might be significant, so it can’t be that strawy; and the point of the post is not simply to argue against the idea that our identity is constituted by the identity of our atoms. So I rate this objection not terribly strong. It doesn’t seem to have provoked any sort of correction, nor do I see why it should have; but it also doesn’t seem to have provoked any sort of messenger-shooting; it’s sitting at +12.

• “Words as Hidden Inferences”: not much substantive disagreement. Nearest I can find is a couple of complaints from user Caledonian2, both of which I think are merely nitpicks.

• “The Sacred Mundane”: user Capla disagrees with EY’s statement that when you start with religion and take away the concrete errors of fact etc., all you have left is pointless vagueness. No, Capla says, there’s also an urge towards heroic moral goodness, which you don’t really find anywhere else.

• Seems like a reasonable counterpoint. Doesn’t seem like it got much attention, which is a shame; I think there could have been an interesting discussion there.

• “The Sacred Mundane” is one of the posts shown on the “Original Sequences” page in italics to indicate that it’s “relatively unimportant”. I assume it isn’t in RAZ. I doubt that’s because of Capla’s objection.

• “Why Truth?”: not much substantive disagreement.

• “Nonperson Predicates”: not much substantive disagreement, and this feels like a sufficiently peripheral issue anyway that I couldn’t muster much energy to look in detail at the few disagreements there were.

• “The Proper Use of Doubt”: a few people suggest in comments that (contra what EY says in the post) there is potential value even in doubts that never get resolved. I think they’re probably right, but again this is a peripheral point (since I think I agree with EY that the main point of doubting a thing you believe is to prompt you to investigate enough that you either stop believing it or confirm your belief in it) on a rather peripheral post.

Not terribly surprisingly (on either your opinions or mine, I think), this random exploration hasn’t turned up anything with a credible claim to be a demolition of something important to LW thinking. It also hasn’t turned up anything that looks to me like messenger-shooting, or like failing to address important major criticisms, and my guess at this point is that if there are examples of those then finding them is going to need more effort than I want to put in. Especially as you claim you’ve already got those examples! Please, share some of them with me.

• At least I’ve got you thinking.

I previously gave you a short list of key ideas. Aumann, Bayes, Solomonoff, and so on.

Now, you’re saying that LW ignores the messenger. Also bad if true, of course, but it’s an entirely different failure mode.

No, it’s not very different. Shooting the messenger, ignoring the messenger, and quietly updating without admitting it are all ways that confirmation bias manifests. Aren’t you supposed to know about this stuff?

• Yes, you gave me a “short list of key ideas”. So all I have to do to find out what you’re actually talking about is to go through everything anyone has ever written about those ideas, and find the bits that refute positions widely accepted on Less Wrong.

This is not actually helpful. Especially as nothing you’ve said so far gives me very much confidence that the examples you’re talking about actually exist; one simple explanation for your refusal to provide concrete examples is that you don’t actually have any.

I’ve put substantial time and effort into this discussion. It doesn’t seem to me as if you have the slightest interest in doing likewise; you’re just making accusation after accusation, consistently refusing to provide any details or evidence, completely ignoring anything I say unless it provides an opportunity for another cheap shot, moving the goalposts at every turn.

I don’t know whether you’re actually trolling, or what. But I am not interested in continuing this unless you provide some actual concrete examples to engage with. Do so, and I’ll take a look. But if all you want to do is sneer and whine, I’ve had enough of playing along.

• But my guess—which is only a guess, and I’m not sure what concrete evidence one could possibly have for it—is that in most such scenarios at least some LWers would be (1) interested and (2) not dismissive.

“At least some” is a climbdown. If I were allowed to rewrite my original comment to “at least some lesswrongians would shoot the messenger”, then we would not be in disagreement.

I guess we could get some evidence by looking at how similar things have been treated here. The difficulty is that so far as I can tell there hasn’t been anything that quite matches

Except criticism of the lesswrongian version of Bayes, and the lesswrongian version of Aumann, and the lesswrongian version of Solomonoff, and of the ubiquitous utility function, and the MWI stuff....

but I don’t know whom you’re accusing of exactly what epistemic double standards.

Everyone who thinks I have to support my guess about how lesswrongians would behave with evidence, but isn’t asking for your evidence for your guess.

• If your original comment had said “at least some”, I would have found it more reasonable.

So, anyway, it seems that you think that “the lesswrongian version of Bayes”, and likewise of Aumann, and Solomonoff, and “the ubiquitous utility function”, and “the MWI stuff”, have all been outright refuted, and the denizens of Less Wrong have responded by shooting the messenger. (At least, I don’t know how else to interpret your second paragraph.)

Could you maybe give a couple of links, so that I can see these refutations and this messenger-shooting?

(I hold no particular brief for “the lesswrong version of” any of those things, not least because I’m not sure exactly what it is in each case. Something more concrete might help to clarify that, too.)

I think ChristianKl is correct to say that lazy praise is better (because less likely to provoke defensiveness, acrimony, etc.) than lazy insult. I also think “LW people will respond to an interesting mathematical question about the foundations of decision theory by investigating it” is a more reasonable guess a priori than “LW people will respond to … by attacking the person who raises it because it threatens their beliefs”. Of course the latter could in fact be a better prediction than the former, if e.g. there were convincing prior examples; but that’s why “what’s your evidence?” is a reasonable question.

• The Solomonoff issue is interesting. In 2012, private_messaging made this argument against the claim that SI would prove MWI.

https://www.lesswrong.com/posts/6Lg8RWL9pEvoAeEvr/raising-safety-consciousness-among-agi-researchers?commentId=mJ53MeyRzZK6iqDPi

So, if the program running the SWE (the MWI ontology) outputs information about all worlds on a single output tape, the worlds are going to have to be concatenated or interleaved somehow. Which means that, to make use of the information, you have to identify the subset of bits relating to your world. That’s extra complexity which isn’t accounted for because it’s being done by hand, as it were.

The standard objection, first made by Will_Sawin, is that a computational model of MWI is only more complex in space, which, for the purposes of SI, doesn’t count. But that misses the point: an SI isn’t just an ontological model, it has to match empirical data as well.

In fact, if you discount the complexity of the process by which one observer picks out their observations from a morass of data, MWI isn’t the preferred ontology. The easiest way of generating data that contains any substring is a PRNG, not MWI. You basically end up proving that “everything random” is the simplest explanation.
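To make the point about "data that contains any substring" concrete, here is a minimal sketch. It uses a deterministic enumeration of all bit strings rather than an actual PRNG, and the names `all_bitstrings` and `find_offset` are invented for illustration. The generator is tiny, but picking your own observations out of its output means supplying an offset, and that offset is information too.

```python
from itertools import count, product

def all_bitstrings():
    """Yield every finite bit string, shortest first: 0, 1, 00, 01, ..."""
    for n in count(1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def find_offset(target, limit=10_000):
    """Return the first position of `target` in the concatenated stream.

    Only the first `limit` or so characters are searched (an arbitrary
    cutoff chosen for this sketch); returns -1 if nothing is found there.
    """
    stream = ""
    for s in all_bitstrings():
        stream += s
        if len(stream) >= limit:
            break
    return stream.find(target)

# The "locating" cost: the offset has to be written down somewhere.
offset = find_offset("0110")
bits_to_specify_offset = max(offset, 1).bit_length()
```

The enumerator never gets longer however much data you want to "explain", but the offset you need grows with the length of the observations, so the bookkeeping cannot be waved away as free.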

Here’s the messenger-shooting that Private_messaging received (from the usual shooter):

“Private_messaging earned a “Do Not Feed!” tag itself through consistent trolling”

While it’s true that PM was rude and abrupt tonally, that doesn’t invalidate their argument.

I think the argument remains valid, since I have never seen a relevant refutation.

• I’m not sure which of two arguments private_messaging is making, but I think both are wrong.

Argument 1. “Yudkowsky et al think many-worlds interpretations are simpler than collapse interpretations, but actually collapse interpretations are simpler because unlike many-worlds interpretations they don’t have the extra cost of identifying which branch you’re on.”

I think this one is wrong because that cost is present with collapse interpretations too; if you’re trying to explain your observations via a model of MWI, your explanation needs to account for what branch you’re in, and if you’re trying to explain them via a model of a “collapse” interpretation of QM, it instead needs to account for the random choices of measurement results. The information you need to account for is exactly the same in the two cases.

So maybe instead the argument is more like this:

Argument 2. “Yudkowsky et al think many-worlds interpretations are simpler than collapse interpretations, because they are ‘charging’ collapse interpretations for the cost of identifying random measurement results. But that’s wrong because the same costs are present in MW interpretations.”

I think this one is wrong because that isn’t why Yudkowsky et al think MW interpretations are simpler. They think MW interpretations are simpler because a “collapse” interpretation needs to do the same computation as an MW interpretation and also actually make things collapse. I am not 100% sure that this is actually right: it could conceivably turn out that as far as explaining human observations of quantum phenomena goes, you actually need some notion more or less equivalent to that of “Everett branch”, and you need to keep track of them in your explanation, and the extra bookkeeping with an MW model of the underlying physics is just as bad as the extra model-code with a collapse model of the underlying physics. But if it’s wrong I don’t think it’s wrong for private_messaging’s reasons.

But, still, private_messaging’s argument is an interesting one, and it’s terrible to call him a troll for making it.

… Except that no one did call him a troll for making that argument.

What actually happened when he made that argument was that various people politely disagreed and offered counterarguments. The “consistent trolling” remark was somewhere entirely different, and its context was that private_messaging had been found to have something like five sockpuppets on LW, and was using them to post comments agreeing with one another, and the user who made the “consistent trolling” remark—by the way, that was wedrifid, not Yudkowsky, and I’m not sure why you call them “the usual shooter”—was saying (I paraphrase) “having sockpuppets as such isn’t so bad, and private_messaging was the user’s second account and not really a problem (it was also super-trollish, but that’s a separate issue), but the subsequent sockpuppets were just created to abuse the system and that’s not acceptable”.

Well, OK. But, still, wedrifid called private_messaging a troll. Was that unreasonable? Note that even a troll can say correct and/​or interesting things sometimes; trolling is precisely about what you do “tonally”. So, here are a few comments from private_messaging. Judge for yourself whether there’s anything trollish about them.

Here:

[Yudkowsky is] spreading utter nonsense similar in nature to anti vaccination campaigning. [...] complete misinformed BS that—if he ever gains traction—will be inspiration to more of [Unabomber-style anti-tech terrorism]. I’m not charitable to any imams, any popes, any priests, and any cranks.

Here:

Seriously, why should anyone think that SI is anything more than “narcissistic dilettantes who think they need to teach their awesome big picture ideas to the mere technicians that are creating the future”, to paraphrase one of my friends?

Here:

There isn’t a lot to cite to counter utter nonsense that incompetents (SIAI) promotes. There’s a lot of fundamentals to learn, though, to be able to not fall for such nonsense. [...] The SIAI position is not even wrong. It is hundred percent misguided due to lack of understanding of simple fundamentals, and multitude of conflations of the concepts that are distinct to anyone in the field.

I dunno, seems a bit trollish to me. Again, not because it’s necessarily wrong but because it’s needlessly confrontational; private_messaging was rather fond of saying “X is wrong and stupid and you people are idiots for thinking it” when “X is wrong” would have sufficed.

• I don’t think it’s a string of objections; it’s one (reasonable) objection made at length.

The objection is that you’re not really doing Solomonoff induction or anything like it unless you’re considering actual programs and people saying things like “many worlds is simpler than collapse” never actually do that.

As I say, I think this is a reasonable criticism, but (in the specific context here of comparing MW to collapse) I think there’s a reasonable response to it: “Collapse interpretations have to do literally all the same things that many-worlds interpretations do—i.e., compute how the wavefunction evolves—as well as something extra, namely identifying events as measurements, picking measurement results at random, and replacing the wavefunction with one of the eigenfunctions. No matter how you fill in the formal details, that is going to require a longer program.”

(For the avoidance of doubt, the “picking measurement results at random” bit isn’t reckoning the random numbers as part of the complexity cost—as discussed elsewhere in this discussion, it seems like that cost is the same whatever interpretation you pick; it’s the actual process of picking results at random. The bit of your code that calls `random()`, not the random bits you get by calling it.)
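A toy sketch of the shape of that response, for a single two-state system. This is not Solomonoff induction and not real physics code; the function names and the `is_measurement` flag are assumptions made up for illustration. It only shows that the collapse step contains the evolution step verbatim and then adds more code on top, including the call to `rng.random()` mentioned above.

```python
import math
import random

def evolve(state, unitary):
    """Apply a 2x2 unitary matrix to a two-component state vector."""
    a, b = state
    (u00, u01), (u10, u11) = unitary
    return (u00 * a + u01 * b, u10 * a + u11 * b)

def many_worlds_step(state, unitary):
    # The whole model: evolve the wavefunction and nothing else.
    return evolve(state, unitary)

def collapse_step(state, unitary, is_measurement, rng):
    # Exactly the same evolution as above...
    state = evolve(state, unitary)
    # ...plus extra machinery: decide what counts as a measurement,
    # pick a result at random, and replace the state with a basis state.
    if is_measurement:
        p0 = abs(state[0]) ** 2  # Born-rule probability, state assumed normalised
        outcome = 0 if rng.random() < p0 else 1
        state = (1.0, 0.0) if outcome == 0 else (0.0, 1.0)
    return state

# Hadamard-like unitary used as an example input.
R = 1 / math.sqrt(2)
H = ((R, R), (R, -R))
```

Whether this extra code outweighs the bookkeeping needed to say which branch you are in is exactly the point under dispute; the sketch only illustrates the "same computation plus something extra" half of it.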

This is still a bit hand-wavy, and it’s not impossible that it might turn out to be wrong for some subtle reason. But it does go beyond “X sure seems simpler to me than Y”, and it’s based on some amount of actual thinking about (admittedly hypothetical) actual programs.

(I guess there are a few other kinda-objections in there—that Solomonoff induction is underspecified because you have to say what language your programs are written in, that someone said “Copenhagen” when they meant “collapse”, and that some interpretations of QM with actual wavefunction collapse in aren’t merely interpretations of the same mathematics as every other interpretation but have actual potentially observable consequences. The first is indeed an issue, but I haven’t heard anyone seriously suggest that any plausible difference in language would change the order of preference between two complete physical theories, if we were actually able to codify them with enough precision; the second is a terminological nitpick, though certainly one worth picking; the third isn’t really an objection at all but is again an observation worth making. But the main point of that comment is a single objection.)

• If your original comment had said “at least some”, I would have found it more reasonable.

As stated, it was exactly as reasonable as yours. There is not and never was any objective epistemic or rational reason to treat the two comments differently.

but that’s why “what’s your evidence?” is a reasonable question.

You haven’t shown that in any objective way, because it’s only an implication of:

also think “LW people will respond to an interesting mathematical question about the foundations of decision theory by investigating it” is a more reasonable guess a priori than “LW people will respond to … by attacking the person who raises it because it threatens their beliefs”.

...which is just an opinion. You have two consistent claims: that my claim is a priori less likely, and that it needs to be supported by evidence. But they aren’t founded on anything.

• I think ChristianKl gave one excellent rational reason to treat the two comments differently: all else being equal, being nice improves the quality of subsequent discussion and being nasty makes it worse, so we should apply higher standards to nastiness than to niceness.

Another rational reason to treat them differently would be if one of them is more plausible, given the available evidence, than the other. I’ve already explained at some length why I think that’s the case here. Perhaps others feel the same way. Of course you may disagree, but there is a difference between “no rational reason” and “no reason TAG agrees with”.

I have given some reasons why I think my claim more plausible than yours. I’m sorry if you find that opinion none the less “not founded on anything”. It seems to me that if we want a foundation firmer than your or my handwaving about what sorts of things the LW community is generally likely to do, we should seek out concrete examples. You implied above that you have several highly-relevant concrete examples (“criticism of the lesswrongian version of Bayes, and the lesswrongian version of Aumann, and the lesswrongian version of Solomonoff, and of the ubiquitous utility function, and the MWI stuff...”) where someone has provided a refutation of things widely believed on LW; I don’t know what specific criticisms you have in mind, but presumably you do; so let’s have some links, so we can see (1) how closely analogous the cases you’re thinking of actually were and (2) how the LW community did in fact react.

I’m finding this discussion frustrating because it feels as if every time you refer to something I said you distort it just a little, and then I have to choose between going along with the wrong version and looking nitpicky for pointing out the distortion. On this occasion I’ll point out a distortion. I didn’t say that your claim “needs” to be supported by evidence. In fact, I literally wrote “You don’t have to provide evidence”. I did ask the question “what’s your evidence?” and claimed that that was a reasonable question. I find it kinda baffling that you think the idea that “what’s your evidence?” is a reasonable question is a claim that’s not “founded on anything”; it seems to me that that’s pretty much always a reasonable question when someone makes an empirical claim. (I think the only exceptions would be where it’s already perfectly obvious what the evidence is, or maybe where the claim in question is already the conventional wisdom and no one’s offered reasons to think that it might be wrong.)

For the avoidance of doubt, I am not claiming that when that’s a reasonable question it’s then reasonable e.g. to demand a detailed accounting of the evidence and assume the claim being made is false if the person making it doesn’t give one. “I don’t have any concrete evidence; it just feels that way to me” could be a perfectly reasonable answer; so could “I don’t remember the details, but if you look at X and Y and Z you should find all the evidence you need”; so could “No concrete evidence, but this situation looks just like such-and-such another situation where X happened”; etc.

• I think ChristianKl gave one excellent rational reason to treat the two comments differently: all else being equal, being nice improves the quality of subsequent discussion and being nasty makes it worse, so we should apply higher standards to nastiness than to niceness

Here’s an argument against it: having strong conventions against nastiness means you never get any kind of critique or negative feedback at all, and essentially just sit in an echo chamber. Treating rationality as something that is already perfect is against rationality.

Saying “we accept criticism, if it is good criticism” amounts to the same thing, because you can keep raising the bar.

Saying “we accept criticism, if it comes from the right person” amounts to the same thing, because nobody has to be the right person.

Saying “we accept criticism, if it is nice” amounts to the same thing, because being criticized never feels entirely nice.

But you understand all that, so long as it applies to an outgroup.

We are, I think, dealing with that old problem of motivated cognition. As Gilovich says: “Conclusions a person does not want to believe are held to a higher standard than those they do.”

EY gives the example of creationists, who are never convinced by any amount of fossils. That example, you can understand.

• Yes, too-strong conventions against nastiness are bad. It doesn’t look to me as if we have those here, any more than it looks to me as if there’s much of a shooting-the-messenger culture.

I’ve been asking you for examples to support your claims. I’ll give a few to support mine. I’m not (at least, not deliberately) cherry-picking; I’m trying to think of cases where something has come along that someone could with a straight face argue is something like a refutation of something important to LW:

• Someone wrote an article called “The death of behavioral economics”. Behavioural economics is right up LW’s street, and has a lot of overlap with the cognitive-bias material in the “Sequences”. And the article specifically attacks Kahneman and Tversky (founders of the whole heuristics-and-biases thing), claiming that their work on prospect theory was both incompetent and dishonest. So … one of the admins of LW posted a link to it saying it was useful, and that linkpost is sitting on a score of +143 right now.

• Holden Karnofsky of GiveWell took a look at the Singularity Institute (the thing that’s now called MIRI) as a possible recipient of donations and wrote a really damning piece about it. Luke Muehlhauser (then the executive director of the SI) and Eliezer Yudkowsky (founder of the SI) responded by … thanking HK for his comments and agreeing that there was a lot of merit to his criticisms. That really damning piece is currently on a score of +325. (Hazy memory tells me that the highest-voted thing ever on LW is Yvain’s “Diseased thinking about disease”; I may or may not be right about that, but at any rate that one’s on +393. Just for context.)

and I had a look for comments that were moderately nasty but with some sort of justification, to see how they were received:

• Consider Valentine’s post about enlightenment experiences. Here are some of the things said in response: Ben Pace’s comment, saying serious meditation seems likely to be a waste of time, its advocates can’t show actual evidence of anything useful, Valentine is quite likely just confused, etc. +60. Said Achmiz’s comment, making similar points similarly bluntly. +31. clone of saturn’s response to that, calling other things said in the discussion “pretty useless” and “obnoxious”. +29. There are plenty more examples in that discussion of frankness-to-the-point-of-rudeness getting (1) positive karma and (2) constructive responses.

Unfortunately, searching for moderately-nasty-but-at-least-kinda-justified comments is difficult because (1) most comments aren’t of that kind and (2) it’s not the sort of thing that e.g. Google can help very much with. (I did try searching for various negative words but that didn’t come up with much.)

But, overall, I repeat that my impression is that the usual LW response to criticism is to take it seriously, and that LW is not so intolerant of negativity as to be a problem. I am willing to be persuaded that I’m wrong, but I’d want to see some actual evidence rather than just a constant tone of indignation that I won’t make the leap from “Eliezer called another AI researcher an idiot once” to any broad conclusion about how LW people respond to criticism.

• I’m well aware that the big people get treated right. That’s compatible with the little people being shot. Look how Haziq has been treated for asking a question.

• He’s asked a lot of questions. His various LW posts are sitting, right now, at scores of +4, +4, +2, +11, +9, +10, +4, +1, +7, −2. This one’s slightly negative; none of the others are. It’s not the case that this one got treated more harshly because it suggested that something fundamental to LW might be wrong; the same is true of others, including the one that’s on +11.

This question (as well as some upvotes and slightly more downvotes) received two reasonably detailed answers, and a couple of comments (one of them giving good reason to doubt the premise of the question), all of them polite and respectful.

Unless your position is that nothing should ever be downvoted, I’m not sure what here qualifies as being “shot”.

(I haven’t downvoted this question nor any of Haziq’s others; but my guess is that this one was downvoted because it’s only a question worth asking if Halpern’s counterexample to Cox’s theorem is a serious problem, which johnswentworth already gave very good reasons for thinking it isn’t in response to one of Haziq’s other questions; so readers may reasonably wonder whether he’s actually paying any attention to the answers his questions get. Haziq did engage with johnswentworth in that other question—but from this question you’d never guess that any of that had happened.)

And I supplied some, which you then proceeded to nitpick, implying that it wasn’t good enough, implying that very strong evidence is needed.

• I do indeed consider that your evidence (“Eliezer Yudkowsky called Richard Loosemore an idiot!”) is not good enough to establish the claim you were making (“we should expect LW people to shoot the messenger if someone reports a refutation of an idea that’s been important here”).

However, the point isn’t that “very strong evidence is needed”, the point is that the evidence you offered is very weak.

(Maybe you disagree and think the evidence you offered is not very weak. If so, maybe you’d like to explain why. I’ve explained in more detail why I think it such weak evidence, elsewhere in this thread. Your defence of it has mostly amounted to “it is so an ad hominem”, as if the criticism had been “TAG says it was an ad hominem but it wasn’t”; again, I’ve explained elsewhere in the thread why I think that entirely misses the point.)

• If Loosemore had called Yudkowsky an idiot, you would not be saying “maybe he is”.

• For what it’s worth, I think it’s possible that he is, in the relevant sense. As I said elsewhere, the most likely scenario in which EY is wrong about RL being an “idiot” (by which, to repeat, I take it he meant “person obstinately failing to grasp an essential point”) is one in which on the relevant point RL is right and EY wrong, in which case EY would indeed be an “idiot”.

But let’s suppose you’re right. What of it? I thought the question here was whether LW people shoot the messenger, not whether my opinions of Eliezer Yudkowsky and Richard Loosemore are perfectly symmetrical.

• In common sense terms, telling an audience that the messenger is an idiot who shouldn’t be listened to because he’s an idiot, is shooting the messenger. It’s about as central and classic an example as you can get. What else would it be?

• Unfortunately some messengers are idiots (we have already established that most likely either Yudkowsky or Loosemore is an idiot, in this particular scenario). Saying that someone is an idiot isn’t shooting the messenger in any culpable sense if in fact they are an idiot, nor if the person making the accusation has reasonable grounds for thinking they are.

So I guess maybe we actually have to look at the substance of Loosemore’s argument with Yudkowsky. So far as I can make out, it goes like this.

• Yudkowsky says: superintelligent AI could well be dangerous, because despite our efforts to arrange for it to do things that suit us (e.g., trying to program it to do things that make us happy) a superintelligent AI might decide to do things that in fact are very bad for us, and if it’s superintelligent then it might well also be super-powerful (on account of being super-persuasive, or super-good at acquiring money via the stock market, or super-good at understanding physics better, or etc.).

• Loosemore says: this is ridiculous, because if an AI were really superintelligent in any useful sense then it would be smart enough to see that (e.g.) wireheading all the humans isn’t really what we wanted; if it isn’t smart enough to understand that then it isn’t smart enough to (e.g.) pass the Turing test, to convince us that it’s smart, or to be an actual threat; for that matter, the researchers working on it would have turned it off long before, because its behaviour would necessarily have been bizarrely erratic in other domains besides human values.

The usual response to this by LW-ish people is along the lines of “you’re assuming that a hypothetical AI, on finding an inconsistency between its actual values and the high-level description of ‘doing things that suit its human creators’, would realise that its actual values are crazy and adjust them to match that high-level description better; but that is no more inevitable than that humans, on finding inconsistencies between our actual values and the high-level description of ‘doing things that lead us to have more surviving descendants’, would abandon our actual values in order to better serve the values of Evolution”. To me this seems sufficient to establish that Loosemore has not shown that a hypothetical AI couldn’t behave in clearly-intelligent ways that mostly work towards a given broad goal, but in some cases diverge greatly from it.

There’s clearly more to be said here, but this comment is already rather long, so I’ll skip straight to my conclusion: maybe there’s some version of Loosemore’s argument that’s salvageable as an argument against Yudkowsky-type positions in general, but it’s not clear to me that there is, and while I personally wouldn’t have been nearly as rude as Yudkowsky was I think it’s very much not clear that Yudkowsky was wrong. (With, again, the understanding that “idiot” here doesn’t mean e.g. “person scoring very badly in IQ tests” but something like “person who obstinately fails to grasp a fundamental point of the topic under discussion”.)

I don’t think it’s indefensible to say that Yudkowsky was shooting the messenger in this case. But, please note, your original comment was not about what Yudkowsky would do; it was about what the LW community in general would do. What did the LW community in general think about Yudkowsky’s response to Loosemore? They downvoted it to hell, and several of them continued to discuss things with Loosemore.

One rather prominent LWer (Kaj Sotala, who I think is an admin or a moderator or something of the kind here) wrote a lengthy post in which he opined that Loosemore (in the same paper that was being discussed when Yudkowsky called Loosemore an idiot) had an important point. (I think, though, that he would agree with me that Loosemore has not demonstrated that Yudkowsky-type nightmare scenarios are anything like impossible, contra Loosemore’s claim in that paper that “this entire class of doomsday scenarios is found to be logically incoherent at such a fundamental level that they can be dismissed”, which I think is the key question here. Sotala does agree with Loosemore that some concrete doomsday scenarios are very implausible.) He made a linkpost for that here on LW. How did the community respond? Well, that post is at +23, and there are a bunch of comments discussing it in what seem to me like constructive terms.

So, I reiterate: it seems to me that you’re making a large and unjustified leap from “Yudkowsky called Loosemore an idiot” to “LW should be expected to shoot the messenger”. Y and L had a history of repeatedly-unproductive interactions in the past; L’s paper pretty much called Y an idiot anyway (by implication, not as frankly as Y called L an idiot); there’s a pretty decent case to be made that L was an idiot in the relevant sense; other LWers did not shoot Loosemore even when EY did, and when his objections were brought up again a few years later there was no acrimony.

[EDITED to add:] And of course this is only one case; even if Loosemore were a 100% typical example of someone making an objection to EY’s arguments, and even if we were interested only in EY’s behaviour and not anyone else, the inference from “EY was obnoxious to RL” to “EY generally shoots the messenger” is still pretty shaky.

• My reconstruction of Loosemore’s point is that an AI wouldn’t have two sets of semantics, one for interpreting verbal commands, and another for negotiating the world and doing things.

My reconstruction of Yudkowsky’s argument is that it depends on what I’ve been calling the Ubiquitous Utility Function. If you think of any given AI as having a separate module where its goals or values are hard coded then the idea that they were hard coded wrong, but the AI is helpless to change them, is plausible.

Actual AI researchers don’t believe in ubiquitous UFs because only a few architectures have them. EY believes in them for reasons unconnected with empirical evidence about AI architectures.

• If Loosemore’s point is only that an AI wouldn’t have separate semantics for those things, then I don’t see how it can possibly lead to the conclusion that concerns about disastrously misaligned superintelligent AIs are absurd.

I do not think Yudkowsky’s arguments assume that an AI would have a separate module in which its goals are hard-coded. Some of his specific intuition-pumping thought experiments are commonly phrased in ways that suggest that, but I don’t think it’s anything like an essential assumption in any case.

E.g., consider the “paperclip maximizer” scenario. You could tell that story in terms of a programmer who puts something like “double objective_function() { return count_paperclips(DESK_REGION); }” in their AI’s code. But you could equally tell it in terms of someone who makes an AI that does what it’s told, and whose creator says “Please arrange for there to be as many paperclips as possible on my desk three hours from now.”.

(I am not claiming that any version of the “paperclip maximizer” scenario is very realistic. It’s a nice simple example to suggest the kind of thing that could go wrong, that’s all.)
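A minimal sketch of the hard-coded version of the story, in Python rather than the C-style one-liner above. Everything here is hypothetical: the `Plan` fields, the candidate plans, and the numbers are invented for illustration, and nobody is suggesting a real system would look like this. It only shows what "optimising the stated objective and nothing else" means.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    paperclips: int         # what the stated objective counts
    collateral_damage: int  # what the creator cares about but never stated

def objective_function(plan: Plan) -> float:
    # Analogue of `return count_paperclips(DESK_REGION);`
    # Note that collateral_damage does not appear anywhere.
    return float(plan.paperclips)

def choose(plans):
    """Pick whichever plan scores highest on the stated objective."""
    return max(plans, key=objective_function)

# Made-up candidate plans, for illustration only.
plans = [
    Plan("buy a box of paperclips", paperclips=100, collateral_damage=0),
    Plan("melt down the office furniture", paperclips=5_000, collateral_damage=50),
    Plan("convert everything in reach", paperclips=10**9, collateral_damage=10**9),
]
best = choose(plans)
```

The same selection could just as well come out of an AI that was given the instruction in words; the sketch says nothing about which design is more likely, which is the actual disagreement with Loosemore.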

Loosemore would say: this is a stupid scenario, because understanding human language in particular implies understanding that that isn’t really a request to maximize paperclips at literally any cost, and an AI that lacks that degree of awareness won’t be any good at navigating the world. I would say: that’s a reasonable hope but I don’t think we have anywhere near enough understanding of how AIs could possibly work to be confident of that; e.g., some humans are unusually bad at that sort of contextual subtlety, and some of those humans are none the less awfully good at making various kinds of things happen.

Loosemore claims that Yudkowsky-type nightmare scenarios are “logically incoherent at a fundamental level”. If all that’s actually true is that an AI triggering such a scenario would have to be somewhat oddly designed, or would have to have a rather different balance of mental capabilities than an average human being, then I think his claim is very very wrong.

• If Loosemore’s point is only that an AI wouldn’t have separate semantics for those things, then I don’t see how it can possibly lead to the conclusion that concerns about disastrously misaligned superintelligent AIs are absurd.

If there’s one principal argument that it is highly likely for an ASI to be an existential threat, then refuting it refutes claims about ASI and existential threat.

Maybe you think there are other arguments.

E.g., consider the “paperclip maximizer” scenario. You could tell that story in terms of a programmer who puts something like “double objective_function() { return count_paperclips(DESK_REGION); }” in their AI’s code. But you could equally tell it in terms of someone who makes an AI that does what it’s told, and whose creator says “Please arrange for there to be as many paperclips as possible on my desk three hours from now.”.

If it obeys verbal commands, you could tell it to stop at any time. That’s not a strong likelihood of existential threat. How could it kill us all in three hours?

Loosemore claims that Yudkowsky-type nightmare scenarios are “logically incoherent at a fundamental level”. If all that’s actually true is that an AI triggering such a scenario would have to be somewhat oddly designed,

I’ll say! It’s logically possible to design a car without brakes or a steering wheel, but it’s not likely. Now you don’t have an argument in favour of there being a strong likelihood of existential threat from ASI.

• If Loosemore’s point is only that an AI wouldn’t have separate semantics for “interpreting commands” and for “navigating the world and doing things”, then he hasn’t refuted “one principal argument” for ASI danger; he hasn’t refuted any argument for it that doesn’t actually assume that an AI must have separate semantics for those things. I don’t think any of the arguments actually made for ASI danger make that assumption.

I think the first version of the paperclip-maximizer scenario I encountered had the hapless AI programmer give the AI its instructions (“as many paperclips as possible by tomorrow morning”) and then go to bed, or something along those lines.

You seem to be conflating “somewhat oddly designed” with “so stupidly designed that no one could possibly think it was a good idea”. I don’t think Loosemore has made anything resembling a strong case for the latter; it doesn’t look to me as if he’s even really tried.

For Yudkowskian concerns about AGI to be worth paying attention to, it isn’t necessary that there be a “strong likelihood” of disaster if that means something like “at least a 25% chance”. Suppose it turns out that, say, there are lots of ways to make something that could credibly be called an AGI, and if you pick a random one that seems like it might work then 99% of the time you get something that’s perfectly safe (maybe for Loosemore-type reasons) but 1% of the time you get disaster. It seems to me that in this situation it would be very reasonable to have Yudkowsky-type concerns. Do you think Loosemore has given good reason to think that things are much better than that?
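A minimal sketch of the arithmetic behind the 99%/1% scenario. It assumes, purely for illustration, that each AGI project is an independent draw with a 1% chance of disaster; the independence assumption and the project counts are hypothetical and not part of the argument above.

```python
def p_at_least_one_disaster(p_disaster: float, n_projects: int) -> float:
    """Chance that at least one of n independent projects ends in disaster.

    Hypothetical model: every project fails independently with the same
    probability p_disaster.
    """
    if not 0.0 <= p_disaster <= 1.0:
        raise ValueError("p_disaster must be a probability")
    if n_projects < 0:
        raise ValueError("n_projects must be non-negative")
    return 1.0 - (1.0 - p_disaster) ** n_projects


if __name__ == "__main__":
    # Illustrative project counts only.
    for n in (1, 10, 100):
        print(n, round(p_at_least_one_disaster(0.01, n), 3))
```

Under these assumptions a 1% per-project risk compounds quickly as the number of projects grows, which is one way of seeing why a small per-attempt probability can still be worth worrying about.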

Here’s what seems to me the best argument that he has (but, of course, this is just my attempt at a steelman, and maybe your views are quite different): “Loosemore argues that if you really want to make an AGI then you would have to be very foolish to do it in a way that’s vulnerable to Yudkowsky-type problems, even if you weren’t thinking about safety at all. So potential AGI-makers fall into two classes: the stupid ones, and the ones who are taking approaches that are fundamentally immune to the failure modes Yudkowsky worries about. Yudkowsky hopes for intricate mathematical analyses that will reveal ways to build AGI safely, but the stupid potential AGI engineers won’t be reading those analyses, won’t be able to understand them, and won’t be able to follow their recommendations, and the not-stupid ones won’t need them. So Yudkowsky’s wasting his time.”

The main trouble with this is that I don’t see that Loosemore has made a good argument that if you really want to make an AGI then you’d be stupid to do it in a way that’s vulnerable to Yudkowsky-type concerns. Also, I think Yudkowsky hopes to find ways of thinking about AI that both make something like provable safety achievable and clarify what’s needed for AI in a way that makes it easier to make an AI at all, in which case, it might not matter what everyone else is doing.

In any case, this is all a bit of a sidetrack. The point is: Loosemore claimed that the sort of thing Yudkowsky worries about is “logically incoherent at [] a fundamental level”, but even being maximally generous to his arguments I think it’s obvious that he hasn’t shown that; there is a reasonable case to be made that he simply hasn’t understood some of what Yudkowsky has been saying; that is what Y meant by calling L a “permanent idiot”; whether or not detailed analysis of Y’s and L’s arguments ends up favouring one or the other, this is sufficient to suggest that (at worst) what we have here is a good ol’ academic feud where Y has a specific beef with L, which is not at all the same thing as a general propensity for messenger-shooting.

And, to repeat the actually key point: what Yudkowsky did on one occasion is not strong evidence for what the Less Wrong community at large should be expected to do on a future occasion, and I am still waiting (with little hope) for you to provide some of the actual examples you claim to have where the Less Wrong community at large responded with messenger-shooting to refutations of their central ideas. As mentioned elsewhere in the thread, my attempts to check your claims have produced results that point in the other direction; the nearest things I found to at-all-credibly-claimed refutations of central LW ideas met with positive responses from LW: upvotes, reasonable discussion, no messenger-shooting.

• “(I haven’t downvoted this question nor any of Haziq’s others; but my guess is that this one was downvoted because it’s only a question worth asking if Halpern’s counterexample to Cox’s theorem is a serious problem, which johnswentworth already gave very good reasons for thinking it isn’t in response to one of Haziq’s other questions; so readers may reasonably wonder whether he’s actually paying any attention to the answers his questions get. Haziq did engage with johnswentworth in that other question—but from this question you’d never guess that any of that had happened.)”

Sorry, haven’t checked LW in a while. I actually came across this comment when I was trying to delete my LW account due to the “shoot the messenger” phenomenon that TAG was describing.

I do not think that johnswentworth’s answer is satisfactory. In his response to my previous question, he claims that Cox’s theorem holds under very specific conditions which don’t occur in most cases. He also claims that probability as extended logic is justified by empirical evidence. I don’t think this is a good justification unless he happens to have an ACME plausibility-o-meter.

David Chapman, another messenger you (meaning LW) were too quick to shoot, explains the issues with subjective Bayesianism here:

I do agree that this framework is useful but only in the same sense that frequentism is useful. I consider myself a “pragmatic statistician” who doesn’t hesitate to use frequentist or Bayesian methods, as long as they are useful, because the justifications for either seem to be equally bad.

‘It might turn out that the way Cox’s theorem is wrong is that the requirements it imposes for a minimally-reasonable belief system need strengthening, but in ways that we would regard as reasonable. In that case there would still be a theorem along the lines of “any reasonable way of structuring your beliefs is equivalent to probability theory with Bayesian updates”.’

I find this statement to be quite disturbing because it seems to me that you are assuming Jaynes-Cox theory to be true first and then trying to find a proof for it. Sounds very much like confirmation bias. Van Horn’s paper could potentially revive Cox’s theorem but nobody’s talking about it because they are not ready to accept that Cox’s theorem has any issues in the first place.

I think the messenger-shooting is quite apparent in LW. It’s the reason why posts that oppose or criticise the “tenets of LW”, that LW members adhere to in a cult-like fashion, are so scarce. For instance, Chapman’s critique of LW seems to have been ignored altogether.

# “The most dangerous ideas in a society are not the ones being argued, but the ones that are assumed.”

C. S. Lewis

• I guess there’s not that much point responding to this, since Haziq has apparently now deleted his account, but it seems worth saying a few words.

• Haziq says he’s deleting his account because of LW’s alleged messenger-shooting, but I don’t see any sign that he was ever “shot” in any sense beyond this: one of his several questions received a couple of downvotes.

• What johnswentworth’s answer about Cox’s theorem says isn’t at all that it “holds under very specific conditions which doesn’t happen in most cases”.

• You’ll get no objection from me to the idea of being a “pragmatic statistician”.

• No, I am not at all “assuming Jaynes-Cox theory to be true first and then trying to find a proof for it”. I am saying: the specific scenario you describe (there’s a hole in the proof of Cox’s theorem) might play out in various ways and here are some of them. Some of them would mean something a bit like “Less Wrong is dead” (though, I claim, not exactly that); some of them wouldn’t. I mentioned some of both.

• I can’t speak for anyone else here, but for me Cox’s theorem isn’t foundational in the sort of way it sounds as if you think it is (or should be?). If Cox’s theorem turns out to be disastrously wrong, that would be very interesting, but rather little of my thinking depends on Cox’s theorem. It’s a bit as if you went up to a Christian and said “Is Christianity dead if the ontological argument is invalid?”; most Christians aren’t Christians because they were persuaded by the ontological argument, and I think most Bayesians (in the sense in which LW folks are mostly Bayesians) aren’t Bayesians because they were persuaded by Cox’s theorem.

• I do not know what it would mean to adhere to, say, Cox’s theorem “in a cult-like fashion”.

The second-last bullet point there is maybe the most important, and warrants a bit more explanation.

Whether anything “similar enough” to Cox’s theorem is true or not, the following things are (I think) rather uncontroversial:

• We should hold most (maybe all) of our opinions with some degree of uncertainty.

• One possible way of thinking about opinions-with-uncertainty is as probabilities.

• If we think about our beliefs this way, then there are some theorems telling us how to adjust them in the light of new evidence, how our beliefs about various logically-related propositions should be related, etc.

• No obviously-better general approach to quantifying strength of beliefs is known.

• To be clear, this doesn’t mean that nothing is known that is ever better at anything than probability theory with Bayesian updates. E.g., “provably approximately correct learning” isn’t (so far as I know) equivalent to anything Bayesian, and it gives some nice guarantees that (so far as I know) no Bayesian approach is known to give. So when what you want is what PACL theory gives you, you should be using PACL.
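As a concrete instance of the "theorems telling us how to adjust them in the light of new evidence", here is a minimal sketch of a single Bayesian update. The function name and the numbers are hypothetical, chosen only to illustrate Bayes' theorem.

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return the posterior P(H | E) given a prior P(H) and two likelihoods.

    Uses Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E),
    with P(E) = P(E | H) P(H) + P(E | not H) (1 - P(H)).
    """
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if p_e == 0.0:
        raise ValueError("evidence has zero probability under this model")
    return p_e_given_h * prior / p_e


if __name__ == "__main__":
    # Illustrative numbers: an even prior, and evidence four times as
    # likely if the hypothesis is true as if it is false.
    print(bayes_update(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2))
```

Evidence that is more likely under the hypothesis than under its negation raises the credence; evidence that is equally likely under both leaves it unchanged.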

For me, these are sufficient to justify a generally-Bayes-flavoured approach, by which I mean:

• Of course I don’t literally attach numerical probabilities to all my beliefs. Nor do I think it’s obvious that any real reasoner, given the finite resources we inevitably have, should be explicitly probabilistic about everything.

• But if for some reason I need to think clearly about how plausible I find various possibilities, I generally do it in terms of probabilities. (Taking care, e.g., to notice when there is danger of double-counting things, which is an easy way to go wrong when applying Bayesian probability naively.)

• If I notice that some element of how I think is outright inconsistent with this way of quantifying uncertainty, I consider whether it’s mistaken (albeit possibly a useful approximation). E.g., most people’s intuitive judgements of how likely things are produce instances of the (poorly named) “conjunction fallacy”, plausibly often because we tend to apply the “representativeness heuristic”; on reflection I think this genuinely does indicate places where our intuitive judgements are mistaken, and trying to notice it happening and do something cleverer is of some value.

(None of this feels very cultish to me.)
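The rule that intuitive judgements violate in the "conjunction fallacy" is that a conjunction can never be more probable than either of its parts. A minimal sketch, using a made-up joint distribution over two binary propositions A and B (the numbers are hypothetical):

```python
def marginals_and_conjunction(joint):
    """Return (P(A), P(B), P(A and B)) from a joint distribution.

    `joint` maps (a, b) pairs of booleans to probabilities summing to 1.
    """
    total = sum(joint.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("joint probabilities must sum to 1")
    p_a = sum(p for (a, _), p in joint.items() if a)
    p_b = sum(p for (_, b), p in joint.items() if b)
    p_a_and_b = joint.get((True, True), 0.0)
    return p_a, p_b, p_a_and_b


if __name__ == "__main__":
    # Hypothetical joint distribution, for illustration only.
    joint = {
        (True, True): 0.10,
        (True, False): 0.05,
        (False, True): 0.25,
        (False, False): 0.60,
    }
    p_a, p_b, p_ab = marginals_and_conjunction(joint)
    # The conjunction is never more probable than either conjunct.
    print(p_ab <= p_a, p_ab <= p_b)
```

Whatever non-negative numbers are put in the table, P(A and B) is one of the terms summed to get P(A) and P(B), so it cannot exceed either; an intuitive judgement that ranks the conjunction higher is inconsistent with any probability assignment.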

Finding a bug in the proof of Cox’s theorem doesn’t do anything to invalidate any of the above. Finding a concrete case where a structure other than probabilities-with-Bayesian-updates does better (in some sense of “better”) on a problem resembling actual real-world reasoning absolutely might; in particular, it might make it false that “no obviously-better general approach to quantifying strength of beliefs is known”. Halpern’s counterexample to Cox is not, so far as I can tell, like that; it depends essentially on a sort of “sparsity” that doesn’t hold if what you’re trying to assign credences to is all propositions you might consider about how the world is. I think (indeed, I think Halpern pointed this out, but I may be misremembering) you can fix up the proof by adding an assumption saying that this sort of sparsity doesn’t occur, and although that assumption is ugly and technical I think it is a reasonable assumption in real life; and in the most obvious regime where you can’t make that assumption (where everything is finite) the van Horn approach seems to yield essentially the same conclusions with a pretty modest set of assumptions that are reasonable there.

So far as I can tell by introspection, I’m not saying all this because I am determined not to admit the possibility that Cox’s theorem might be all wrong. It looks to me like it isn’t all wrong, because I’ve looked at the alleged issues and some alleged patches for them and I think the patches work and the issues are purely technical. But it could be all wrong, and that possibility feels more interesting than upsetting to me.

• I don’t think any of the arguments actually made for ASI danger make that assumption.

I do not think there’s actually a great variety of arguments for existential threat from AI. The arguments other than Dopamine Drip don’t add up to existential threat.

You seem to be conflating “somewhat oddly designed” with “so stupidly designed that no one could possibly think it was a good idea”

Who would have the best idea of what a stupid design is: the person who has designed AIs or the person who hasn’t? If this were any other topic, you would allow that practical experience counts.

The main trouble with this is that I don’t see that Loosemore has made a good argument

That’s irrelevant. The question is whether his argument is so bad it can be dismissed without being addressed.

Also, I think Yudkowsky hopes to find ways of thinking about AI that both make something like provable safety achievable and clarify what’s needed for AI in a way that makes it easier to make an AI at all, in which case, it might not matter what everyone else is doing.

If pure armchair reasoning works, then it doesn’t matter what everyone else is doing. But why would it work? There’s never been a proof of that—just a reluctance to discuss it.

• Even the “dopamine drip” argument does not make that assumption, even if some ways of presenting it do.

Loosemore hasn’t designed actually-intelligent AIs, any more than Yudkowsky has. In fact, I don’t see any sign that he’s designed any sort of AIs any more than Yudkowsky has. Both of them are armchair theorists with abstract ideas about how AI ought or ought not to work. Am I missing something? Has Loosemore produced any actual things that could reasonably be called AIs?

No one was dismissing Loosemore’s argument without addressing it. Yudkowsky dismissed Loosemore after having argued with him about AI for years.

I don’t know what your last paragraph means. I mean, connotationally it’s clear enough: it means “boo, Yudkowsky and his pals are dilettantes who don’t know anything and haven’t done anything valuable”. But beyond that I can’t make enough sense of it to engage with it.

• “If pure armchair reasoning works …”—what does that actually mean? Any sort of reasoning can work or not work. Reasoning that’s done from an armchair (so to speak) has some characteristic failure modes, but it doesn’t always fail.

• “Why would it work?”—what does that actually mean? It works if Yudkowsky’s argument is sound. You can’t tell that by looking at whether he’s sitting in an armchair; it depends on whether its (explicit and implicit) premises are true and whether the logic holds; Loosemore says there’s an implicit premise along the lines of “AI systems will have such-and-such structure” which is false; I say no one really knows much about the structure of actual human-level-or-better AI because no one is close to building one yet, I don’t see where Yudkowsky’s argument actually assumes what Loosemore says it does, and Loosemore’s counterargument is more or less “any human-or-better AI will have to work the way I want it to work, and that’s just obvious” and it isn’t obvious.

• “There’s never been a proof of that”—a proof of what, exactly? A proof that armchair reasoning works? (Again, what would that even mean? Some armchair reasoning works, some doesn’t.)

• “Just a reluctance to discuss it”—seems to me there’s been a fair bit of discussion of Loosemore’s claims on LW. (Including in the very discussion where Yudkowsky called him an idiot.) And, as I understand it, there was a fair bit of discussion between Yudkowsky and Loosemore, but by the time of that discussion Yudkowsky had decided Loosemore wasn’t worth arguing with. This doesn’t look to me like a “reluctance to discuss” in any useful sense. Yudkowsky discussed Loosemore’s ideas with Loosemore for a while and got fed up of doing so. Other LW people discussed Loosemore’s ideas (with Loosemore and I think with one another) and didn’t get particularly fed up. What exactly is the problem here, other than that Yudkowsky was rude?

• it’s just that this seems like the sort of question that a lot of LWers are very interested in

Seems to whom? It seems to me that a lot of less wrongers would messenger-shoot. Why do I have to provide evidence for the way things seem to me, but you don’t need to provide evidence of the way things seem to you?

BTW, in further evidence, Haziq’s question has been downvoted to −2.

I don’t understand what you mean by “That’s an objection that could be made to anything”.

Anything is not necessarily true.

• You don’t have to provide evidence. I’m asking you to because it would help me figure out how much truth there is in your accusation. You might be able to give reasons that don’t exactly take the form of evidence (in the usual sense of “evidence”), which might also be informative. If you can’t or won’t provide evidence, I’m threatening no adverse consequence other than that I won’t find your claim convincing.

If the fact that my original guess at what LW folks would do in a particular situation isn’t backed by anything more than my feeling that a lot of them would find the resulting mathematical and/​or philosophical questions fun to think about means that you don’t find my claim convincing, fair enough.

Anything is not necessarily true.

For sure. But my objection wasn’t “this is not necessarily true”, so I’m not sure why that’s relevant.

… Maybe I need to say explicitly that when I say that it’s “possible” to be both an AI researcher and what I take Eliezer to have meant by an idiot, I don’t merely mean that it’s not a logical impossibility, or that it’s not precluded by the laws of physics; I mean that, alas, foolishness is to be found pretty much everywhere, and it’s not tremendously unlikely that a given AI researcher is (in the relevant sense) an idiot. (Again, I agree that AI researchers are less likely to be idiots than, say, randomly chosen people.)

• You don’t have to provide evidence.

Not in absolute terms, no. But in relative terms, people are demanding that I supply evidence to support my guess, but not demanding the same from you.

Maybe I need to say explicitly that when I say that it’s “possible” to be both an AI researcher and what I take Eliezer to have meant by an idiot, I don’t merely mean that it’s not a logical impossibility, or that it’s not precluded by the laws of physics; I mean that, alas, foolishness is to be found pretty much everywhere, and it’s not tremendously unlikely that a given AI researcher is (in the relevant sense) an idiot.

Which, again, is just to say that the apparent ad hom was possibly true, which, again, is an excuse you could make for anything. Maybe Smith, whom Brown has accused of being a wife beater, actually is a wife beater.

• (Note: you posted two duplicate comments; I’ve voted this one up and the other one down so that there’s a clear answer to the question “which one is canonical?”. Neither the upvote nor the downvote indicates any particular view of the merits of the comment.)

• Well, maybe he is. If you’re going to use “Brown accused Smith of beating his wife” as evidence that Brown is terrible and so is everyone associated with him, it seems like some evidence that Brown’s wrong would be called for. (And saying “Smith is a bishop” would not generally be considered sufficient evidence, even though presumably most bishops don’t beat their wives.)

• That’s not how it works. An apparent ad hom is usually taken as evidence that an ad hom took place. You are engaging in special pleading. This is like the way that people who are suffering from confirmation bias will demand very high levels of evidence before they change their minds. Not that you are suffering from confirmation bias.

Brown is terrible and so is everyone associated with him,

Another wild exaggeration of what I said.

• The larger point here is that the link between “Eliezer Yudkowsky called Richard Loosemore an idiot” and “People on Less Wrong should be expected to shoot the messenger if someone turns up saying that something many of them believe is false” is incredibly tenuous.

I mean, to make that an actual argument you’d need something like the following steps.

• EY called RL an idiot.

• EY did not have sufficient grounds for calling RL an idiot.

• EY was doing it because RL disagreed with him.

• EY has/​had a general practice of attacking people who disagree with him.

• Other people on LW should be expected to behave the same way as EY.

• So if someone comes along expressing disagreement, we should expect people on LW to attack them.

I’ve been pointing out that the step from the first of those to the second is one that requires some justification, but the same is true of all the others.

So, anyway: you’re talking as if you’d said “EY’s comment was an ad hominem attack” and I’d said “No it wasn’t”, but actually neither of those is right. You just quoted EY’s comment and implied that it justified your opinion about the LW population generally; and what I said about it wasn’t that it wasn’t ad hominem. It was a personal attack (I wouldn’t use the specific term “ad hominem” because that too strongly suggests the fallacy of argumentum ad hominem, which is when you say “X’s claim is wrong because X is an idiot” or something like that, and that isn’t what EY was doing; but it was, for sure, a personal attack). I just don’t think it’s much evidence of a general messenger-shooting tendency on LW, and I think that to make it into evidence of that you’d need to justify each step in (something like) the chain of propositions above, and you haven’t made the slightest attempt to do so. And that, not whether it was an ad hominem, is what we are disagreeing about.

Some comments on those steps. First step: Yes, EY certainly called RL an idiot, though I don’t think what he meant by it was quite the usual meaning of “idiot” and in particular I think what he meant by it is more compatible with being a professional AI researcher than the usual meaning of “idiot” is; specifically, I think he meant something like “There are fundamental points in the arguments I’ve been making that RL obstinately fails to grasp, and it seems no amount of discussion will show him the error of his ways”. Obstinately failing to grasp a particular point is, alas, a rather common failure mode even of many otherwise very impressive human brains. Note that if EY is wrong about this, the most likely actual situation is that EY is obstinately failing to grasp a (correct) fundamental point being made by RL. So one way or another, a professional AI researcher is an idiot in the relevant sense. So that circumstance is not so extraordinary that when someone claims it’s so we should jump to the conclusion that they are being unreasonable.

Second step: Kinda: my understanding is that EY and RL had been around more or less the same argumentative circle many times and made no progress. I think EY would have been clearly justified in saying “either RL is an idiot or I am”; I shall not try to pass judgement on how reasonable it was for him to be confident about which of them was missing something fundamental.

Third step: No: I’m pretty sure EY was as forceful as he was because of past history of unproductive discussions with RL, and would likely not have said the same if someone else had raised the same issues as RL did, even though he’d have disagreed with them just as much.

Fourth step: Kinda; while I don’t think it would be fair to expect EY to attack someone just because they disagreed, I do think he is generally too quick to attack.

Fifth step: No, not at all; one person’s behaviour is not a reliable predictor of another’s. “Oh, but EY is super-high-status around here and everyone admires him!” Well, note for instance that the comment we’re discussing is sitting on a score of −23 right now; maybe the LW community admires Eliezer, but they don’t seem to admire this particular aspect.

Sixth and final step: Not really: as I already said, I don’t think EY has a general messenger-shooting policy, so even if the LW community imitated everything he did we would not be justified in expecting them to do that.

• It might also be proper to get downvotes for pointing out without an explanation something that clearly wouldn’t be happening. In any case, the analogy between this downvoting and the hypothetical coverup is unconvincing.

Not sure if I personally agree with the downvoting; some anti-echo-chamber injunctions might be good to uphold even in the face of a bounded amount of very strange claims. But maybe only those that come with some sort of explanation.

• something that clearly wouldn’t be happening.

Why ?

• “Clearly” and “it seems” are both the same, bad, argument. They both pass off a subjective assessment as a fact.

• My main criterion for up-voting comments and posts is whether I think others would be likely to benefit from reading them. This topic has come up a few times already with much better analysis, so I did not up-vote.

My main criterion for down-voting is whether I think it is actively detrimental to thought or discussion about a topic. Your post doesn’t meet that criterion either (despite the inflammatory title), so I did not down-vote it.

Your comment in this thread does meet that criterion, and I’ve down-voted it. It is irrelevant to the topic of the post, does not introduce any interesting argument, applies a single judgement without evidence to a diverse group of people, and is adversarial, casting disagreement with or mere lack of interest in your original point in terms of deliberate suppression of a point of view.

So no, you have not been down-voted for “pointing it out”. You have (at least in my case) been down-voted for poisoning the well.

• applies a single judgement without evidence to a diverse group of people,

So does this:

would expect LessWrongers to be working excitedly on figuring out what things need how much revision (or discarding completely)

• Yes, it does. In a charitable way.

• Exactly. There isn’t a generally followed rule that you can’t make sweeping assertions, or that every assertion must be supported by evidence. What people actually dislike is comments that portray rationalism negatively, and those are held to a much higher standard than positive comments. But of course, no one wants to state an explicit rule that “we operate a double standard”.

• I’m fine with someone commenting under a post: “I really liked that you wrote this”. I’m not fine with someone writing a comment that just contains “I really dislike that you wrote this”.

That’s a double standard and I’m happily arguing in favor of it as it makes the interaction in a forum more friendly.

• Are you fine with downvoting?

And what about the epistemic double standard ?

• Yes, I’m fine with someone downvoting lazy criticism. Having different standards for different things is good. It seems to me very strange to expect that standards should be the same for all actions.

If you look at medicine you see they have huge epistemic double standards for benefits and side effects of drugs.

If I imagine having the same epistemic standards for allowing people to claim “Bob is a rapist” and allowing them to claim “Bob has good humor” that would seem to me really strange. We even have laws that enforce that double standard because largely society believes that it’s good to have epistemic double standards in that regard.

• Allowing only evidence of certain form in some roles is a way of making it easier to judge when exploitability/​bias/​illegibility are expected to be an issue.

This is a tradeoff. It’s wasteful when it’s not actually needed, and often enough it’s impossible to observe the form without losing sight of the target.

• I can at least agree that:

We are, I think, dealing with that old problem of motivated cognition. As Gilovich says: “Conclusions a person does not want to believe are held to a higher standard than those they do.”

• That is only very vaguely related to what I was saying. I was essentially pointing out that even benign examples of double standards serve particular purposes that don’t always apply, and when they don’t, it’s best to get rid of the double standards.

• Are you in favour of downvoting lazy praise?

• Did you even read the last line of my comment?

I down-voted you for poisoning the well.

• When I quoted evidence of EY ad-homming someone?

• Most of what’s in CFAR’s handbook doesn’t depend on Cox’s theorem. Very little that happened on LessWrong in the last years is affected in any way. Most of what we talk about isn’t derived bottom-up from probability theory. Even for parts like credence calibration that are very much derived from it, whether Cox’s theorem is valid or not has little effect on the value of a practice like Tetlock-style forecasting.
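For illustration, forecasting practice of this kind is typically evaluated with a scoring rule rather than by appeal to any foundational theorem. The sketch below computes the Brier score, a standard scoring rule for probability forecasts; the forecasts and outcomes are made-up numbers, and the function name is hypothetical.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and outcomes.

    `forecasts` are probabilities in [0, 1]; `outcomes` are 1 if the event
    happened and 0 otherwise. Lower is better; always answering 0.5 scores 0.25.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


if __name__ == "__main__":
    # Illustrative data only.
    confident = brier_score([0.9, 0.7, 0.2], [1, 1, 0])
    hedging = brier_score([0.5, 0.5, 0.5], [1, 1, 0])
    print(confident < hedging)
```

A forecaster can track this score over many questions to see whether stated credences match observed frequencies, whatever one concludes about the status of Cox’s theorem.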

• I thought johnswentworth’s comment on one of your earlier posts, along with an ocean of evidence from experience, was adequate to make me feel that our current basic conception of probability is totally fine and not worth my time to keep thinking about.

• FWIW, Van Horn says:

“There has been much unnecessary controversy over Cox’s Theorem due to differing implicit assumptions as to the nature of its plausibility function. Halpern [11, 12] claims to demonstrate a counterexample to Cox’s Theorem by examining a finite problem domain, but his argument presumes that there is a different plausibility function for every problem domain.”