If Loosemore’s point is only that an AI wouldn’t have separate semantics for those things, then I don’t see how it can possibly lead to the conclusion that concerns about disastrously misaligned superintelligent AIs are absurd.
If there’s one principal argument that an ASI is highly likely to be an existential threat, then refuting that argument refutes the claim that ASI poses an existential threat.
Maybe you think there are other arguments.
E.g., consider the “paperclip maximizer” scenario. You could tell that story in terms of a programmer who puts something like “double objective_function() { return count_paperclips(DESK_REGION); }” in their AI’s code. But you could equally tell it in terms of someone who makes an AI that does what it’s told, and whose creator says “Please arrange for there to be as many paperclips as possible on my desk three hours from now.”.
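For concreteness, the two framings can be put in a toy sketch (everything here — `World`, `count_paperclips`, `interpret` — is hypothetical illustration, not anyone’s actual AI design; real natural-language interpretation is exactly the contested part):

```python
from dataclasses import dataclass

# Toy model of the "paperclip maximizer" story, told two ways.

@dataclass
class World:
    desk_paperclips: int

    def count_paperclips(self, region: str) -> int:
        # Only one region exists in this toy model.
        return self.desk_paperclips if region == "desk" else 0

# Framing 1: the objective is hard-coded by the programmer.
def hardcoded_objective(world: World) -> int:
    return world.count_paperclips(region="desk")

# Framing 2: the objective is derived from a natural-language command.
# Real interpretation is the contested part; here it is just a stub.
def interpret(command: str):
    if "paperclips" in command and "desk" in command:
        return lambda world: world.count_paperclips(region="desk")
    raise ValueError("command not understood")

world = World(desk_paperclips=7)
spoken_objective = interpret(
    "Please arrange for there to be as many paperclips as possible on my desk."
)
# Both routes end at the same optimization target.
assert hardcoded_objective(world) == spoken_objective(world) == 7
```

The point of the sketch is only that the two stories converge on the same objective, which is why the code-vs-command distinction doesn’t by itself defuse the scenario.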
If it obeys verbal commands, you could tell it to stop at any time. That’s not a strong likelihood of existential threat. How could it kill us all in three hours?
Loosemore claims that Yudkowsky-type nightmare scenarios are “logically incoherent at a fundamental level”. If all that’s actually true is that an AI triggering such a scenario would have to be somewhat oddly designed,
I’ll say! It’s logically possible to design a car without brakes or a steering wheel, but it’s not likely that anyone would. Now you don’t have an argument in favour of there being a strong likelihood of existential threat from ASI.
If Loosemore’s point is only that an AI wouldn’t have separate semantics for “interpreting commands” and for “navigating the world and doing things”, then he hasn’t refuted “one principal argument” for ASI danger; he hasn’t refuted any argument for it that doesn’t actually assume that an AI must have separate semantics for those things. I don’t think any of the arguments actually made for ASI danger make that assumption.
I think the first version of the paperclip-maximizer scenario I encountered had the hapless AI programmer give the AI its instructions (“as many paperclips as possible by tomorrow morning”) and then go to bed, or something along those lines.
You seem to be conflating “somewhat oddly designed” with “so stupidly designed that no one could possibly think it was a good idea”. I don’t think Loosemore has made anything resembling a strong case for the latter; it doesn’t look to me as if he’s even really tried.
For Yudkowskian concerns about AGI to be worth paying attention to, it isn’t necessary that there be a “strong likelihood” of disaster if that means something like “at least a 25% chance”. Suppose it turns out that, say, there are lots of ways to make something that could credibly be called an AGI, and if you pick a random one that seems like it might work then 99% of the time you get something that’s perfectly safe (maybe for Loosemore-type reasons) but 1% of the time you get disaster. It seems to me that in this situation it would be very reasonable to have Yudkowsky-type concerns. Do you think Loosemore has given good reason to think that things are much better than that?
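The arithmetic behind that hypothetical is worth making explicit: even a small per-design disaster probability compounds across many independent attempts. (The 1% figure is just the paragraph’s hypothetical, not an estimate.)

```python
# Probability that at least one of n independently chosen AGI designs
# is the disastrous kind, given per-design disaster probability p.
def p_any_disaster(p: float, n: int) -> float:
    return 1.0 - (1.0 - p) ** n

# With the hypothetical 1% per design:
print(round(p_any_disaster(0.01, 1), 3))    # 0.01
print(round(p_any_disaster(0.01, 100), 3))  # 0.634
```

So under these made-up numbers, a hundred independent attempts make disaster more likely than not — which is why “99% of designs are safe” isn’t by itself reassuring.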
Here’s what seems to me the best argument that he has (but, of course, this is just my attempt at a steelman, and maybe your views are quite different): “Loosemore argues that if you really want to make an AGI then you would have to be very foolish to do it in a way that’s vulnerable to Yudkowsky-type problems, even if you weren’t thinking about safety at all. So potential AGI-makers fall into two classes: the stupid ones, and the ones who are taking approaches that are fundamentally immune to the failure modes Yudkowsky worries about. Yudkowsky hopes for intricate mathematical analyses that will reveal ways to build AGI safely, but the stupid potential AGI engineers won’t be reading those analyses, won’t be able to understand them, and won’t be able to follow their recommendations, and the not-stupid ones won’t need them. So Yudkowsky’s wasting his time.”
The main trouble with this is that I don’t see that Loosemore has made a good argument that if you really want to make an AGI then you’d be stupid to do it in a way that’s vulnerable to Yudkowsky-type concerns. Also, I think Yudkowsky hopes to find ways of thinking about AI that both make something like provable safety achievable and clarify what’s needed for AI in a way that makes it easier to make an AI at all, in which case, it might not matter what everyone else is doing.
In any case, this is all a bit of a sidetrack. The point is: Loosemore claimed that the sort of thing Yudkowsky worries about is “logically incoherent at a fundamental level”, but even being maximally generous to his arguments I think it’s obvious that he hasn’t shown that; there is a reasonable case to be made that he simply hasn’t understood some of what Yudkowsky has been saying; that is what Y meant by calling L a “permanent idiot”; whether or not detailed analysis of Y’s and L’s arguments ends up favouring one or the other, this is sufficient to suggest that (at worst) what we have here is a good ol’ academic feud where Y has a specific beef with L, which is not at all the same thing as a general propensity for messenger-shooting.
And, to repeat the actually key point: what Yudkowsky did on one occasion is not strong evidence for what the Less Wrong community at large should be expected to do on a future occasion, and I am still waiting (with little hope) for you to provide some of the actual examples you claim to have where the Less Wrong community at large responded with messenger-shooting to refutations of their central ideas. As mentioned elsewhere in the thread, my attempts to check your claims have produced results that point in the other direction; the nearest things I found to at-all-credibly-claimed refutations of central LW ideas met with positive responses from LW: upvotes, reasonable discussion, no messenger-shooting.
“(I haven’t downvoted this question nor any of Haziq’s others; but my guess is that this one was downvoted because it’s only a question worth asking if Halpern’s counterexample to Cox’s theorem is a serious problem, which johnswentworth already gave very good reasons for thinking it isn’t in response to one of Haziq’s other questions; so readers may reasonably wonder whether he’s actually paying any attention to the answers his questions get. Haziq did engage with johnswentworth in that other question—but from this question you’d never guess that any of that had happened.)”
Sorry, haven’t checked LW in a while. I actually came across this comment when I was trying to delete my LW account due to the “shoot the messenger” phenomenon that TAG was describing.
I do not think that johnswentworth’s answer is satisfactory. In his response to my previous question, he claims that Cox’s theorem holds only under very specific conditions that don’t hold in most cases. He also claims that probability as extended logic is justified by empirical evidence. I don’t think this is a good justification unless he happens to have an ACME plausibility-o-meter.
David Chapman, another messenger you (meaning LW) were too quick to shoot, explains the issues with subjective Bayesianism here:
https://metarationality.com/probabilism-applicability
https://metarationality.com/probability-limitations
I do agree that this framework is useful, but only in the same sense that frequentism is useful. I consider myself a “pragmatic statistician” who doesn’t hesitate to use frequentist or Bayesian methods, as long as they are useful, because the justifications for either seem equally weak.
‘It might turn out that the way Cox’s theorem is wrong is that the requirements it imposes for a minimally-reasonable belief system need strengthening, but in ways that we would regard as reasonable. In that case there would still be a theorem along the lines of “any reasonable way of structuring your beliefs is equivalent to probability theory with Bayesian updates”.’
I find this statement to be quite disturbing because it seems to me that you are assuming Jaynes-Cox theory to be true first and then trying to find a proof for it. Sounds very much like confirmation bias. Van Horn’s paper could potentially revive Cox’s theorem but nobody’s talking about it because they are not ready to accept that Cox’s theorem has any issues in the first place.
I think the messenger-shooting is quite apparent in LW. It’s the reason why posts that oppose or criticise the “tenets of LW”, that LW members adhere to in a cult-like fashion, are so scarce. For instance, Chapman’s critique of LW seems to have been ignored altogether.
“The most dangerous ideas in a society are not the ones being argued, but the ones that are assumed.”
— C. S. Lewis
I guess there’s not that much point responding to this, since Haziq has apparently now deleted his account, but it seems worth saying a few words.
Haziq says he’s deleting his account because of LW’s alleged messenger-shooting, but I don’t see any sign that he was ever “shot” in any sense beyond this: one of his several questions received a couple of downvotes.
What johnswentworth’s answer about Cox’s theorem says isn’t at all that it “holds only under very specific conditions that don’t hold in most cases”.
You’ll get no objection from me to the idea of being a “pragmatic statistician”.
No, I am not at all “assuming Jaynes-Cox theory to be true first and then trying to find a proof for it”. I am saying: the specific scenario you describe (there’s a hole in the proof of Cox’s theorem) might play out in various ways and here are some of them. Some of them would mean something a bit like “Less Wrong is dead” (though, I claim, not exactly that); some of them wouldn’t. I mentioned some of both.
I can’t speak for anyone else here, but for me Cox’s theorem isn’t foundational in the sort of way it sounds as if you think it is (or should be?). If Cox’s theorem turns out to be disastrously wrong, that would be very interesting, but rather little of my thinking depends on Cox’s theorem. It’s a bit as if you went up to a Christian and said “Is Christianity dead if the ontological argument is invalid?”; most Christians aren’t Christians because they were persuaded by the ontological argument, and I think most Bayesians (in the sense in which LW folks are mostly Bayesians) aren’t Bayesians because they were persuaded by Cox’s theorem.
I do not know what it would mean to adhere to, say, Cox’s theorem “in a cult-like fashion”.
The second-last bullet point there is maybe the most important, and warrants a bit more explanation.
Whether anything “similar enough” to Cox’s theorem is true or not, the following things are (I think) rather uncontroversial:
We should hold most (maybe all) of our opinions with some degree of uncertainty.
One possible way of thinking about opinions-with-uncertainty is as probabilities.
If we think about our beliefs this way, then there are some theorems telling us how to adjust them in the light of new evidence, how our beliefs about various logically-related propositions should be related, etc.
No obviously-better general approach to quantifying strength of beliefs is known.
To be clear, this doesn’t mean that nothing is known that is ever better at anything than probability theory with Bayesian updates. E.g., “provably approximately correct learning” isn’t (so far as I know) equivalent to anything Bayesian, and it gives some nice guarantees that (so far as I know) no Bayesian approach is known to give. So when what you want is what PACL theory gives you, you should be using PACL.
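The “theorems telling us how to adjust beliefs in the light of new evidence” mentioned above come down, in the simplest case, to Bayes’ rule. A minimal sketch, with entirely made-up numbers:

```python
# Bayes' rule for a single hypothesis H and evidence E:
# P(H|E) = P(E|H) P(H) / (P(E|H) P(H) + P(E|~H) P(~H))
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    joint_h = p_e_given_h * prior
    joint_not_h = p_e_given_not_h * (1.0 - prior)
    return joint_h / (joint_h + joint_not_h)

# Made-up numbers: 1% prior, evidence 9x likelier under H than under not-H.
posterior = bayes_update(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.1)
print(round(posterior, 3))  # 0.083
```

Note the standard moral: strong evidence for a low-prior hypothesis still leaves a fairly low posterior, which is the kind of correction to intuition that makes the framework earn its keep.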
For me, these are sufficient to justify a generally-Bayes-flavoured approach, by which I mean:
Of course I don’t literally attach numerical probabilities to all my beliefs. Nor do I think it’s obvious that any real reasoner, given the finite resources we inevitably have, should be explicitly probabilistic about everything.
But if for some reason I need to think clearly about how plausible I find various possibilities, I generally do it in terms of probabilities. (Taking care, e.g., to notice when there is danger of double-counting things, which is an easy way to go wrong when applying Bayesian probability naively.)
If I notice that some element of how I think is outright inconsistent with this way of quantifying uncertainty, I consider whether it’s mistaken (albeit possibly a useful approximation). E.g., most people’s intuitive judgements of how likely things are produce instances of the (poorly named) “conjunction fallacy”, plausibly often because we tend to apply the “representativeness heuristic”; on reflection I think this genuinely does indicate places where our intuitive judgements are mistaken, and trying to notice it happening and do something cleverer is of some value.
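The constraint the conjunction fallacy violates is simply that P(A and B) can never exceed P(A). A toy check (the probability values here are arbitrary, chosen only to exercise the inequality):

```python
def conjunction_ok(p_a_and_b: float, p_a_and_not_b: float) -> bool:
    # The marginal P(A) decomposes as P(A and B) + P(A and not-B).
    p_a = p_a_and_b + p_a_and_not_b
    return p_a_and_b <= p_a  # always true for non-negative probabilities

# Intuition in Linda-style cases effectively ranks P(A and B) above P(A);
# no consistent probability assignment can do that:
assert conjunction_ok(0.25, 0.25)
assert all(conjunction_ok(x / 10, y / 10)
           for x in range(11) for y in range(11) if x + y <= 10)
```

The inequality holds for every possible assignment, which is why a judgement ranking the conjunction as more probable can’t be a mere difference of priors: it’s inconsistent with having probabilities at all.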
(None of this feels very cultish to me.)
Finding a bug in the proof of Cox’s theorem doesn’t do anything to invalidate any of the above. Finding a concrete case in which a structure other than probabilities-with-Bayesian-updates does better (in some sense of “better”) on a problem resembling actual real-world reasoning absolutely might; in particular, it might make it false that “no obviously-better general approach to quantifying strength of beliefs is known”. Halpern’s counterexample to Cox is not, so far as I can tell, like that; it depends essentially on a sort of “sparsity” that doesn’t hold if what you’re trying to assign credences to is all propositions you might consider about how the world is. I think (indeed, I think Halpern pointed this out, but I may be misremembering) you can fix up the proof by adding an assumption saying that this sort of sparsity doesn’t occur, and although that assumption is ugly and technical I think it is a reasonable assumption in real life; and in the most obvious regime where you can’t make that assumption—where everything is finite—the van Horn approach seems to yield essentially the same conclusions with a pretty modest set of assumptions that are reasonable there.
So far as I can tell by introspection, I’m not saying all this because I am determined not to admit the possibility that Cox’s theorem might be all wrong. It looks to me like it isn’t all wrong, because I’ve looked at the alleged issues and some alleged patches for them and I think the patches work and the issues are purely technical. But it could be all wrong, and that possibility feels more interesting than upsetting to me.
I don’t think any of the arguments actually made for ASI danger make that assumption.
I do not think there’s actually a great variety of arguments for existential threat from AI. The arguments other than the Dopamine Drip don’t add up to existential threat.
You seem to be conflating “somewhat oddly designed” with “so stupidly designed that no one could possibly think it was a good idea”
Who would have the best idea of what a stupid design is... the person who has designed AIs or the person who hasn’t? If this were any other topic, you would allow that practical experience counts.
The main trouble with this is that I don’t see that Loosemore has made a good argument
That’s irrelevant. The question is whether his argument is so bad it can be dismissed without being addressed.
Also, I think Yudkowsky hopes to find ways of thinking about AI that both make something like provable safety achievable and clarify what’s needed for AI in a way that makes it easier to make an AI at all, in which case, it might not matter what everyone else is doing.
If pure armchair reasoning works, then it doesn’t matter what everyone else is doing. But why would it work? There’s never been a proof of that—just a reluctance to discuss it.
Even the “dopamine drip” argument does not make that assumption, even if some ways of presenting it do.
Loosemore hasn’t designed actually-intelligent AIs, any more than Yudkowsky has. In fact, I don’t see any sign that he’s designed any sort of AIs any more than Yudkowsky has. Both of them are armchair theorists with abstract ideas about how AI ought or ought not to work. Am I missing something? Has Loosemore produced any actual things that could reasonably be called AIs?
No one was dismissing Loosemore’s argument without addressing it. Yudkowsky dismissed Loosemore after having argued with him about AI for years.
I don’t know what your last paragraph means. I mean, connotationally it’s clear enough: it means “boo, Yudkowsky and his pals are dilettantes who don’t know anything and haven’t done anything valuable”. But beyond that I can’t make enough sense of it to engage with it.
“If pure armchair reasoning works …”—what does that actually mean? Any sort of reasoning can work or not work. Reasoning that’s done from an armchair (so to speak) has some characteristic failure modes, but it doesn’t always fail.
“Why would it work?”—what does that actually mean? It works if Yudkowsky’s argument is sound. You can’t tell that by looking at whether he’s sitting in an armchair; it depends on whether its (explicit and implicit) premises are true and whether the logic holds; Loosemore says there’s an implicit premise along the lines of “AI systems will have such-and-such structure” which is false; I say no one really knows much about the structure of actual human-level-or-better AI because no one is close to building one yet, I don’t see where Yudkowsky’s argument actually assumes what Loosemore says it does, and Loosemore’s counterargument is more or less “any human-or-better AI will have to work the way I want it to work, and that’s just obvious” and it isn’t obvious.
“There’s never been a proof of that”—a proof of what, exactly? A proof that armchair reasoning works? (Again, what would that even mean? Some armchair reasoning works, some doesn’t.)
“Just a reluctance to discuss it”—seems to me there’s been a fair bit of discussion of Loosemore’s claims on LW. (Including in the very discussion where Yudkowsky called him an idiot.) And, as I understand it, there was a fair bit of discussion between Yudkowsky and Loosemore, but by the time of that discussion Yudkowsky had decided Loosemore wasn’t worth arguing with. This doesn’t look to me like a “reluctance to discuss” in any useful sense. Yudkowsky discussed Loosemore’s ideas with Loosemore for a while and got fed up of doing so. Other LW people discussed Loosemore’s ideas (with Loosemore and I think with one another) and didn’t get particularly fed up. What exactly is the problem here, other than that Yudkowsky was rude?