Thanks for giving your own model and description of the situation!
Regarding latent tendency: I don’t have a family history of psychosis (though I do of bipolar), although that doesn’t rule out a latent tendency. It’s also unclear what “latent tendency” means exactly; the concept kind of pretends that the real world is a 3-node Bayesian network (self’s tendency toward X, environment’s tendency to induce X, whether X actually happens) rather than a giant web of causality, though maybe there’s some way to specify it more precisely.
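To make that toy model concrete, here’s a minimal sketch of the 3-node network; all probabilities are invented placeholders, not claims about actual base rates:

```python
import random

# Toy 3-node Bayesian network: latent tendency and environment are
# independent parents of whether X (e.g., a psychotic episode) occurs.
# All numbers are arbitrary placeholders for illustration.

P_TENDENCY = 0.1      # P(self has latent tendency toward X)
P_ENVIRONMENT = 0.2   # P(environment tends to induce X)

# P(X | tendency, environment): the conditional probability table.
P_X = {
    (True, True): 0.8,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.01,
}

def sample():
    tendency = random.random() < P_TENDENCY
    environment = random.random() < P_ENVIRONMENT
    x = random.random() < P_X[(tendency, environment)]
    return tendency, environment, x
```

The objection is that a real causal history is a huge web of interacting factors, not three nodes, so “latent tendency” is underspecified as stated.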
I think the 4 factors you listed account for the vast majority of the explanation, so I partially agree with your “red herring” claim.
The “woo” language was causal, I think, mostly because I feared that others would apply coercion to me if I used it too much (even if I had a more detailed model that I could explain on request), and there was a bad feedback loop around thinking that I was crazy and/or that other people would think I was crazy, with other people playing into this.
I think I originally wrote about basilisk-type things in the post because I was very clearly freaking out about abstract evil at the time of the psychosis (basically a generalization of utility-function sign flips), and I thought Scott’s original comment would have led people to think I was thinking about evil mainly because of Michael, when actually I was thinking about evil for a variety of reasons. I was originally going to say “maybe all this modeling of adversarial/evil scenarios at my workplace contributed, but I’m not sure”, but an early reader said “actually, wait: based on what you’ve said, what you experienced later was a natural continuation of the previous stuff; you’re very much understating things” and suggested (an early version of) the last paragraph of the basilisk section, and that seemed likely enough to include.
It’s pretty clear that thinking about basilisk-y scenarios in the abstract was part of MIRI’s agenda (e.g. the Arbital article). Here’s a comment by Rob Bensinger saying it’s probably bad to try to make an AI that does a lot of interesting stuff and has a good time doing it, because that objective is too closely related to consciousness and might create a lot of suffering. (That statement references the “s-risk” concept, and if someone doesn’t know what that is and tries to find out, they could easily end up at a Brian Tomasik article recommending thinking about what it’s like to be dropped in lava.)
The thing is, it seems pretty hard to evaluate an abstract claim like Rob’s without thinking about details. I get that there are arguments against thinking about the details (e.g. it might drive you crazy or make you more extortable), but natural ways of approaching the abstract question (imagination / pattern completion / concretization / etc.) involve thinking about details, even if people at MIRI would in fact dis-endorse doing so. It would require a lot of compartmentalization to think about the question in the abstract without thinking about the details; some people are more disposed to that than others, and I expect compartmentalization of that sort to cause worse FAI research, e.g. because it might lead to treating “human values” as a LISP token.
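To illustrate the “LISP token” failure mode (a made-up toy, not anyone’s actual research code): a compartmentalized design can pass “human values” around as an opaque symbol, which type-checks fine but carries no content, whereas engaging with it forces you to commit to structure and details:

```python
# Toy illustration of the "LISP token" failure mode (invented example).
# Compartmentalized version: "human_values" is an opaque atom with no
# internal structure that the rest of the design merely passes around.
HUMAN_VALUES = "human_values"

def aligned_policy(action, values=HUMAN_VALUES):
    # Nothing here can inspect what the values actually are; the plan
    # "maximize human values" type-checks but is empty of content.
    return f"choose {action} so as to maximize {values}"

# Non-compartmentalized version: committing to internal structure means
# confronting details, which is exactly what was being avoided.
human_values_structured = {
    "suffering": {"sign": -1, "weight": None},    # details unresolved
    "flourishing": {"sign": +1, "weight": None},
}
```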
[EDIT: Just realized Buck Shlegeris (someone who recently left MIRI) recently wrote a post called “Worst-case thinking in AI alignment”… seems concordant with the point I’m making.]
Hmm… this could have come down to spending time in different parts of MIRI? I mostly worked on the “world’s last decent logic department” stuff—maybe the more “global strategic” aspects of MIRI’s work, at least the parts behind closed doors I wasn’t allowed through, were more toxic? That still feels kind of unlikely, but I’m missing info there, so it’s just a hunch.
My guess is that it has more to do with willingness to compartmentalize than with which part of MIRI you were in per se. Compartmentalization is negatively correlated with “taking on responsibility” for more of the problem. I’m sure you can see why it would be appealing to avoid giving in to extortion in real life, not just on whiteboards, and attempting that with a skewed model of the situation can lead to outlandish behavior like Ziz resisting arrest as hard as possible.
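As a toy sketch of why refusing to give in is appealing (numbers invented for the example): if extortionists only target agents they predict will cave, a credible never-cave policy can have lower expected cost even though it’s expensive whenever a threat is actually carried out:

```python
# Toy expected-cost sketch of extortion resistance (invented numbers).
# Assume threats arrive in proportion to how likely you are predicted
# to cave, so a known never-caver is rarely targeted at all.

COST_OF_CAVING = 10        # cost paid each time you give in
COST_OF_DEFYING = 100      # cost of a threat being carried out
BASE_THREAT_RATE = 1.0     # threats per period against a known caver

def expected_cost(p_cave):
    threat_rate = BASE_THREAT_RATE * p_cave
    per_threat = p_cave * COST_OF_CAVING + (1 - p_cave) * COST_OF_DEFYING
    return threat_rate * per_threat

print(expected_cost(1.0))  # always cave: 10.0 per period
print(expected_cost(0.0))  # never cave:  0.0 (never targeted)
```

The failure mode in the Ziz case, on this reading, is running the never-cave policy under a skewed model of what counts as extortion (e.g., treating arrest as a threat to be resisted).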
I think this is a persistent difference between us, but it isn’t especially relevant to the difference in outcomes here.
I’d more guess that the reason you had psychoses and I didn’t had to do with you having anxieties about being irredeemably bad that I basically didn’t have at the time. Seems like this would be correlated with your feeling like you grew up in a Shin Sekai Yori world?
I clearly had more scrupulosity issues than you, and that contributed a lot. Relevantly, the original Roko’s Basilisk post puts AI sci-fi detail on a fear I’m pretty sure a lot of EAs feel/felt in their hearts: that something nonspecifically bad will happen to them because they are able to help a lot of people (by being pivotal to the future), know this, and don’t do nearly as much as they could. If you’re already having these sorts of fears, then the abstract math of extortion and so on can look really threatening.