Raemon comments on Thomas Kwa’s MIRI research experience

Raemon 3 Oct 2023 21:46 UTC
11 points
9
This isn’t quite how I’d frame the question.
[edit: My understanding is that] Eliezer and Nate believe this. I think it’s quite reasonable for other people to be skeptical of it.
Nate and Eliezer can choose to only work closely/mentor people who opt into some kind of confidentiality clause about it. People who are skeptical or don’t think it’s worth the costs can choose not to opt into it.
I have heard a few people talk about MIRI confidentiality norms being harmful to them in various ways, so I do also think it’s quite reasonable for people to be more cautious about opting into working with Nate or Eliezer if they don’t think it’s worth the cost.
Presumably, Nate/Eliezer aren’t willing to talk much about this precisely because they think it’d leak capabilities. You might think they’re wrong, or that they haven’t justified that, but, like, the people who have a stake in this are the people who are deciding whether to work with them. (I think there’s also a question of “should Eliezer/Nate have a reputation as people who have a mindset that’s good for alignment and capabilities that’d be bad to leak?”, and I’d say the answer should be “not any moreso than you can detect from their public writings, and whatever your personal chains of trust with people who have worked closely with them that you’ve talked to.”)
I do think this leaves some problems. I have heard about the MIRI confidentiality norms being fairly paralyzing for some people in important ways. But something about the Muireall’s comment felt like a wrong frame to me.
- So8res 4 Oct 2023 18:29 UTC
  31 points
  23
  Parent
  (I am pretty uncomfortable with all the “Nate / Eliezer” going on here. Let’s at least let people’s misunderstandings of me be limited to me personally, and not bleed over into Eliezer!)
  
  (In terms of the allegedly-extraordinary belief, I recommend keeping in mind jimrandomh’s note on Fork Hazards. I have probability mass on the hypothesis that I have ideas that could speed up capabilities if I put my mind to it, as is a very different state of affairs from being confident that any of my ideas works. Most ideas don’t work!)
  
  (Separately, the infosharing agreement that I set up with Vivek—as was perhaps not successfully relayed to the rest of the team, though I tried to express this to the whole team on various occasions—was one where they owe their privacy obligations to Vivek and his own best judgements, not to me.)
  - Raemon 4 Oct 2023 19:16 UTC
    2 points
    0
    Parent
    (Separately, the infosharing agreement that I set up with Vivek—as was perhaps not successfully relayed to the rest of the team, though I tried to express this to the whole team on various occasions—was one where they owe their privacy obligations to Vivek and his own best judgements, not to me.)
    That’s useful additional information, thanks.
    I made a slight edit to my previous comment to make my epistemic state more clear.
    Fwiw, I feel like I have a pretty crisp sense of “Nate and Eliezers communication styles are actually pretty different” (I noticed myself writing out a similar comment about communication styles under the Turntrout thread that initially said “Nate and Eliezer” a lot, and then decided that comment didn’t make sense to publish as-is), but I don’t actually have much of a sense of the difference between Nate, Eliezer, and MIRI-as-a-whole with regards to “the mindset” and “confidentiality norms”.
- Muireall 3 Oct 2023 22:38 UTC
  1 point
  0
  Parent
  Sure. I only meant to use Thomas’s frame, where it sounds like Thomas did originally accept Nate’s model on some evidence, but now feels it wasn’t enough evidence. What was originally persuasive enough to opt in? I haven’t followed all Nate’s or Eliezer’s public writing, so I’d be plenty interested in an answer that draws only from what someone can detect from their public writing. I don’t mean to demand evidence from behind the confidentiality screen, even if that’s the main kind of evidence that exists.
  Separately, I am skeptical and a little confused as to what this could even look like, but that’s not what I meant to express in my comment.