I hereby push back against the (implicit) narrative that I find the standard community norms costly, or that my communication protocols are “alternative”.
My model is closer to: the world is a big place these days, different people run on different conversation norms. The conversation difficulties look, to me, symmetric, with each party violating norms that the other considers basic, and failing to demonstrate virtues that the other considers table-stakes.
(To be clear, I consider myself to bear an asymmetric burden of responsibility for the conversations going well, according to my seniority, which is why I issue apologies instead of critiques when things go off the rails.)
Separately but relatedly: I think the failure-mode I had with Vivek & co was rather different than the failure-mode I had with you. In short: in your case, I think the issue was rooted in a conversational dynamic that caused me frustration, whereas in Vivek & co’s case, I think the issue was rooted in a conversational dynamic that caused me despair.
Which is not to say that the issues are wholly independent; my guess is that the common-cause is something like “some people take a lot of damage from having conversations with someone who despairs of the conversation”.
Tying this back: my current model of the situation is not that I’m violating community norms about how to have a conversation while visibly hopeless, but am rather in uncharted territory by trying to have those conversations at all.
(For instance: standard academia norms as I understand them are to lie to yourself and/or others about how much hope you have in something, and/or swallow enough of the modesty-pill that you start seeing hope in places I would not, so as to sidestep the issue altogether. Which I’m not personally up for.)
([tone: joking but with a fragment of truth] …I guess that the other norm in academia when academics are hopeless about others’ research is “have feuds”, which… well we seem to be doing a fine job by comparison to the standard norms, here!)
Where, to be clear, I already mostly avoid conversations where I’m hopeless! I’m mostly a hermit! The obvious fix of “speak to fewer people” is already being applied!
And beyond that, I’m putting in rather a lot of work (with things like my communication handbook) to making my own norms clearer, and I follow what I think are good meta-norms of being very open to trying other people’s alternative conversational formats.
I’m happy to debate what the local norms should be, and to acknowledge my own conversational mistakes (of which I have made plenty), but I sure don’t buy a narrative that I’m in violation of the local norms.
(But perhaps I will if everyone in the comments shouts me down! Local norms are precisely the sort of thing that I can learn about by everyone shouting me down about this!)
I sure don’t buy a narrative that I’m in violation of the local norms.
This is preposterous.
I’m not going to discuss specific norms. Discussing norms with Nate leads to an explosion of conversational complexity.[1] In my opinion, such discussion can sound really nice and reasonable, until you remember that you just wanted him to e.g. not insult your reasoning skills and instead engage with your object-level claims… but somehow your simple request turns into a complicated and painful negotiation. You never thought you’d have to explain “being nice.”
Then—in my experience—you give up trying to negotiate anything from him and just accept that he gets to follow whatever “norms” he wants.
So, in order to evaluate whether Nate is “following social norms”, let’s not think about the norms themselves. I’m instead going to share some more of the interactions Nate has had:
“Flipping out” at Kurt Brown because there wasn’t enough sourdough bread.
Storming out of the room because Kurt had a question about pumping Nate’s tires.
A chat with me, where Nate describes himself as “visibly flustered, visibly frustrated, had a raised voice, and was being mean in various of his replies.”
An employee reporting fear and dread at the prospect of meeting with Nate,
People crying while talking with Nate about research,
ETA: Peter Barnett writes “I think it’s kind of hard to see how bad/annoying/sad this is until you’re in it.”
(There are people who want to speak out, but have not yet. I am not at liberty to detail their cases myself.)
EDIT: Nate did not warn most of these people beforehand. It seems to me that Nate is still balking at requests to warn people.
See the second half of Nate’s response to Akash, starting with “These sound more to me like”.
Regarding my own experience: I would destroy 10% of my liquidity to erase from my life that conversation and its emotional effects.
I don’t think it’s reasonable to expect Nate to have predicted that I in particular would be hurt so much. But in fact, being unexpectedly and nonconsensually mean and aggressive and hurtful has heavy negative tails.
And so statistically, this level of harm is totally predictable, both on priors and based off of past experience which I know Nate has.
This seems like a fairly extreme statement, so I was about to upvote due to the courage required to post it publicly and stand behind it. But then I stopped and thought about the long term effects and it’s probably best not to encourage this.
As ideally, you, along with the vast majority of potential readers, should become less emotionally reactive over time to any real or perceived insults, slights, etc...
If it’s the heat of the moment talking, that’s fine, but letting thoughts of payback, revenge, etc., linger on for days afterwards likely will not lead to any positive outcome.
As ideally, you, along with the vast majority of potential readers, should become less emotionally reactive over time to any real or perceived insults, slights, etc...
If it’s the heat of the moment talking, that’s fine, but letting thoughts of payback, revenge, etc., linger on for days afterwards likely will not lead to any positive outcome.
I have had these thoughts many times. I would berate myself for letting it get on my nerves so much. It was just an hour-and-a-half chat. But I don’t think it’s a matter of “letting” thoughts occur, or not. Certain situations are damaging to certain people, and this situation isn’t a matter of whether people are encouraged to be damaged or not (I certainly had no expectation of writing about this, back in July–October 2022.)
EDIT: Moving another part elsewhere.
I upvoted for effort, since it’s clear you put quite a bit of work into writing this comment, but I skipped expressing agreement or disagreement.
I had thought of several possible responses, and it is worthy of a substantial response, but since it’s not my role to be the adjudicator or corrector of LW users, I’ll pose you this question:
Consider, is it possible for him to take offence in return and then retaliate via some mean(s)? If that does occur, what’s the range of likely outcomes?
(Note that I moved most of the original comment and plan to put it elsewhere in the thread.)
Consider, is it possible for him to take offence in return and then retaliate via some mean(s)? If that does occur, what’s the range of likely outcomes?
I don’t follow. I’m not going to behave differently in the face of any possible retaliation, nor do I in fact expect Nate to retaliate in an inappropriate manner. So I’m not worried about this?
[...] I was about to upvote due to the courage required to post it publicly and stand behind it. But then I stopped and thought about the long term effects and it’s probably best not to encourage this. [...] As ideally, you, along with the vast majority of potential readers, should become less emotionally reactive over time to any real or perceived insults, slights, etc...
It seems weird to single out this specific type of human limitation (compared to perfect-robot instrumental rationality) over the hundreds of others. If someone isn’t in top physical shape or cannot drive cars under difficult circumstances or didn’t renew their glasses and therefore doesn’t see optimally, would you also be reluctant to upvote comments you were otherwise tempted to upvote (where they bravely disclose some limitation) because of this worry about poor incentives? “Ideally,” in a world where there’s infinite time so there are no tradeoffs for spending self-improvement energy, rationalists would all be in shape, have brushed up their driving skills, have their glasses updated, etc. In reality, it’s perfectly fine/rational to deprioritize many things that are “good to have” because other issues are more pressing, more immediately deserving of self-improvement energy. (Not to mention that rationality for its own sake is lame anyway and so many of us actually want to do object-level work towards a better future.) What to best focus on with self-improvement energy will differ a lot from person to person, not only because people have different strengths and weaknesses, but also because they operate in different environments. (E.g., in some environments, one has to deal with rude people all the time, whereas in others, this may be a rare occurrence.)
For all these reasons, it seems weirdly patronizing to try to shape other people’s prioritization for investing self-improvement energy. This isn’t to say that this site/community shouldn’t have norms and corresponding virtues and vices. Since LW is about truth-seeking, it makes sense to promote virtues directly related to truth-seeking, e.g., by downvoting comments that exhibit poor epistemic practices.
However, my point is that even though it might be tempting to discourage not just poor epistemic rationality but also poor instrumental rationality, these two work very differently, especially as far as optimal incentive-setting is concerned. Epistemic rationality is an ideal we can more easily enforce and get closer towards. Instrumental rationality, by contrast, is a giant jungle that people are coming into from all kinds of different directions. “Having unusually distracting emotional reactions to situations xyz” is one example of suboptimal instrumental rationality, but so is “being in poor physical shape,” or “not being able to drive a car,” or “not having your glasses updated,” etc. I don’t think it makes sense for the community to create a hierarchy of “most important facets of instrumental rationality” that’s supposed to apply equally to all kinds of people. (Instead, I think it makes more sense to reward meta-skills of instrumental rationality, such as “try to figure out what your biggest problems are and really prioritize working on them.”) (If we want to pass direct judgment on someone’s prioritization of self-improvement energy, we need to know their exact situation and goals and the limitations they have, how good they are at learning various things, etc.)
Not to mention the unwelcoming effects when people get judged for limitations of instrumental rationality that the community for some reason perceives to be particularly bad. Such things are always more personal (and therefore more unfair) than judging someone for having made a clear error of reasoning (epistemic rationality).
(I say all of this as though it’s indeed “very uncommon” to feel strongly hurt and lastingly affected by particularly harsh criticism. I don’t even necessarily think that this is the case: If the criticism comes from a person with high standing in a community one cares about, it seems like a potentially quite common reaction?)
I say all of this as though it’s indeed “very uncommon” to feel strongly hurt and lastingly affected by particularly harsh criticism. I don’t even necessarily think that this is the case: If the criticism comes from a person with high standing in a community one cares about, it seems like a potentially quite common reaction?
This is relevant context for my strong reaction. I used to admire Nate, and so I was particularly upset when he treated me disrespectfully. (The experience wasn’t so much “criticism” as “aggression and meanness”, though.)
FWIW, I also reject the framing that this situation is reasonably understood as an issue with my own instrumental rationality.
Going back to the broader point about incentives, it’s not very rewarding to publicly share a distressing experience and thereby allow thousands of internet strangers to judge my fortitude, and complain if they think it lacking. I’m not walking away from this experience feeling lavished and reinforced for having experienced an emotional reaction.
Furthermore, the reason I spoke up was mostly not to litigate my own experience. It’s because I’ve spent months witnessing my friends take unexpected damage from a powerful individual who appears to have faced basically no consequences for his behavior.
It seems weird to single out this specific type of human limitation (compared to perfect-robot instrumental rationality) over the hundreds of others.
This is a minor error but I feel the need to correct it for future readers, as it’s in the first sentence. There are infinitely many ‘specific types’ of human limitations, or at least an uncountable quantity, depending on the reader’s preferred epistemology.
The rest of your thesis is interesting though a bit difficult to parse. Could you isolate a few of the key points and present them in a list?
I wasn’t the one who downvoted your reply (seems fair to ask for clarifications), but I don’t want to spend much more time on this and writing summaries isn’t my strength. Here’s a crude attempt at saying the same thing in fewer and different words:
IMO, there’s nothing particularly “antithetical to LW aims/LW culture” (edit: “antithetical to LW aims/LW culture” is not a direct quote by anyone; but it’s my summary interpretation of why you might be concerned about bad incentives in this case) about neuroticism-related “shortcomings.” “Shortcomings” compared to a robotic ideal of perfect instrumental rationality. By “neuroticism-related ‘shortcomings’”, I mean things like having triggers or being unusually affected by harsh criticism. It’s therefore weird and a bit unfair to single out such neuroticism-related “shortcomings” over things like “being in bad shape” or “not being good at common life skills like driving a car.” (I’m guessing that you wouldn’t be similarly concerned about setting bad incentives if someone admitted that they were bad at driving cars or weren’t in the best shape.)
I’m only guessing here, but I wonder about rationalist signalling cascades about the virtues of rationality, where it gets rewarded to be particularly critical about things that least correspond to the image of what an ideally rational robot would be like. However, in reality, applied rationality isn’t about getting close to some ideal image. Instead, it’s about making the best out of what you have, taking the best next move step-by-step for your specific situation, always prioritizing what actually gets you to your goals rather than prioritizing “how do I look as though I’m very rational.”
Not to mention that high emotionality confers advantages in many situations and isn’t just an all-out negative. (See also TurnTrout’s comment about rejecting the framing that this is an issue of his instrumental rationality being at fault.)
I don’t mind the occasional downvote or negative karma; it even has some positive benefits, such as serving a useful signalling function, as it’s decent evidence I haven’t tailored my comments for popularity or platitudes.
In regards to your points, I’ll only try to respond to them one at a time, since this is already pretty far down the comment chain.
IMO, there’s nothing particularly “antithetical to LW aims/LW culture” about neuroticism-related “shortcomings” (compared to a robotic ideal of perfect instrumental rationality) like having triggers or being unusually affected by harsh criticism.
Who suggested that there was a relation between being “antithetical to LW aims/LW culture” and “neuroticism-related ‘shortcomings’”?
i.e. Is it supposed to be my idea, TurnTrout’s, yours, a general sentiment, something from the collective unconscious, etc.?
I made an edit to my above comment to address your question; it’s probably confusing that I used quotation marks for something that wasn’t a direct quote by anyone.
as a kind of semantic brackets; I think the official way to do this is to write-the-words-connected-by-hyphens, but that just seems hard to read;
to remove a possible connotation, i.e. to signal that I am using the word not exactly as most people would probably use it in a similar situation;
or as a combination of both, something like: I am using these words to express an idea, but these are probably not the right words, but I can’t find any better, so please do not take this part literally and don’t start nitpicking (don’t assume that I used a specific word because I wanted to hint at something specific).
For example, as I understand it,
“Shortcomings” compared to a robotic ideal of perfect instrumental rationality.
Means: things that technically are shortcomings (because they deviate from some ideal), but also a reasonable person wouldn’t call them so (because it is a normal human behavior, and I would actually be very suspicious about anyone who claimed to not have any of them), so the word is kinda correct but also kinda incorrect. But it is a way to express what I mean.
But to clarify, this is not the reason why I ‘might be concerned about bad incentives in this case’, if you were wondering.
Sounds like I misinterpreted the motivation behind your original comment!
I ran out of energy to continue this thread/conversation, but feel free to clarify what you meant for others (if you think it isn’t already clear enough for most readers).
(For completeness, I want to note that I’ve talked with a range of former/current MIRI employees, and a non-trivial fraction did have basically fine interactions with Nate.)
According to my notes and emails, Nate repeatedly said things like “I have not yet ruled out [uncharitable hypothesis about how TurnTrout is reasoning]” in order to—according to him—accomplish his conversational objectives / because his “polite” statements were apparently not getting his point across. I don’t remember what specific uncharitable things he said (the chat was on 7/19/22).
Huh, I initially found myself surprised that Nate thinks he’s adhering to community norms. I wonder if part of what’s going on here is that “community norms” is a pretty vague phrase that people can interpret differently.
Epistemic status: Speculative. I haven’t had many interactions with Nate, so I’m mostly going off of what I’ve heard from others + general vibes.
Some specific norms that I imagine Nate is adhering to (or exceeding expectations in):
Honesty
Meta-honesty
Trying to offer concrete models and predictions
Being (internally) open to acknowledging and recognizing mistakes, saying oops, etc.
Some specific norms that I think Nate might not be adhering to:
Engaging with people in ways such that they often feel heard/seen/understood
Engaging with people in ways such that they rarely feel dismissed/disrespected
Something fuzzy that lots of people would call “kindness” or “typical levels of warmth”
I’m guessing that some people think that social norms dictate something like “you are supposed to be kind and civil and avoid making people unnecessarily sad/insecure/defensive.” I wonder if Nate (a) believes that these are community norms and thinks he’s following them or (b) just doesn’t think these are community norms in the first place.
Tying this back: my current model of the situation is not that I’m violating community norms about how to have a conversation while visibly hopeless, but am rather in uncharted territory by trying to have those conversations at all.
I think this explains some of the effect, but not all of it. In academia, for instance, I think there are plenty of conversations in which two researchers (a) disagree a ton, (b) think the other person’s work is hopeless or confused in deep ways, (c) honestly express the nature of their disagreement, but (d) do so in a way where people generally feel respected/valued when talking to them.
Like, it’s certainly easier to make people feel heard/seen if you agree with them a bunch and say their ideas are awesome, but of course that would be dishonest [for Nate].
So I could see a world where Nate is like “Darn, the reason people seem to find communicating with me to be difficult is that I’m just presenting them with the harsh truths, and it is indeed hard for people to hear harsh truths.”
But I think some people possess the skill of “being able to communicate harsh truths accurately in ways where people still find the interaction kind, graceful, respectful, and constructive.” And my understanding is that’s what people like TurnTrout are wishing for.
Engaging with people in ways such that they often feel heard/seen/understood
This is not a reasonable norm. In some circumstances (including, it sounds like, some of the conversations under discussion) meeting this standard would require a large amount of additional effort, not related to the ostensible reason for talking in the first place.
Engaging with people in ways such that they rarely feel dismissed/disrespected
Again, a pretty unreasonable norm. For some topics, such as “is what you’re doing actually making progress towards that thing you’ve arranged your life (including social context) around making progress on?”, it’s very easy for people to feel this way, even if they are being told true, useful, relevant things.
Something fuzzy that lots of people would call “kindness” or “typical levels of warmth”
Ditto, though significantly less strongly; I do think there’s ways to do this that stay honest and on-mission without too much tradeoff.
I think it’s not a reasonable norm to make sure your interlocutors never e.g. feel dismissed/disrespected, but it is reasonable to take some measures to avoid having someone consistently feel dismissed/disrespected if you spend over 200 hours talking with their team and loosely mentoring them (which to be clear Nate did, it’s just difficult in his position and so was only mildly successful).
I’m not sure kindness/warmth should even be a norm because it’s pretty difficult to define.
The details matter here; I don’t feel I can guess from what you’ve said whether we’d agree or not.
For example:
Tam: says some idea about alignment
Newt: says some particular flaw “...and this is an instance of a general problem, which you’ll have to address if you want to make progress...” gestures a bit at the general problem
Tam: makes a tweak to the proposal that locally addresses the particular flaw
Newt: “This still doesn’t address the problem.”
Tam: “But it seems to solve the concrete problem, at least as you stated it. It’s not obvious to me that there’s a general problem here; if we can solve instances of it case-by-case, that seems like a lot of progress.”
Newt: “Look, we could play this game for some more rounds, where you add more gears and boxes to make it harder to see that there’s a problem that isn’t being addressed at all, and maybe after a few rounds you’ll get the point. But can we just skip ahead to you generalizing to the class of problem, or at least trying to do that on your own?”
Tam: feels dismissed/disrespected
I think Newt could have been more graceful and more helpful, e.g. explicitly stating that he’s had a history of conversations like this, and setting boundaries about how much effort he feels excited about putting in, and using body language that is non-conflictual… But even if he doesn’t do that, I don’t really think he’s violating a norm here. And depending on context this sort of behavior might be about as well as Newt can do for now.
You can choose to ignore all these “unreasonable norms”, but they still have consequences. Such as people thinking you are an asshole. Or leaving the organization because of you. It is easy to underestimate these costs, because most of the time people won’t tell you (or they will, but you will ignore them and quickly forget).
This is a cost that people working with Nate should not ignore, even if Nate does.
I see three options:
try making Nate change—this may not be possible, but I think it’s worth trying;
isolate Nate from… well, everyone else, except for volunteers who were explicitly warned;
hire a separate person whose full time job will be to make Nate happy.
Anything else, I am afraid, will mean paying the costs and most likely being in denial about them.
I see at least two other options (which, ideally, should be used in tandem):
don’t hire people who are so terribly sensitive to above-average bluntness
hire managers who will take care of ops/personnel problems more effectively, thus reducing the necessity for researchers to navigate interpersonal situations that arise from such problems
don’t hire people who are so terribly sensitive to above-average bluntness
If I translate it mentally to “don’t hire people from the bottom 99% of thick skin”, I actually agree. Though they may be difficult to find, especially in combination with other requirements.
Do you really think it’d take 99th percentile skin-thickness to deal with this sort of thing without having some sort of emotional breakdown? This seems to me to be an extraordinary claim.
Are you available for the job? ;-)
While I probably qualify in this regard, I don’t think that I have any other relevant qualifications.
My experience is that people who I think of as having at least 90th percentile (and probably 99th if I think about it harder) thick-skin have been brought to tears from an intense conversation with Nate.
My guess is that this wouldn’t happen for a lot of possible employees from the broader economy, and this isn’t because they’ve got thicker skin, but it’s because they’re not very emotionally invested in the organization’s work, and generally don’t bring themselves to their work enough to risk this level of emotion/hurt.
My experience is that people who I think of as having at least 90th percentile (and probably 99th if I think about it harder) thick-skin have been brought to tears from an intense conversation with Nate.
This is a truly extraordinary claim! I don’t know what evidence I’d need to see in order to believe it, but whatever that evidence is, I sure haven’t seen it yet.
My guess is that this wouldn’t happen for a lot of possible employees from the broader economy, and this isn’t because they’ve got thicker skin, but it’s because they’re not very emotionally invested in the organization’s work, and generally don’t bring themselves to their work enough to risk this level of emotion/hurt.
This just can’t be right. I’ve met a decent number of people who are very invested in their work and the mission of whatever organization they’re part of, and I can’t imagine them being brought to tears by “an intense conversation” with one of their co-workers (nor have I heard of such a thing happening to the people I have in mind).
Something else is going on here, it seems to me; and the most obvious candidate for what that “something else” might be is simply that your view of what the distribution of “thick-skinned-ness” is like, is very mis-calibrated.
(Don’t know why some folks have downvoted the above comment, seems like a totally normal epistemic state for Person A not to believe what Person B believes about something after simply learning that Person B believes it, and to think Person B is likely miscalibrated. I have strong upvoted the comment back to clearly positive.)
In academia, for instance, I think there are plenty of conversations in which two researchers (a) disagree a ton, (b) think the other person’s work is hopeless or confused in deep ways, (c) honestly express the nature of their disagreement, but (d) do so in a way where people generally feel respected/valued when talking to them.
My model says that this requires them to still be hopeful about local communication progress, and happens when they disagree but already share a lot of frames and concepts and background knowledge. I, at least, find it much harder when I don’t expect the communication attempt to make progress, or to have a positive effect.
(“Then why have the conversation at all?” I mostly don’t! But sometimes I mispredict how much hope I’ll have, or try out some new idea that doesn’t work, or get badgered into it.)
Some specific norms that I think Nate might not be adhering to:
Engaging with people in ways such that they often feel heard/seen/understood
Engaging with people in ways such that they rarely feel dismissed/disrespected
Something fuzzy that lots of people would call “kindness” or “typical levels of warmth”
These sound more to me like personality traits (that members of the local culture generally consider virtuous) than communication norms.
On my model, communication norms are much lower-level than this. Basics of rationalist discourse seem closer; archaic politeness norms (“always refuse food thrice before accepting”) are an example of even lower-level stuff.
My model, speaking roughly and summarizing a bunch, says that the lowest-level stuff (atop a background of liberal-ish internet culture and basic rationalist discourse) isn’t pinned down on account of cultural diversity, so we substitute with meta-norms, which (as best I understand them) include things like “if your convo-partner requests a particular conversation-style, either try it out or voice objections or suggest alternatives” and “if things aren’t working, retreat to a protected meta discussion and build a shared understanding of the issue and cooperatively address it”.
I acknowledge that this can be pretty difficult to do on the fly, especially if emotions are riding high. (And I think we have cultural diversity around whether emotions are ever supposed to ride high, and if so, under what circumstances.) On my model of local norms, this sort of thing gets filed under “yep, communicating in the modern world can be rocky; if something goes wrong then you go meta and try to figure out the causes and do something differently next time”. (Which often doesn’t work! In which case you iterate, while also shifting your conversational attention elsewhere.)
To be clear, I buy a claim of the form “gosh, you (Nate) seem to run on a relatively rarer native emotional protocol, for this neck of the woods”. My model is that local norms are sufficiently flexible to continue “and we resolve that by experimentation and occasional meta”.
And for the record, I’m pretty happy to litigate specific interactions. When it comes to low-level norms, I think there are a bunch of conversational moves that others think are benign that I see as jabs (and which I often endorse jabbing back against, depending on the ongoing conversation style), and a bunch of conversational moves that I see as benign that others take as jabs, and I’m both (a) happy to explicate the things that felt to me like jabs; (b) happy to learn what other people took as jabs; and (c) happy to try alternative communication styles where we’re jabbing each other less. Where this openness-to-meta-and-trying-alternative-things seems like the key local meta-norm, at least in my understanding of local culture.
My model is that local norms are sufficiently flexible to continue “and we resolve that by experimentation and occasional meta”.
It seems to me that in theory it should be possible to have very unusual norms and make it work, but that in practice you and your organization horribly underestimate how difficult it is to communicate such things clearly (more than once, because people forget or don’t realize the full implications the first time). You assume that the local norms were made perfectly clear, but they were not (expecting short inferential distances, double illusion of transparency, etc.).
Did you expect KurtB to have this kind of reaction, to post this kind of comment, and to get upvoted? If the answer is no, it means your model is wrong somewhere.
(If the answer is yes, maybe you should print that comment, and give a copy to all new employees. That might dramatically reduce a possibility of misunderstanding.)
These sound more to me like personality traits (that members of the local culture generally consider virtuous) than communication norms.
My original comment is not talking about communication norms. It’s talking about “social norms” and “communication protocols” within those norms. I mentioned “basic respectfulness and professionalism.”
But I think some people possess the skill of “being able to communicate harsh truths accurately in ways where people still find the interaction kind, graceful, respectful, and constructive.” And my understanding is that’s what people like TurnTrout are wishing for.
This is a thing, but I’m guessing that what you have in mind involves, more than you’re crediting, not actually trying for the crux of the conversation. As just one example, you can be “more respectful” by making fewer “sweeping claims” such as “you are making such and such error in reasoning throughout this discussion / topic / whatever”. But that’s a pretty important thing to be able to say, if you’re trying to get to real cruxes and address despair and so on.
But I think some people possess the skill of “being able to communicate harsh truths accurately in ways where people still find the interaction kind, graceful, respectful, and constructive.” And my understanding is that’s what people like TurnTrout are wishing for.
Kinda. I’m advocating less for the skill of “be graceful and respectful and constructive” and instead looking at the lower bar of “don’t be overtly rude and aggressive without consent; employ (something within 2 standard deviations of) standard professional courtesy; else social consequences.” I want to be clear that I’m not wishing for some kind of subtle mastery, here.
I’m putting in rather a lot of work (with things like my communication handbook) to making my own norms clearer, and I follow what I think are good meta-norms of being very open to trying other people’s alternative conversational formats.
Nate, I am skeptical.
As best I can fathom, you put in very little work to proactively warn new hires about the emotional damage which your employees often experience. I’ve talked to a range of people who have had professional interactions with you, both recently and further back. Only one of the recent cases reported that you warned them before they started working with you.
In particular, talking to the hires themselves, I have detected no evidence that you have proactively warned most of the hires[1] you’ve started working with since July 2022, which is when:
I told you that your anger and ranting imposed unexpected and large costs on me,
And you responded with something like “Sorry, I’ll make sure to tell people I have research conversations with—instead of just my formal collaborators. Obvious in hindsight.”
And yet you apparently repeatedly did not warn most of your onboarded collaborators.
EDIT: The original version of this comment claimed “None were warned.” This was accurate reporting at the time of my comment. However, I now believe that Nate did in fact proactively warn Vivek, at least a bit and to some extent. I am overall still worried, as I know of several specific cases which lacked sufficient warning, and some of them paid surprisingly high costs because of it.
Vivek did not recall any warning but thought it possible that you had verbally mentioned some cons, which he forgot about. However, now that Nate jogged his memory a bit, he thinks he probably did receive a warning.
I personally wouldn’t count a forgettable set of remarks as “sufficient warning”, given the track record here. It sure seems to me like it should have been memorable.
On the facts: I’m pretty sure I took Vivek aside and gave a big list of reasons why I thought working with me might suck, and listed that there are cases where I get real frustrated as one of them. (Not sure whether you count him as “recent”.)
My recollection is that he probed a little and was like “I’m not too worried about that” and didn’t probe further. My recollection is also that he was correct in this; the issues I had working with Vivek’s team were not based in the same failure mode I had with you; I don’t recall instances of me getting frustrated and bulldozey (though I suppose I could have forgotten them).
(Perhaps that’s an important point? I could imagine being significantly more worried about my behavior here if you thought that most of my convos with Vivek’s team were like most of my convos with you. I think if an onlooker was describing my convo with you they’d be like “Nate was visibly flustered, visibly frustrated, had a raised voice, and was being mean in various of his replies.” I think if an onlooker was describing my convos with Vivek’s team they’d be like “he seemed sad and pained, was talking quietly and as if choosing the right words was a struggle, and would often talk about seemingly-unrelated subjects or talk in annoying parables, while giving off a sense that he didn’t really expect any of this to work”. I think that both can suck! And both are related by a common root of “Nate conversed while having strong emotions”. But, on the object level, I think I was in fact avoiding the errors I made in conversation with you, in conversation with them.)
As to the issue of not passing on my “working with Nate can suck” notes, I think there are a handful of things going on here, including the context here and, more relevantly, the fact that sharing notes just didn’t seem to do all that much in practice.
I could say more about that; the short version is that I think “have the conversation while they’re standing, and I’m lying on the floor and wearing a funny hat” seems to work empirically better, and...
hmm, I think part of the issue here is that I was thinking like “sharing warnings and notes is a hypothesis, to test among other hypotheses like lying on the floor and wearing a funny hat; I’ll try various hypotheses out and keep doing what seems to work”, whereas (I suspect) you’re more like “regardless of what makes the conversations go visibly better, you are obligated to issue warnings, as is an important part of emotionally-bracing your conversation partners; this is socially important even if it doesn’t seem to change the conversation outcomes”.
I think I’d be more compelled by this argument if I was having ongoing issues with bulldozing (in the sense of the convo we had), as opposed to my current issue where some people report distress when I talk with them while having emotions like despair/hopelessness.
I think I’d also be more compelled by this argument if I was more sold on warnings being the sort of thing that works in practice.
Like… (to take a recent example) if I’m walking by a whiteboard in rosegarden inn, and two people are like “hey Nate can you weigh in on this object-level question”, I don’t… really believe that saying “first, be warned that talking technical things with me can leave you exposed to unshielded negative-valence emotions (frustration, despair, …), which some people find pretty crappy; do you still want me to weigh in?” actually does much. I am skeptical that people say “nope” to that in practice.
I suppose that perhaps what it does is make people feel better if, in fact, it happens? And maybe I’ll try it a bit and see? But I don’t want to sound like I’m promising to do such a thing reliably even as it starts to feel useless to me, as opposed to experimenting and gravitating towards things that seem to work better like “offer to lie on the floor while wearing a funny hat if I notice things getting heated”.
I’ve been asked to clarify a point of fact, so I’ll do so here:
My recollection is that he probed a little and was like “I’m not too worried about that” and didn’t probe further.
This does ring a bell, and my brain is weakly telling me it did happen on a walk with Nate, but it’s so fuzzy that I can’t tell if it’s a real memory or not. A confounder here is that I’ve probably also had the conversational route “MIRI burnout is a thing, yikes” → “I’m not too worried, I’m a robust and upbeat person” multiple times with people other than Nate.
In private correspondence, Nate seems to remember some actual details, and I trust that he is accurately reporting his beliefs. So I’d mostly defer to him on questions of fact here.
I’m pretty sure I’m the person mentioned in TurnTrout’s footnote. I confirm that, at the time he asked me, I had no recollection of being “warned” by Nate but thought it very plausible that I’d forgotten.
This is a slight positive update for me. I maintain my overall worry and critique: chats which are forgettable do not constitute sufficient warning.
Insofar as non-Nate MIRI personnel thoroughly warned Vivek, that is another slight positive update, since this warning should reliably be encountered by potential hires. If Vivek was independently warned via random social connections not possessed by everyone,[1] then that’s a slight negative update.
I think I’d also be more compelled by this argument if I was more sold on warnings being the sort of thing that works in practice.
Like… (to take a recent example) if I’m walking by a whiteboard in rosegarden inn, and two people are like “hey Nate can you weigh in on this object-level question”, I don’t… really believe that saying “first, be warned that talking technical things with me can leave you exposed to unshielded negative-valence emotions (frustration, despair, …), which some people find pretty crappy; do you still want me to weigh in?” actually does much. I am skeptical that people say “nope” to that in practice.
I think there are several critical issues with your behavior, but I think the most urgent is that people often don’t know what they’re getting into. People have a right to make informed decisions and to not have large, unexpected costs shunted onto them.
It’s true that no one has to talk with you. But it’s often not true that people know what they’re getting into. I spoke out publicly because I encountered a pattern, among my friends and colleagues, of people taking large and unexpected emotional damage from interacting with you.
If our July interaction had been an isolated incident, I still would have been quite upset with you, but I would not have been outraged.
If the pattern I encountered were more like “a bunch of people report high costs imposed by Nate, but basically in the ways they expected”, I’d be somewhat less outraged.[1] If people can accurately predict the costs and make informed decisions, then people who don’t mind (like Vivek or Jeremy) can reap the benefits of interacting with you, and the people who would be particularly hurt can avoid you.
If your warnings are not preventing this pattern of unexpected hurt, then you need to do better. You need to inform people to the point that they know what distribution they’re sampling from. If people know, I’m confident that they will start saying “no.” I probably would have said “no thanks” (or at least ducked out sooner and taken less damage), and Kurt would have said “no” as well.
If you don’t inform people to a sufficient extent, the community should (and, I think, will) hold you accountable for the unexpected costs you impose on others.
I would still be disturbed and uneasy for the reasons Jacob Steinhardt mentioned, including “In the face of real consequences, I think that Nate would better regulate his emotions and impose far fewer costs on people he interacts with.”
(I don’t know who strong disagree-voted the parent comment, but I’m interested in hearing what the disagreement is. I currently think the comment is straightforwardly correct and important.)
The 9-karma disagree-vote is mine. (Surprise!) I thought about writing a comment, and then thought, “Nah, I don’t feel like getting involved with this one; I’ll just leave a quick disagree-vote”, but if you’re actively soliciting, I’ll write the comment.
I’m wary of the consequences of trying to institute social norms to protect people from subjective emotional damage, because I think “the cure is worse than the disease.” I’d rather develop a thick skin and take responsibility for my own emotions (even though it hurts when some people are mean), because I fear that the alternative is (speaking uncharitably) a dystopia of psychological warfare masquerading as kindness in which people compete to shut down the expression of perspectives they don’t like by motivatedly getting (subjectively sincerely) offended.
Technically, I don’t disagree with “people should know what they’re getting into” being a desirable goal (all other things being equal), but I think it should be applied symmetrically, and it makes sense for me to strong-disagree-vote a comment that I don’t think is applying it symmetrically: it’s not fair if “fighty” people need to make lengthy disclaimers about how their bluntness might hurt someone’s feelings (which is true), but “cooperative” people don’t need to make lengthy disclaimers about how their tone-policing might silence someone’s perspective (which is also true).
I don’t know Nate very well. There was an incident on Twitter and Less Wrong the other year where I got offended at how glib and smug he was being, despite how wrong he was about the philosophy of dolphins. But in retrospect, I think I was wrong to get offended. (I got downvoted to oblivion, and I deserved it.) I wish I had kept my cool—not because I personally approve of the communication style Nate was using, but because I think it was bad for my soul and the world to let myself get distracted by mere style when I could have shrugged it off and stayed focused on the substance.
It sounds to me like you parsed my statement “One obvious takeaway here is that I should give my list of warnings-about-working-with-me to anyone who asks to discuss their alignment ideas with me, rather than just researchers I’m starting a collaboration with.” as me saying something like “I hereby adopt the solemn responsibility of warning people in advance, in all cases”, whereas I was interpreting it as more like “here’s a next thing to try!”.
I agree it would have been better of me to give direct bulldozing-warnings explicitly to Vivek’s hires.
(One obvious takeaway here is that I should give my list of warnings-about-working-with-me to anyone who asks to discuss their alignment ideas with me, rather than just researchers I’m starting a collaboration with. Obvious in hindsight; sorry for not doing that in your case.)
I agree that this statement does not explicitly say whether you would make this a one-time change or a permanent one. However, the tone and phrasing—”Obvious in hindsight; sorry for not doing that in your case”—suggested that you had learned from the experience and are likely to apply this lesson going forward. The use of the word “obvious”—twice—indicates to me that you believed that warnings are a clear improvement.
Ultimately, Nate, you wrote it. But I read it, and I don’t really see the “one-time experiment” interpretation. It just doesn’t make sense to me that it was “obvious in hindsight” that you should… adopt this “next thing to try”..?
In the above, I did not intend “here’s a next thing to try!” to be read like “here’s my next one-time experiment!”, but rather like “here’s a thing to add to my list of plausible ways to avoid this error-mode in the future, as is a virtuous thing to attempt!” (by contrast with “I hereby adopt this as a solemn responsibility”, as I hypothesize you interpreted me instead).
Dumping recollections, on the model that you want more data here:
I intended it as a general thing to try going forward, in a “seems like a sensible thing to do” sort of way (rather than in a “adopting an obligation to ensure it definitely gets done” sort of way).
After sending the email, I visualized people reaching out to me and asking if I wanted to chat about alignment (as you had, and as feels like a recognizable Event in my mind), and visualized being like “sure but FYI if we’re gonna do the alignment chat then maybe read these notes first”, and ran through that in my head a few times, as is my method for adopting such triggers.
I then also wrote down a task to expand my old “flaws list” (which was a collection of handles that I used as a memory-aid for having the “ways this could suck” chat, which I had, to that point, been having only verbally) into a written document, which eventually became the communication handbook (there were other contributing factors to that process also).
An older and different trigger (of “you’re hiring someone to work with directly on alignment”) proceeded to fire when I hired Vivek (if memory serves), and (if memory serves) I went verbally through my flaws list.
Neither the new nor the old triggers fired in the case of Vivek hiring employees, as discussed elsewhere.
Thomas Kwa heard from a friend that I was drafting a handbook (chat logs say this occurred on Nov 30); it was still in a form I wasn’t terribly pleased with and so I said the friend could share a redacted version that contained the parts that I was happier with and that felt more relevant.
Around Jan 8, in an unrelated situation, I found myself in a series of conversations where I sent around the handbook and made use of it. I pushed it closer to completion in Jan 8-10 (according to Google doc’s history).
The results of that series of interactions, and of Vivek’s team’s (lack of) use of the handbook caused me to update away from this method being all that helpful. In particular: nobody at any point invoked one of the affordances or asked for one of the alternative conversation modes (though those sorts of things did seem to help when I personally managed to notice building frustration and personally suggest that we switch modes (although lying on the ground—a friend’s suggestion—turned out to work better for others than switching to other conversation modes)). This caused me to downgrade (in my head) the importance of ensuring that people had access to those resources.
I think that at some point around then I shared the fuller guide with Vivek’s team, but I didn’t quickly determine when from the chat logs. Sometime between Nov 30 and Feb 22, presumably.
It looks from my chat logs like I then finished the draft around Feb 22 (where I have a timestamp from me noting as much to a friend). I probably put it publicly on my website sometime around then (though I couldn’t easily find a timestamp), and shared it with Vivek’s team (if I hadn’t already).
The next two MIRI hires both mentioned to me that they’d read my communication handbook (and I did not anticipate spending a bunch of time with them, nevermind on technical research), so they both didn’t trigger my “warn them” events and (for better or worse) I had them mentally filed away as “has seen the affordances list and the failure modes section”.
I appreciate the detail, thanks. In particular, I had wrongly assumed that the handbook had been written much earlier, such that even Vivek could have been shown it before deciding to work with you. This also makes more sense of your comments that “writing the handbook” was indicative of effort on your part, since our July interaction.
Overall, I retain my very serious concerns, which I will clarify in another comment, but am more in agreement with claims like “Nate has put in effort of some kind since the July chat.”
The next two MIRI hires both mentioned to me that they’d read my communication handbook
Noting that at least one of them read the handbook because I warned them and told them to go ask around about interacting with you, to make sure they knew what they were getting into.
Do I have your permission to quote the relevant portion of your email to me?
Yep! I’ve also just reproduced it here, for convenience:
(One obvious takeaway here is that I should give my list of warnings-about-working-with-me to anyone who asks to discuss their alignment ideas with me, rather than just researchers I’m starting a collaboration with. Obvious in hindsight; sorry for not doing that in your case.)
I hereby push back against the (implicit) narrative that I find the standard community norms costly, or that my communication protocols are “alternative”.
My model is closer to: the world is a big place these days, different people run on different conversation norms. The conversation difficulties look, to me, symmetric, with each party violating norms that the other considers basic, and failing to demonstrate virtues that the other considers table-stakes.
(To be clear, I consider myself to bear an asymmetric burden of responsibility for the conversatiosn going well, according to my seniority, which is why I issue apologies instead of critiques when things go off the rails.)
Separately but relatedly: I think the failure-mode I had with Vivek & co was rather different than the failure-mode I had with you. In short: in your case, I think the issue was rooted in a conversational dynamic that caused me frustration, whereas in Vivek & co’s case, I think the issue was rooted in a conversational dynamic that caused me despair.
Which is not to say that the issues are wholly independent; my guess is that the common-cause is something like “some people take a lot of damage from having conversations with someone who despairs of the conversation”.
Tying this back: my current model of the situation is not that I’m violating community norms about how to have a conversation while visibly hopeless, but am rather in uncharted territory by trying to have those conversations at all.
(For instance: standard academia norms as I understand them are to lie to yourself and/or others about how much hope you have in something, and/or swallow enough of the modesty-pill that you start seeing hope in places I would not, so as to sidestep the issue altogether. Which I’m not personally up for.)
([tone: joking but with a fragment of truth] …I guess that the other norm in academia when academics are hopless about others’ research is “have feuds”, which… well we seem to be doing a fine job by comparison to the standard norms, here!)
Where, to be clear, I already mostly avoid conversations where I’m hopeless! I’m mostly a hermit! The obvious fix of “speak to fewer people” is already being applied!
And beyond that, I’m putting in rather a lot of work (with things like my communication handbook) to making my own norms clearer, and I follow what I think are good meta-norms of being very open to trying other people’s alternative conversational formats.
I’m happy to debate what the local norms should be, and to acknowledge my own conversational mistakes (of which I have made plenty), but I sure don’t buy a narrative that I’m in violation of the local norms.
(But perhaps I will if everyone in the comments shouts me down! Local norms are precisely the sort of thing that I can learn about by everyone shouting me down about this!)
This is preposterous.
I’m not going to discuss specific norms. Discussing norms with Nate leads to an explosion of conversational complexity.[1] In my opinion, such discussion can sound really nice and reasonable, until you remember that you just wanted him to e.g. not insult your reasoning skills and instead engage with your object-level claims… but somehow your simple request turns into a complicated and painful negotiation. You never thought you’d have to explain “being nice.”
Then—in my experience—you give up trying to negotiate anything from him and just accept that he gets to follow whatever “norms” he wants.
So, in order to evaluate whether Nate is “following social norms”, let’s not think about the norms themselves. I’m instead going to share some more of the interactions Nate has had:
“Flipping out” at Kurt Brown because there wasn’t enough sourdough bread.
Storming out of the room because Kurt had a question about pumping Nate’s tires.
A chat with me, where Nate describes himself as “visibly flustered, visibly frustrated, had a raised voice, and was being mean in various of his replies.”
An employee reporting fear and dread at the prospect of meeting with Nate,
People crying while talking with Nate about research,
ETA: Peter Barnett writes “I think it’s kind of hard to see how bad/annoying/sad this is until you’re in it.”
(There are people who want to speak out, but have not yet. I am not at liberty to detail their cases myself.)
EDIT: Nate did not warn most of these people beforehand. It seems to me that Nate is still balking at requests to warn people.
See the second half of Nate’s response to Akash, starting with “These sound more to me like”.
Regarding my own experience: I would destroy 10% of my liquidity to erase from my life that conversation and its emotional effects.
I don’t think it’s reasonable to expect Nate to have predicted that I in particular would be hurt so much. But in fact, being unexpectedly and nonconsensually mean and aggressive and hurtful has heavy negative tails.
And so statistically, this level of harm is totally predictable, both on priors and based off of past experience which I know Nate has.
This seems like a fairly extreme statement, so I was about to upvote due to the courage required to post it publicly and stand behind it. But then I stopped and thought about the long-term effects, and it’s probably best not to encourage this.
Ideally, you, along with the vast majority of potential readers, should become less emotionally reactive over time to any real or perceived insults, slights, etc.
If it’s the heat of the moment talking, that’s fine, but letting thoughts of payback, revenge, etc., linger on for days afterwards likely will not lead to any positive outcome.
I have had these thoughts many times. I would berate myself for letting it get on my nerves so much. It was just an hour-and-a-half chat. But I don’t think it’s a matter of “letting” thoughts occur, or not. Certain situations are damaging to certain people, and this situation isn’t a matter of whether people are encouraged to be damaged or not (I certainly had no expectation of writing about this, back in July–October 2022.)
EDIT: Moving another part elsewhere.
I upvoted because it’s clear you put quite a bit of effort into writing this comment, but I skipped expressing agreement or disagreement.
I had thought of several possible responses, and it is worthy of a substantial response, but since it’s not my role to be the adjudicator or corrector of LW users, I’ll pose you this question:
Consider, is it possible for him to take offence in return and then retaliate via some mean(s)? If that does occur, what’s the range of likely outcomes?
(Note that I moved most of the original comment and plan to put it elsewhere in the thread.)
I don’t follow. I’m not going to behave differently in the face of any possible retaliation, nor do I in fact expect Nate to retaliate in an inappropriate manner. So I’m not worried about this?
It seems weird to single out this specific type of human limitation (compared to perfect-robot instrumental rationality) over the hundreds of others. If someone isn’t in top physical shape or cannot drive cars under difficult circumstances or didn’t renew their glasses and therefore doesn’t see optimally, would you also be reluctant to upvote comments you were otherwise tempted to upvote (where they bravely disclose some limitation) because of this worry about poor incentives?
“Ideally,” in a world where there’s infinite time so there are no tradeoffs for spending self-improvement energy, rationalists would all be in shape, have brushed up their driving skills, have their glasses updated, etc. In reality, it’s perfectly fine/rational to deprioritize many things that are “good to have” because other issues are more pressing, more immediately deserving of self-improvement energy. (Not to mention that rationality for its own sake is lame anyway and so many of us actually want to do object-level work towards a better future.) What to best focus on with self-improvement energy will differ a lot from person to person, not only because people have different strengths and weaknesses, but also because they operate in different environments. (E.g., in some environments, one has to deal with rude people all the time, whereas in others, this may be a rare occurrence.) For all these reasons, it seems weirdly patronizing to try to shape other people’s prioritization for investing self-improvement energy.
This isn’t to say that this site/community shouldn’t have norms and corresponding virtues and vices. Since LW is about truth-seeking, it makes sense to promote virtues directly related to truth-seeking, e.g., by downvoting comments that exhibit poor epistemic practices. However, my point is that even though it might be tempting to discourage not just poor epistemic rationality but also poor instrumental rationality, these two work very differently, especially as far as optimal incentive-setting is concerned. Epistemic rationality is an ideal we can more easily enforce and get closer towards. Instrumental rationality, by contrast, is a giant jungle that people are coming into from all kinds of different directions. “Having unusually distracting emotional reactions to situations xyz” is one example of suboptimal instrumental rationality, but so is “being in poor physical shape,” or “not being able to drive a car,” or “not having your glasses updated,” etc.
I don’t think it makes sense for the community to create a hierarchy of “most important facets of instrumental rationality” that’s supposed to apply equally to all kinds of people. (Instead, I think it makes more sense to reward meta-skills of instrumental rationality, such as “try to figure out what your biggest problems are and really prioritize working on them.”) (If we want to pass direct judgment on someone’s prioritization of self-improvement energy, we need to know their exact situation and goals and the limitations they have, how good they are at learning various things, etc.) Not to mention the unwelcoming effects when people get judged for limitations of instrumental rationality that the community for some reason perceives to be particularly bad. Such things are always more personal (and therefore more unfair) than judging someone for having made a clear error of reasoning (epistemic rationality).
(I say all of this as though it’s indeed “very uncommon” to feel strongly hurt and lastingly affected by particularly harsh criticism. I don’t even necessarily think that this is the case: If the criticism comes from a person with high standing in a community one cares about, it seems like a potentially quite common reaction?)
This is relevant context for my strong reaction. I used to admire Nate, and so I was particularly upset when he treated me disrespectfully. (The experience wasn’t so much “criticism” as “aggression and meanness”, though.)
FWIW, I also reject the framing that this situation is reasonably understood as an issue with my own instrumental rationality.
Going back to the broader point about incentives, it’s not very rewarding to publicly share a distressing experience and thereby allow thousands of internet strangers to judge my fortitude, and complain if they think it lacking. I’m not walking away from this experience feeling lavished and reinforced for having experienced an emotional reaction.
Furthermore, the reason I spoke up was mostly not to litigate my own experience. It’s because I’ve spent months witnessing my friends take unexpected damage from a powerful individual who appears to have faced basically no consequences for his behavior.
This is a minor error but I feel the need to correct it for future readers, as it’s in the first sentence. There are infinitely many ‘specific types’ of human limitations, or at least an uncountable quantity, depending on the reader’s preferred epistemology.
The rest of your thesis is interesting though a bit difficult to parse. Could you isolate a few of the key points and present them in a list?
I wasn’t the one who downvoted your reply (seems fair to ask for clarifications), but I don’t want to spend much more time on this and writing summaries isn’t my strength. Here’s a crude attempt at saying the same thing in fewer and different words:
IMO, there’s nothing particularly “antithetical to LW aims/LW culture” (edit: “antithetical to LW aims/LW culture” is not a direct quote by anyone, but my summary interpretation of why you might be concerned about bad incentives in this case) about neuroticism-related “shortcomings”, i.e. “shortcomings” compared to a robotic ideal of perfect instrumental rationality. By neuroticism-related “shortcomings”, I mean things like having triggers or being unusually affected by harsh criticism. It’s therefore weird and a bit unfair to single out such neuroticism-related “shortcomings” over things like “being in bad shape” or “not being good at common life skills like driving a car.” (I’m guessing that you wouldn’t be similarly concerned about setting bad incentives if someone admitted that they were bad at driving cars or weren’t in the best shape.)
I’m only guessing here, but I wonder about rationalist signalling cascades about the virtues of rationality, where it gets rewarded to be particularly critical about things that least correspond to the image of what an ideally rational robot would be like. However, in reality, applied rationality isn’t about getting close to some ideal image. Instead, it’s about making the best out of what you have, taking the best next move step-by-step for your specific situation, always prioritizing what actually gets you to your goals rather than prioritizing “how do I look as though I’m very rational.”
Not to mention that high emotionality confers advantages in many situations and isn’t just an all-out negative. (See also TurnTrout’s comment about rejecting the framing that this is an issue of his instrumental rationality being at fault.)
I don’t mind the occasional downvote or negative karma; it even has some benefits, such as serving a useful signalling function: it’s decent evidence that I haven’t tailored my comments for popularity or platitudes.
In regards to your points, I’ll only try to respond to them one at a time, since this is already pretty far down the comment chain.
Who suggested that there was a relation between being “antithetical to LW aims/LW culture” and neuroticism-related “shortcomings”?
i.e. Is it supposed to be my idea, TurnTrout’s, yours, a general sentiment, something from the collective unconscious, etc.?
I made an edit to my above comment to address your question; it’s probably confusing that I used quotation marks for something that wasn’t a direct quote by anyone.
I appreciate the edit, though can you clarify why you put so many quotes in when they are your own thoughts?
Is it just an idiosyncratic writing style or is it also meant to convey some emotion, context, direction, etc.?
But to clarify, this is not the reason why I ‘might be concerned about bad incentives in this case’, if you were wondering.
Not Lukas, but I also sometimes use quotes:
as a kind of semantic brackets; I think the official way to do this is to write-the-words-connected-by-hyphens, but that just seems hard to read;
to remove a possible connotation, i.e. to signal that I am using the word not exactly as most people would probably use it in a similar situation;
or as a combination of both, something like: I am using these words to express an idea, but these are probably not the right words, but I can’t find any better, so please do not take this part literally and don’t start nitpicking (don’t assume that I used a specific word because I wanted to hint at something specific).
For example, as I understand it, the quoted phrase about neuroticism-related “shortcomings” means: things that technically are shortcomings (because they deviate from some ideal), but also a reasonable person wouldn’t call them so (because it is normal human behavior, and I would actually be very suspicious about anyone who claimed to not have any of them), so the word is kinda correct but also kinda incorrect. But it is a way to express what I mean.
Sounds like I misinterpreted the motivation behind your original comment!
I ran out of energy to continue this thread/conversation, but feel free to clarify what you meant for others (if you think it isn’t already clear enough for most readers).
(For completeness, I want to note that I’ve talked with a range of former/current MIRI employees, and a non-trivial fraction did have basically fine interactions with Nate.)
More detail here seems like it could be good. What form did the insult take? Other relevant context?
According to my notes and emails, Nate repeatedly said things like “I have not yet ruled out [uncharitable hypothesis about how TurnTrout is reasoning]” in order to—according to him—accomplish his conversational objectives / because his “polite” statements were apparently not getting his point across. I don’t remember what specific uncharitable things he said (the chat was on 7/19/22).
I might come back and add more context later.
Thank you.
Huh, I initially found myself surprised that Nate thinks he’s adhering to community norms. I wonder if part of what’s going on here is that “community norms” is a pretty vague phrase that people can interpret differently.
Epistemic status: Speculative. I haven’t had many interactions with Nate, so I’m mostly going off of what I’ve heard from others + general vibes.
Some specific norms that I imagine Nate is adhering to (or exceeding expectations in):
Honesty
Meta-honesty
Trying to offer concrete models and predictions
Being (internally) open to acknowledging and recognizing mistakes, saying oops, etc.
Some specific norms that I think Nate might not be adhering to:
Engaging with people in ways such that they often feel heard/seen/understood
Engaging with people in ways such that they rarely feel dismissed/disrespected
Something fuzzy that lots of people would call “kindness” or “typical levels of warmth”
I’m guessing that some people think that social norms dictate something like “you are supposed to be kind and civil and avoid making people unnecessarily sad/insecure/defensive.” I wonder if Nate (a) believes that these are community norms and thinks he’s following them or (b) just doesn’t think these are community norms in the first place.
I think this explains some of the effect, but not all of it. In academia, for instance, I think there are plenty of conversations in which two researchers (a) disagree a ton, (b) think the other person’s work is hopeless or confused in deep ways, (c) honestly express the nature of their disagreement, but (d) do so in a way where people generally feel respected/valued when talking to them.
Like, it’s certainly easier to make people feel heard/seen if you agree with them a bunch and say their ideas are awesome, but of course that would be dishonest [for Nate].
So I could see a world where Nate is like “Darn, the reason people seem to find communicating with me to be difficult is that I’m just presenting them with the harsh truths, and it is indeed hard for people to hear harsh truths.”
But I think some people possess the skill of “being able to communicate harsh truths accurately in ways where people still find the interaction kind, graceful, respectful, and constructive.” And my understanding is that’s what people like TurnTrout are wishing for.
This is not a reasonable norm. In some circumstances (including, it sounds like, some of the conversations under discussion) meeting this standard would require a large amount of additional effort, not related to the ostensible reason for talking in the first place.
Again, a pretty unreasonable norm. For some topics, such as “is what you’re doing actually making progress towards that thing you’ve arranged your life (including social context) around making progress on?”, it’s very easy for people to feel this way, even if they are being told true, useful, relevant things.
Ditto, though significantly less strongly; I do think there’s ways to do this that stay honest and on-mission without too much tradeoff.
I think it’s not a reasonable norm to make sure your interlocutors never e.g. feel dismissed/disrespected, but it is reasonable to take some measures to avoid having someone consistently feel dismissed/disrespected if you spend over 200 hours talking with their team and loosely mentoring them (which to be clear Nate did, it’s just difficult in his position and so was only mildly successful).
I’m not sure kindness/warmth should even be a norm because it’s pretty difficult to define.
The details matter here; I don’t feel I can guess from what you’ve said whether we’d agree or not.
For example:
Tam: says some idea about alignment
Newt: says some particular flaw “...and this is an instance of a general problem, which you’ll have to address if you want to make progress...” gestures a bit at the general problem
Tam: makes a tweak to the proposal that locally addresses the particular flaw
Newt: “This still doesn’t address the problem.”
Tam: “But it seems to solve the concrete problem, at least as you stated it. It’s not obvious to me that there’s a general problem here; if we can solve instances of it case-by-case, that seems like a lot of progress.”
Newt: “Look, we could play this game for some more rounds, where you add more gears and boxes to make it harder to see that there’s a problem that isn’t being addressed at all, and maybe after a few rounds you’ll get the point. But can we just skip ahead to you generalizing to the class of problem, or at least trying to do that on your own?”
Tam: feels dismissed/disrespected
I think Newt could have been more graceful and more helpful, e.g. explicitly stating that he’s had a history of conversations like this, and setting boundaries about how much effort he feels excited about putting in, and using body language that is non-conflictual… But even if he doesn’t do that, I don’t really think he’s violating a norm here. And depending on context this sort of behavior might be about as well as Newt can do for now.
You can choose to ignore all these “unreasonable norms”, but they still have consequences. Such as people thinking you are an asshole. Or leaving the organization because of you. It is easy to underestimate these costs, because most of the time people won’t tell you (or they will, but you will ignore them and quickly forget).
This is a cost that people working with Nate should not ignore, even if Nate does.
I see three options:
try making Nate change—this may not be possible, but I think it’s worth trying;
isolate Nate from… well, everyone else, except for volunteers who were explicitly warned;
hire a separate person whose full time job will be to make Nate happy.
Anything else, I am afraid, will mean paying the costs and most likely being in denial about them.
I see at least two other options (which, ideally, should be used in tandem):
don’t hire people who are so terribly sensitive to above-average bluntness
hire managers who will take care of ops/personnel problems more effectively, thus reducing the necessity for researchers to navigate interpersonal situations that arise from such problems
If I translate it mentally to “don’t hire people from the bottom 99% of thick skin”, I actually agree. Though they may be difficult to find, especially in combination with other requirements.
Are you available for the job? ;-)
Do you really think it’d take 99th percentile skin-thickness to deal with this sort of thing without having some sort of emotional breakdown? This seems to me to be an extraordinary claim.
While I probably qualify in this regard, I don’t think that I have any other relevant qualifications.
My experience is that people who I think of as having at least 90th percentile (and probably 99th if I think about it harder) thick-skin have been brought to tears from an intense conversation with Nate.
My guess is that this wouldn’t happen for a lot of possible employees from the broader economy, and this isn’t because they’ve got thicker skin, but it’s because they’re not very emotionally invested in the organization’s work, and generally don’t bring themselves to their work enough to risk this level of emotion/hurt.
This is a truly extraordinary claim! I don’t know what evidence I’d need to see in order to believe it, but whatever that evidence is, I sure haven’t seen it yet.
This just can’t be right. I’ve met a decent number of people who are very invested in their work and the mission of whatever organization they’re part of, and I can’t imagine them being brought to tears by “an intense conversation” with one of their co-workers (nor have I heard of such a thing happening to the people I have in mind).
Something else is going on here, it seems to me; and the most obvious candidate for what that “something else” might be is simply that your view of what the distribution of “thick-skinned-ness” is like, is very mis-calibrated.
To me the obvious candidate is that people are orienting around Nate in particular in an especially weird way.
(Don’t know why some folks have downvoted the above comment, seems like a totally normal epistemic state for Person A not to believe what Person B believes about something after simply learning that Person B believes it, and to think Person B is likely miscalibrated. I have strong upvoted the comment back to clearly positive.)
My model says that this requires them to still be hopeful about local communication progress, and happens when they disagree but already share a lot of frames and concepts and background knowledge. I, at least, find it much harder when I don’t expect the communication attempt to make progress, or to have a positive effect.
(“Then why have the conversation at all?” I mostly don’t! But sometimes I mispredict how much hope I’ll have, or try out some new idea that doesn’t work, or get badgered into it.)
These sound more to me like personality traits (that members of the local culture generally consider virtuous) than communication norms.
On my model, communication norms are much lower-level than this. Basics of rationalist discourse seem closer; archaic politeness norms (“always refuse food thrice before accepting”) are an example of even lower-level stuff.
My model, speaking roughly and summarizing a bunch, says that the lowest-level stuff (atop a background of liberal-ish internet culture and basic rationalist discourse) isn’t pinned down on account of cultural diversity, so we substitute with meta-norms, which (as best I understand them) include things like “if your convo-partner requests a particular conversation-style, either try it out or voice objections or suggest alternatives” and “if things aren’t working, retreat to a protected meta discussion and build a shared understanding of the issue and cooperatively address it”.
I acknowledge that this can be pretty difficult to do on the fly, especially if emotions are riding high. (And I think we have cultural diversity around whether emotions are ever supposed to ride high, and if so, under what circumstances.) On my model of local norms, this sort of thing gets filed under “yep, communicating in the modern world can be rocky; if something goes wrong then you go meta and try to figure out the causes and do something differently next time”. (Which often doesn’t work! In which case you iterate, while also shifting your conversational attention elsewhere.)
To be clear, I buy a claim of the form “gosh, you (Nate) seem to run on a relatively rarer native emotional protocol, for this neck of the woods”. My model is that local norms are sufficiently flexible to continue “and we resolve that by experimentation and occasional meta”.
And for the record, I’m pretty happy to litigate specific interactions. When it comes to low-level norms, I think there are a bunch of conversational moves that others think are benign that I see as jabs (and which I often endorse jabbing back against, depending on the ongoing conversation style), and a bunch of conversational moves that I see as benign that others take as jabs, and I’m both (a) happy to explicate the things that felt to me like jabs; (b) happy to learn what other people took as jabs; and (c) happy to try alternative communication styles where we’re jabbing each other less. Where this openness-to-meta-and-trying-alternative-things seems like the key local meta-norm, at least in my understanding of local culture.
It seems to me that in theory it should be possible to have very unusual norms and make it work, but that in practice you and your organization horribly underestimate how difficult it is to communicate such things clearly (more than once, because people forget or don’t realize the full implications at the first time). You assume that the local norms were made perfectly clear, but they were not (expecting short inferential distances, double illusion of transparency, etc.).
Did you expect KurtB to have this kind of reaction, to post this kind of comment, and to get upvoted? If the answer is no, it means your model is wrong somewhere.
(If the answer is yes, maybe you should print that comment, and give a copy to all new employees. That might dramatically reduce a possibility of misunderstanding.)
My original comment is not talking about communication norms. It’s talking about “social norms” and “communication protocols” within those norms. I mentioned “basic respectfulness and professionalism.”
This is a thing, but I’m guessing that what you have in mind involves a lot more than you’re crediting of not actually trying for the crux of the conversation. As just one example, you can be “more respectful” by making fewer “sweeping claims” such as “you are making such and such error in reasoning throughout this discussion / topic / whatever”. But that’s a pretty important thing to be able to say, if you’re trying to get to real cruxes and address despair and so on.
Kinda. I’m advocating less for the skill of “be graceful and respectful and constructive” and instead looking at the lower bar of “don’t be overtly rude and aggressive without consent; employ (something within 2 standard deviations of) standard professional courtesy; else social consequences.” I want to be clear that I’m not wishing for some kind of subtle mastery, here.
Nate, I am skeptical.
As best I can fathom, you put in very little work to proactively warn new hires about the emotional damage which your employees often experience. I’ve talked to a range of people who have had professional interactions with you, both recently and further back. Only one of the recent cases reported that you warned them before they started working with you.
In particular, talking to the hires themselves, I have detected no evidence that you have proactively warned most of the hires[1] you’ve started working with since July 2022, which is when:
I told you that your anger and ranting imposed unexpected and large costs on me,
And you responded with something like “Sorry, I’ll make sure to tell people I have research conversations with—instead of just my formal collaborators. Obvious in hindsight.”
And yet you apparently repeatedly did not warn most of your onboarded collaborators.
EDIT: The original version of this comment claimed “None were warned.” This was accurate reporting at the time of my comment. However, I now believe that Nate did in fact proactively warn Vivek, at least a bit and to some extent. I am overall still worried, as I know of several specific cases which lacked sufficient warning, and some of them paid surprisingly high costs because of it.
Vivek did not recall any warning but thought it possible that you had verbally mentioned some cons, which he forgot about. However, now that Nate jogged his memory a bit, he thinks he probably did receive a warning.
I personally wouldn’t count a forgettable set of remarks as “sufficient warning”, given the track record here. It sure seems to me like it should have been memorable.
On the facts: I’m pretty sure I took Vivek aside and gave a big list of reasons why I thought working with me might suck, and listed that there are cases where I get real frustrated as one of them. (Not sure whether you count him as “recent”.)
My recollection is that he probed a little and was like “I’m not too worried about that” and didn’t probe further. My recollection is also that he was correct in this; the issues I had working with Vivek’s team were not based in the same failure mode I had with you; I don’t recall instances of me getting frustrated and bulldozey (though I suppose I could have forgotten them).
(Perhaps that’s an important point? I could imagine being significantly more worried about my behavior here if you thought that most of my convos with Vivek’s team were like most of my convos with you. I think if an onlooker was describing my convo with you they’d be like “Nate was visibly flustered, visibly frustrated, had a raised voice, and was being mean in various of his replies.” I think if an onlooker was describing my convos with Vivek’s team they’d be like “he seemed sad and pained, was talking quietly and as if choosing the right words was a struggle, and would often talk about seemingly-unrelated subjects or talk in annoying parables, while giving off a sense that he didn’t really expect any of this to work”. I think that both can suck! And both are related by a common root of “Nate conversed while having strong emotions”. But, on the object level, I think I was in fact avoiding the errors I made in conversation with you, in conversation with them.)
As to the issue of not passing on my “working with Nate can suck” notes, I think there are a handful of things going on here, including the context here and, more relevantly, the fact that sharing notes just didn’t seem to do all that much in practice.
I could say more about that; the short version is that I think “have the conversation while they’re standing, and I’m lying on the floor and wearing a funny hat” seems to work empirically better, and...
hmm, I think part of the issue here is that I was thinking like “sharing warnings and notes is a hypothesis, to test among other hypotheses like lying on the floor and wearing a funny hat; I’ll try various hypotheses out and keep doing what seems to work”, whereas (I suspect) you’re more like “regardless of what makes the conversations go visibly better, you are obligated to issue warnings, as is an important part of emotionally-bracing your conversation partners; this is socially important even if it doesn’t seem to change the conversation outcomes”.
I think I’d be more compelled by this argument if I was having ongoing issues with bulldozing (in the sense of the convo we had), as opposed to my current issue where some people report distress when I talk with them while having emotions like despair/hopelessness.
I think I’d also be more compelled by this argument if I was more sold on warnings being the sort of thing that works in practice.
Like… (to take a recent example) if I’m walking by a whiteboard in rosegarden inn, and two people are like “hey Nate can you weigh in on this object-level question”, I don’t… really believe that saying “first, be warned that talking technical things with me can leave you exposed to unshielded negative-valence emotions (frustration, despair, …), which some people find pretty crappy; do you still want me to weigh in?” actually does much. I am skeptical that people say “nope” to that in practice.
I suppose that perhaps what it does is make people feel better if, in fact, it happens? And maybe I’ll try it a bit and see? But I don’t want to sound like I’m promising to do such a thing reliably even as it starts to feel useless to me, as opposed to experimenting and gravitating towards things that seem to work better like “offer to lie on the floor while wearing a funny hat if I notice things getting heated”.
I’ve been asked to clarify a point of fact, so I’ll do so here:
This does ring a bell, and my brain is weakly telling me it did happen on a walk with Nate, but it’s so fuzzy that I can’t tell if it’s a real memory or not. A confounder here is that I’ve probably also had the conversational route “MIRI burnout is a thing, yikes” → “I’m not too worried, I’m a robust and upbeat person” multiple times with people other than Nate.
In private correspondence, Nate seems to remember some actual details, and I trust that he is accurately reporting his beliefs. So I’d mostly defer to him on questions of fact here.
I’m pretty sure I’m the person mentioned in TurnTrout’s footnote. I confirm that, at the time he asked me, I had no recollection of being “warned” by Nate but thought it very plausible that I’d forgotten.
This is a slight positive update for me. I maintain my overall worry and critique: chats which are forgettable do not constitute sufficient warning.
Insofar as non-Nate MIRI personnel thoroughly warned Vivek, that is another slight positive update, since this warning should reliably be encountered by potential hires. If Vivek was independently warned via random social connections not possessed by everyone,[1] then that’s a slight negative update.
For example, Thomas Kwa learned about Nate’s comm doc by randomly talking with a close friend of Nate’s, and mentioning comm difficulties.
I think there are several critical issues with your behavior, but I think the most urgent is that people often don’t know what they’re getting into. People have a right to make informed decisions and to not have large, unexpected costs shunted onto them.
It’s true that no one has to talk with you. But it’s often not true that people know what they’re getting into. I spoke out publicly because I encountered a pattern, among my friends and colleagues, of people taking large and unexpected emotional damage from interacting with you.
If our July interaction had been an isolated incident, I still would have been quite upset with you, but I would not have been outraged.
If the pattern I encountered were more like “a bunch of people report high costs imposed by Nate, but basically in the ways they expected”, I’d be somewhat less outraged.[1] If people can accurately predict the costs and make informed decisions, then people who don’t mind (like Vivek or Jeremy) can reap the benefits of interacting with you, and the people who would be particularly hurt can avoid you.
If your warnings are not preventing this pattern of unexpected hurt, then you need to do better. You need to inform people to the point that they know what distribution they’re sampling from. If people know, I’m confident that they will start saying “no.” I probably would have said “no thanks” (or at least ducked out sooner and taken less damage), and Kurt would have said “no” as well.
If you don’t inform people to a sufficient extent, the community should (and, I think, will) hold you accountable for the unexpected costs you impose on others.
I would still be disturbed and uneasy for the reasons Jacob Steinhardt mentioned, including “In the face of real consequences, I think that Nate would better regulate his emotions and impose far fewer costs on people he interacts with.”
(I don’t know who strong disagree-voted the parent comment, but I’m interested in hearing what the disagreement is. I currently think the comment is straightforwardly correct and important.)
The 9-karma disagree-vote is mine. (Surprise!) I thought about writing a comment, and then thought, “Nah, I don’t feel like getting involved with this one; I’ll just leave a quick disagree-vote”, but if you’re actively soliciting, I’ll write the comment.
I’m wary of the consequences of trying to institute social norms to protect people from subjective emotional damage, because I think “the cure is worse than the disease.” I’d rather develop a thick skin and take responsibility for my own emotions (even though it hurts when some people are mean), because I fear that the alternative is (speaking uncharitably) a dystopia of psychological warfare masquerading as kindness in which people compete to shut down the expression of perspectives they don’t like by motivatedly getting (subjectively sincerely) offended.
Technically, I don’t disagree with “people should know what they’re getting into” being a desirable goal (all other things being equal), but I think it should be applied symmetrically, and it makes sense for me to strong-disagree-vote a comment that I don’t think is applying it symmetrically: it’s not fair if “fighty” people need to make lengthy disclaimers about how their bluntness might hurt someone’s feelings (which is true), but “cooperative” people don’t need to make lengthy disclaimers about how their tone-policing might silence someone’s perspective (which is also true).
I don’t know Nate very well. There was an incident on Twitter and Less Wrong the other year where I got offended at how glib and smug he was being, despite how wrong he was about the philosophy of dolphins. But in retrospect, I think I was wrong to get offended. (I got downvoted to oblivion, and I deserved it.) I wish I had kept my cool—not because I personally approve of the communication style Nate was using, but because I think it was bad for my soul and the world to let myself get distracted by mere style when I could have shrugged it off and stayed focused on the substance.
You told me you would warn people, and then did not.[1]
Do I have your permission to quote the relevant portion of your email to me?
I warned the immediately-next person.
It sounds to me like you parsed my statement “One obvious takeaway here is that I should give my list of warnings-about-working-with-me to anyone who asks to discuss their alignment ideas with me, rather than just researchers I’m starting a collaboration with.” as me saying something like “I hereby adopt the solemn responsibility of warning people in advance, in all cases”, whereas I was interpreting it as more like “here’s a next thing to try!”.
I agree it would have been better of me to give direct bulldozing-warnings explicitly to Vivek’s hires.
Here is the statement:
I agree that this statement does not explicitly say whether you would make this a one-time change or a permanent one. However, the tone and phrasing—”Obvious in hindsight; sorry for not doing that in your case”—suggested that you had learned from the experience and are likely to apply this lesson going forward. The use of the word “obvious”—twice—indicates to me that you believed that warnings are a clear improvement.
Ultimately, Nate, you wrote it. But I read it, and I don’t really see the “one-time experiment” interpretation. It just doesn’t make sense to me that it was “obvious in hindsight” that you should… adopt this “next thing to try”..?
I did not intend it as a one-time experiment.
In the above, I did not intend “here’s a next thing to try!” to be read like “here’s my next one-time experiment!”, but rather like “here’s a thing to add to my list of plausible ways to avoid this error-mode in the future, as is a virtuous thing to attempt!” (by contrast with “I hereby adopt this as a solemn responsibility”, as I hypothesize you interpreted me instead).
Dumping recollections, on the model that you want more data here:
I intended it as a general thing to try going forward, in a “seems like a sensible thing to do” sort of way (rather than in a “adopting an obligation to ensure it definitely gets done” sort of way).
After sending the email, I visualized people reaching out to me and asking if I wanted to chat about alignment (as you had, and as feels like a recognizable Event in my mind), and visualized being like “sure but FYI if we’re gonna do the alignment chat then maybe read these notes first”, and ran through that in my head a few times, as is my method for adopting such triggers.
I then also wrote down a task to expand my old “flaws list” (which was a collection of handles that I used as a memory-aid for having the “ways this could suck” chat, which I had, to that point, been having only verbally) into a written document, which eventually became the communication handbook (there were other contributing factors to that process also).
An older and different trigger (of “you’re hiring someone to work with directly on alignment”) proceeded to fire when I hired Vivek (if memory serves), and (if memory serves) I went verbally through my flaws list.
Neither the new nor the old triggers fired in the case of Vivek hiring employees, as discussed elsewhere.
Thomas Kwa heard from a friend that I was drafting a handbook (chat logs say this occurred on Nov 30); it was still in a form I wasn’t terribly pleased with and so I said the friend could share a redacted version that contained the parts that I was happier with and that felt more relevant.
Around Jan 8, in an unrelated situation, I found myself in a series of conversations where I sent around the handbook and made use of it. I pushed it closer to completion in Jan 8-10 (according to Google doc’s history).
The results of that series of interactions, and of Vivek’s team’s (lack of) use of the handbook caused me to update away from this method being all that helpful. In particular: nobody at any point invoked one of the affordances or asked for one of the alternative conversation modes (though those sorts of things did seem to help when I personally managed to notice building frustration and personally suggest that we switch modes (although lying on the ground—a friend’s suggestion—turned out to work better for others than switching to other conversation modes)). This caused me to downgrade (in my head) the importance of ensuring that people had access to those resources.
I think that at some point around then I shared the fuller guide with Vivek’s team, but I didn’t quickly determine when from the chat logs. Sometime between Nov 30 and Feb 22, presumably.
It looks from my chat logs like I then finished the draft around Feb 22 (where I have a timestamp from me noting as much to a friend). I probably put it publicly on my website sometime around then (though I couldn’t easily find a timestamp), and shared it with Vivek’s team (if I hadn’t already).
The next two MIRI hires both mentioned to me that they’d read my communication handbook (and I did not anticipate spending a bunch of time with them, nevermind on technical research), so they both didn’t trigger my “warn them” events and (for better or worse) I had them mentally filed away as “has seen the affordances list and the failure modes section”.
I appreciate the detail, thanks. In particular, I had wrongly assumed that the handbook had been written much earlier, such that even Vivek could have been shown it before deciding to work with you. This also makes more sense of your comments that “writing the handbook” was indicative of effort on your part, since our July interaction.
Overall, I retain my very serious concerns, which I will clarify in another comment, but am more in agreement with claims like “Nate has put in effort of some kind since the July chat.”
Noting that at least one of them read the handbook because I warned them and told them to go ask around about interacting with you, to make sure they knew what they were getting into.
Yep! I’ve also just reproduced it here, for convenience: