To be slightly more precise, I think I historically identified with maybe 60% of framings in the general MIRI cluster (at least as it appears in public outputs), and now I’m at 80%+. Part of the difference is that I was already pretty into stuff like empiricism, materialism, Bayesianism, etc., but I previously (not very reflectively) had opinions and intuitions in the direction of thinking of myself as a computational instance, whereas these days I can understand the algorithmic framing much better (even though it’s still not very intuitive/natural to me). (Numbers made up and not well thought out.)
This sounds right to me. FDT feels more natural when I think of myself as an algorithm than when I think of myself as a computation, for example.
To clarify, are you saying that CFAR staff retreats don’t involve circling?
I’m actually pretty surprised by this. The people I personally know in academia who aren’t community members tend to a) be true believers about their impact, b) really love the problems they work on or their subfields, or c) feel kind of burned. Liking academia for work-life balance reasons seems very surprising to me; even my friends in fields with a fair amount of free time (eg theoretical CS) usually believe that they could have an easier life elsewhere.
If you pick a randomly selected academic or hobby conference, I will be much more surprised if they had circling than if they had food.
Yeah, I think this is a pretty important point. I pointed this out before here, here, and here (2 years ago). I personally still enjoyed the game as is. However, I’m open to the idea that future Petrov Days should look radically different, and perhaps wouldn’t have a gamifying element at all. But if we want a game that honestly reflects the structure of Petrov’s decision that day, I personally would probably want something that accounts for the following features:
1. Petrov clearly has strong incentives and social pressures to push the button.
2. Petrov is not solely responsible for the world ending; a reasonable person could motivatedly say that it was “someone else’s problem.”
It was a dirty job, he thought to himself, but somebody had to do it.
As he walked away, he wondered who that somebody would be.
3. Everything is a little stressful.
The thing I would enjoy, which may not be to everybody’s taste, would include:
- Informed consent before being put in the game (either opt-in or a clear opt-out)
- Some probability of false alarms (if we do a retaliatory game)
- No individual is “responsible” for ending the world
  - An example setup: 4-person pods, where everybody in the pod must launch
  - or a chain of command like the one Petrov faced
  - maybe a randomization element where your button has a 5% chance of not doing the thing you told it to.
    - Specifically, 5% of buttons are “always on” or “always off” and you get no visual cues of this ahead of time.
    - This ups the stakes if 3 people choose to press and the fourth person does not.
- Some reward for pressing the button
  - eg $100 to anybody who presses the button
- Maybe no reward if the “world” ends
  - eg, nobody from LW gets money if EAF blows up LW, and vice versa.
- Visible collective reward if the world doesn’t end
  - eg $X,000 donated to a preferred charity.
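The pod-plus-stuck-button mechanic above could be sketched as a quick simulation. This is just an illustrative sketch of one reading of the rules (the function name and the assumption that a launch requires all four buttons to fire are mine, not a spec):

```python
import random

def pod_outcome(presses, stuck_prob=0.05, seed=None):
    """Simulate one 4-person pod.

    `presses` is a list of booleans (True = that person chose to press).
    Each button independently has a `stuck_prob` chance of being broken:
    half of broken buttons are "always on" (fire regardless of choice),
    half are "always off" (never fire). The pod launches only if every
    button fires.
    """
    rng = random.Random(seed)
    fired = []
    for choice in presses:
        r = rng.random()
        if r < stuck_prob / 2:      # stuck "always on"
            fired.append(True)
        elif r < stuck_prob:        # stuck "always off"
            fired.append(False)
        else:                       # button obeys the person's choice
            fired.append(choice)
    return all(fired)
```

One consequence of this setup: even if three people press and the fourth refuses, there is still a small chance (roughly the stuck-on probability, ~2.5%, times the chance the other three buttons work) that the pod launches anyway, which is what makes the refuser's position stressful.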
As an example of the difficulties of the illusion of transparency: when I first read the post, my first interpretation of “largely fake research” was neither what you said nor what jessicata clarified below. I simply assumed that “fake research” ⇒ “untrue,” in the sense that people who updated on >50% of the research from those orgs would on average have a worse Brier score on related topics, which didn’t seem unlikely to me on its face, since random error, motivated reasoning, and other systematic biases can all contribute to having bad models of the world.
Since 3 people can have 4 different interpretations of the same phrase, this makes me worried that there are many other semantic confusions I didn’t spot.
Are you including productivity/prescription drugs like off-label use of Adderall or modafinil, or only recreational drugs? I think the former is substantially less dangerous, as, among other reasons, there’s at least in theory substantially less motivated reasoning among users looking for reasons to justify their use.
Agreed, there are two different errors here. One is conflating total harm with per-individual harm; the other is conflating per-individual harm with per-use harm. The other, more subtle point you’re alluding to is that a lot of the relative harm of alcohol/tobacco/etc. has to do with frequency of use, which is a different question from whether doing X once in an individual or community setting is advisable.
I’m confused why there were ~40 comments in this subthread without anybody else pointing out this pretty glaring error of logical inference (unless I’m misunderstanding something).
A 2010 analysis concluded that psychedelics are causing far less harm than legal drugs like alcohol and tobacco. (Psychedelics still carry substantial risks, aren’t for everybody, and should always be handled with care.)
? This is total harm, not harm per use. More people die of car crashes than of rabid wolves, but I still find myself more inclined to ride in cars than on rabid wolves as a form of transportation.
Just want to register that this comment seemed overly aggressive to me on a first read, even though I probably have many sympathies in your direction (that Leverage is importantly disanalogous to MIRI/CFAR)
What I’m talking about is a system of moral duties and obligations connected to an explicitly academic mission. Academia is older than the corporation, and is a separate world. It’s very important not to confuse them, and I wish that corporations (and “research labs” associated with corporations) would state very clearly “we are in no way an academic institution”.
To be clear, my own organization is a nonprofit. We are not interested in making money, nor in doing other things of low moral value. I currently think emulating the culture of normal companies is a better starting template than academia or other research nonprofits (many of whom have strong positions that they want to believe and research that oh-so-interestingly happen to justify their pre-existing beliefs), though of course different cultures have different poisons that are more or less salient to different people. But yeah, let’s take this offline.
Thanks so much for the response! I really appreciate it.
I’m assuming your institution wants to follow an academic model, including teaching, mentorship, hierarchical student-teacher relationships, etc.
I think we have more of a standard manager-managee hierarchical relationship, with the normal corporate guardrails plus a few more. We also have explicit lines of reporting for abuse or other potential issues to people outside of the organization, to minimize potential coverups.
Here are my general thoughts:
An open question is when you have a duty of care
I’m kind of confused. Surely organizations by default have a power dynamic over employees, and managers over reports, and abusing this is bad? Maybe I’m confused and you mean something stronger by “duty of care.”
Seems straightforwardly true to me, though I think you’re maybe underestimating correlates of direct harm. (eg I expect that in many of the cases cited, there are things like megalomania, insufficient humility, insufficient willingness to listen to contrary evidence, caring more about charismatic personalities than object-level arguments, etc.)
Speaking as someone in the subset of “women and minorities”, I’d be pretty concerned about any form of special treatments or affordances given because “women and minorities” are at higher risk, aside from really obvious ones like being moderately more careful about male supervisor/female supervisee.
In particular, this creates bad dynamics/incentive structures, like making it less likely to provide honest/critical feedback to “marginalized” groups, which is one of the things I was warned against in management training.
This seems correct. Also you want multiple trusted points of contact outside the organization, which I think both academia and rationality are failing at.
EA organizations often have Julia Wise, but she’s stretched too thin and thus has (arguably) made significant mistakes as a result, as pointed out in a different thread.
This seems right to me. I think “common sense” should be dereferenced a little for people coming from different cultures, but the company culture of the Anglo-American elite seems not-crazy as a starting point.
I think it’s Very Bad to allow most forms of lawbreaking on “work time.” But I think you’re implying something much stronger than that, and (speaking as someone who thinks all recreational drugs are dumb and straightforwardly do not pass any cost-benefit analysis, and who has consumed less than a bottle of wine in my entire life) I really don’t think it’s the job of a workplace to police employees’ time off, regardless of whether it’s doing recreational drugs or listening to pirated music.
maybe it’s different if jobs are in person?
But I once worked at a company whose code of conduct said employees can’t drink at parties with other employees, and even though I had no inclination to drink, I still thought that was clearly too crazy/controlling.
This seems right. Most companies have rules against managers dating subordinates, and I think for probably good reasons.
This sounds right, though “if you believe” is a probabilistic claim, and if I think the base rate is 5%, I’m not sure whether you think cutting communication should happen at 15% (already ~3x elevated risk!) or 75% or 95%.
I think I agree? But I think your reasoning is shoddy here. “There’s no real correlation between excellence and being abusive” is a population-level claim, but obviously what people are usually evaluating is individuals.
“Among other things, if your org is aware of (1) through (6), abusers will go elsewhere.” One thing I’m confused about: if an organization has credible Bayesian evidence (say 40% is the cutoff) that an employee abuses their reports, it may make sense for the organization to fire them, well before there’s enough evidence to convict in a court of law. But it’s unclear what you should do in the broader ecosystem.
In academia, my impression is that professors often switch universities after coming under suspicion, which seems not ideal and not something I’d want to replicate.
Thanks for the outside perspective. If you’re willing to go into more detail, I’m interested in a more detailed account from you of both what academia’s safeguards are and (per gwillen’s comment) where you think academia’s safeguards fall short and how that could be fixed.
This is decision-relevant to me as I work in a research organization outside of academia (though not working on AI risk specifically), and I would like us to both be more productive than typical in academia and have better safeguards against abuse.
If it helps, we have about 15 researchers now, we’re entirely remote, and we hire typically from people who just finished their PhDs or have roughly equivalent research experience, although research interns/fellows are noticeably younger (maybe right after undergrad is the median).
Thanks, appreciate the update!
Sorry, am I misunderstanding something? I think taking “clinically significant symptoms”, specific to the UC system, as a given is wrong because it did not directly address either of my two criticisms:
1. Clinically significant symptoms =/= clinically diagnosed, even in worlds where there is a 1:1 relationship between having clinically significant symptoms and being diagnosable, since many people never get diagnosed.
2. Clinically significant symptoms do not have a 1:1 relationship with “would have been clinically diagnosed” in the first place.
Sorry, maybe this is too nitpicky, but clinically significant symptoms =/= clinically diagnosed, even in worlds where the clinically significant symptoms are severe enough to be diagnosed as such.
If you had instead said “in population studies, 30-40% of graduate students have anxiety or depression severe enough to be clinically diagnosed as such were they to seek diagnosis,” then I think this would have been a normal misreading from not jumping through enough links.
Put another way, if someone in mid-2020 told me that they had symptomatic covid and were formally diagnosed with covid, I would expect that they had worse symptoms than someone who said they had covid symptoms and later tested positive for covid antibodies. This is because jumping through the hoops to get a clinical diagnosis is nontrivial Bayesian evidence of severity, not just of certainty, under most circumstances, and especially when testing is limited and/or gatekept (which was true in many parts of the world for covid in 2020, and is usually true in the US for mental health).
I want to remind people here that something like 30-40% of grad students at top universities have either clinically diagnosed [emphasis mine] depression or anxiety (link)
I’m confused about how you got to this conclusion, and think it is most likely false. Neither your link, nor the linked study, nor the linked meta-analysis in the linked study of your link says this. Instead, the abstract of the linked^3 meta-analysis says:
Among 16 studies reporting the prevalence of clinically significant symptoms of depression across 23,469 Ph.D. students, the pooled estimate of the proportion of students with depression was 0.24 (95% confidence interval [CI], 0.18-0.31; I2 = 98.75%). In a meta-analysis of the nine studies reporting the prevalence of clinically significant symptoms of anxiety across 15,626 students, the estimated proportion of students with anxiety was 0.17 (95% CI, 0.12-0.23; I2 = 98.05%).
Further, the discussion section of the linked^3 study emphasizes:
While validated screening instruments tend to over-identify cases of depression (relative to structured clinical interviews) by approximately a factor of two67,68, our findings nonetheless point to a major public health problem among Ph.D. students.
So I think there are at least two things going on here:
Most people with clinically significant symptoms do not get diagnosed, so “clinically significant symptoms of” depression/anxiety is a noticeably lower bar than “actually clinically diagnosed.”
As implied in the quoted discussion above, if everybody were to seek diagnosis, only ~half of the rate of symptomatic people would be clinically diagnosed as having depression/anxiety.
For those keeping score, this is ~12% for depression and 8.5% for anxiety, with some error bars.
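To spell out the arithmetic: the corrected numbers just take the pooled prevalences quoted from the abstract and apply the factor-of-two over-identification noted in the discussion section (the variable names here are mine):

```python
# Pooled prevalence of clinically significant symptoms, from the meta-analysis abstract
depression_symptoms = 0.24  # 95% CI 0.18-0.31
anxiety_symptoms = 0.17     # 95% CI 0.12-0.23

# The discussion section notes that validated screening instruments over-identify
# cases by roughly a factor of two relative to structured clinical interviews,
# so a rough corrected estimate of "would actually be diagnosed" is:
depression_diagnosable = depression_symptoms / 2  # ~12%
anxiety_diagnosable = anxiety_symptoms / 2        # ~8.5%
```

The confidence intervals would shrink by the same factor under this crude correction, so the "some error bars" caveat still applies.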
Separately, I also think:
my current guess is we are roughly at that same level, or slightly below it
is wrong. My guess is that xrisk reducers have worse mental health on average compared to grad students. (I also believe this, with lower confidence, about people working in other EA cause areas like animal welfare, global poverty, or non-xrisk longtermism, as well as serious rationalists who aren’t professionally involved in EA cause areas).