# What should you change in response to an “emergency”? And AI risk

This post has been recorded as part of the LessWrong Curated Podcast, and can be listened to on Spotify, Apple Podcasts, and Libsyn.

Epistemic status: A possibly annoying mixture of straightforward reasoning and hard-to-justify personal opinions.

It is often stated (with some justification, IMO) that AI risk is an “emergency.” Various people have explained to me that they put various parts of their normal life’s functioning on hold on account of AI being an “emergency.” In the interest of people doing this sanely and not confusedly, I’d like to take a step back and seek principles around what kinds of changes a person might want to make in an “emergency” of different sorts.

# Principle 1: It matters what time-scale the emergency is on

There are plenty of ways we can temporarily increase productivity on some narrow task or other, at the cost of our longer-term resources. For example:

• Skipping meals

• Skipping sleep

• Ceasing to clean the house or to exercise

• Accumulating credit card debt

• Calling in favors from friends

• Skipping leisure time

If I would strongly prefer to address some situation x before time t, I may sometimes want to “borrow from the future” like this. But the time-scales matter. If I’m trying to address x as much as possible in the next five hours, skipping sleep may make sense. If I’m trying to address x as much as possible over the next year, I’ll probably do better to get my usual amount of sleep tonight. Something similar (with different, resource-specific timescales) will hold for other resources.

So, in short time-scale emergencies, it’ll often make sense to suspend a great deal of normal functioning for a short period of time. In longer time-scale emergencies, your life should mostly look closer to normal.
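As a rough illustration of why the time-scale matters, here is a toy sketch. Every name and number in it (`boost`, `decay`, `floor`, and the idea that capacity shrinks geometrically) is a made-up assumption for illustration, not a claim about real sleep science. It only shows the shape of the argument: "borrowing from the future" can win over a very short horizon and lose over a long one.

```python
def total_output(horizon_days, skip_sleep, boost=1.5, decay=0.8, floor=0.2):
    """Total work done over `horizon_days` under a toy model.

    Hypothetical assumptions (not from the post):
      - a normal day of work, with normal sleep, yields 1.0 unit;
      - skipping sleep multiplies that day's output by `boost`;
      - after each skipped night, capacity shrinks by the factor `decay`,
        but never falls below `floor`.
    """
    capacity = 1.0
    total = 0.0
    for _ in range(horizon_days):
        if skip_sleep:
            total += capacity * boost                 # extra hours today
            capacity = max(floor, capacity * decay)   # paid for tomorrow
        else:
            total += capacity                         # capacity stays at 1.0
    return total


# Compare the two policies over a one-day and a one-year horizon.
for days in (1, 365):
    print(days, total_output(days, skip_sleep=True), total_output(days, skip_sleep=False))
```

Under these invented parameters, skipping sleep comes out ahead over one day and behind over a year. The crossover point depends entirely on the assumed numbers, and other resources (debt, favors, exercise) would have their own curves.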

# Principle 2: It matters how much we know how to address the emergency

Much of what we do in daily life – especially when we’re feeling “free” and “unstuck,” and as though there is nothing in particular that we “have to” do – has the effect of making us well-resourced and capable in general. For example, by default, a lot of us would spend a lot of time reading interesting books of varied sorts, nerding out about interesting topics, trying our hands at new skills and crafts, etc. Also, we often like making our living spaces nicer (and more functional), forming new friendships, and so on.

If there is a particular thingy that matters hugely, and if you have an accurate model of how exactly to change that thingy, it may make sense to sacrifice some of your general-purpose capacities in trade for increased ability to address that thingy. (E.g., if you know for a fact that you’ll lose your home unless you pay the mortgage, and if keeping your home is important, it may make sense to trade general-purpose capacities for the ability to make mortgage payments by e.g. working over-long hours at a mind-numbing job that leaves you stupid.)

However, if you don’t have an accurate map of how to address a given thingy, then, even if the thingy is very important, and even if its time-scale is short, you’ll probably mostly want to avoid sacrificing general-purpose capacities. (In a low-information context, your general-purpose capacities are perhaps more likely to turn out helpful for your very important thingy than whatever you’d be trading them off to buy.) Thus, in “emergencies” where you do not have an accurate map of how to solve the emergency, your behavior should probably be more like normal than in better-mapped emergencies.

# Side-note: “Emergencies” as wake-up calls

A different way an “emergency” can sometimes rearrange priorities is by serving as a “wake-up call” that helps people peel away accidental accumulation of habits, “obligations,” complacency, etc. For example, I’m told that near encounters with death sometimes leave people suddenly in touch with what matters in life. (I’ve heard this from one friend who had cancer, and seen accounts from strangers in writing; I’m not sure how common this is or isn’t really.)

I non-confidently suspect some gravitate toward AI risk or other emergencies in the hopes that it’ll help them peel away the cruft, notice a core of caring within themselves, and choose activities that actually make sense. (See also: Something to Protect.) If “AI risk as wake-up call” works out, I could imagine AI risk helping a person rearrange their actions in a way that boosts long-term capacities. (E.g., finding courage; trying new things and paying attention to what happens; cultivating friendships right now rather than in the possibly-non-existent future; facing up to minor social conflicts; attempting research that might yield fresh insights instead of research that feels predictable.)

This sort of “post wake-up call” change is almost the opposite of the kinds of “borrowing from the future” changes that typify short-term emergency responses. You should be able to tell the difference by seeing whether a person’s actions are unusually good for their long-term broad-based capacities (e.g., do they keep their house in a pattern that is good for them, get exercise, engage in the kinds of leisure and study that boost their ability to understand the world, appear unusually willing and able to cut through comfortable rationalizations, etc.?), or unusually bad for their long-term broad-based capacities (e.g., do they live on stimulants, in a house no one would want to live in, while saying they ‘don’t have time’ for exercise or for reading textbooks and seeming kind of burn-out-y and as though they don’t have enough free energy to fully take an interest in something new, etc.?).

# What sort of “emergency” is AI risk?

It seems to me that AI risk is a serious problem in the sense that it may well kill us (and on my personal models, may well not, too). In terms of time-scales, I am pretty ignorant, but I personally will not be too surprised if the highest risk period is in only a couple years, nor if it is in more than thirty years. In terms of how accurate our maps of what to do are, it seems to me that our maps are not accurate; most people who are currently burning themselves out to try to help with AI risk on some particular path might, for all I know, contribute at least as well (even on e.g. a 5-year timescale) if they built general capacities instead.

I therefore mostly suspect that we’ll have our best shot at AI risk if we, as a community, cultivate long-term, robust capacities. (For me, this is hinging more on believing we have poor maps of how to create safety, and less on beliefs about time-scale.) Our best shot probably does mean paying attention to AI and ML advances, and directing some attention that way compared to what we’d do in a world where AI did not matter. It probably does mean doing the obvious work and the obvious alignment experiments where we know what those are, and where we can do this without burning out our long-term capacities. But it mostly doesn’t mean people burning themselves out, or depleting long-term resources in order to do this.

# Some guesses at long-term, robust-in-lots-of-scenarios resources that may help with AI risk

Briefly, some resources I’d like us to have, as AI approaches:

• Accurate trust in one another’s words. (Calibration, honesty, accurate awareness of one another’s honesty, valuing of truth over comfort or rationalizations. Practice seeing one another get things right and wrong in varied domains.)

• Integrity. The ability to reason, tell the truth, and do what matters in the face of pain, fear, etc.

• Practice having pulled off a variety of projects (not necessarily AI projects). (In my book, we get points for e.g. creating: movies; books; charter cities or other small or large-scale political experiments; buildings and communities and software; and basically anything else that involves reusable skills and engagement with the world.)

• Practice accomplishing things in groups.

• Time. Having AI not as advanced as in alternate scenarios. Having chip manufacture not as advanced as in alternate scenarios.

• Spiritual health. Ability to love, to care about that which matters, to laugh and let go of rationalizations, to hope and to try, to form deep friendships.

• Deep STEM knowledge, ability to do science of varied sorts, ability to do natural philosophy. (Not necessarily all AI.)

• Engineering skill and practice.

• An understanding of the wider cultural context we’re in, and of how to interact with it and what to expect.

• All kinds of ML-related skills and resources, although this is in some tension with wanting time.

Given the above, I am excited about people in our larger orbit following their interests, trying to become scientists or writers or engineers or other cool things, exploring and cultivating. I am excited about people attempting work on AI alignment, or on other angles on AI safety, while also having hobbies and interests and friendships. For the most part I am not excited about people burning themselves out working super-long hours on alignment tasks in ways that damp their ability to notice new things or respond to later challenges, although I am excited about people pushing their limits doing work they’re excited by, and these things can blur together.

Also, I am excited about people trying to follow paths to all of their long-term goals/​flourishing, including their romantic and reproductive goals, and I am actively not excited about people deciding to shelve that because they think AI risk demands it. This is probably the hardest to justify of my opinions, but, like Divia in this tweet, I am concerned (based partly on personal experience, partly on observations of others, and partly on priors/​models) that when people try to table their deepest personal goals, this messes up their access to caring and consciousness in general.

# Why do we see burnout in AI safety efforts and in effective altruism?

IMO, I see burnout (people working hard at the expense of their long-term resources and capacities) more often than I expect is optimal for helping with AI risk. I’m not sure why. Some of it is in people who (IMO) have a better shot and a better plan than most for reducing AI risk, which makes it more plausibly actually-helpful for those folks to be doing work at the expense of long-term capacity, though I have my doubts even there. I suspect much of it is for confused/​mistaken reasons, though; I sometimes see EAs burning themselves out doing e.g. random ML internships that they think they should work hard on because of AI risk, and AFAICT this trade does not make sense. Also, when I look at the wider world (e.g. less-filtered chunks of Twitter; conversations with my Lyft drivers) I see lots of people acting as though lots of things are emergencies worth burning long-term resources for, in ways I suspect do not make sense, and I suspect that whatever dynamics lead to that may also be involved here. I’ve also heard a number of people tell me that EA or AI safety efforts caused them to lose the ability to have serious hobbies, or serious intellectual interests, and I would guess this was harmful to long-term AI safety potential in most cases. (This “lots of people lose the ability to have serious hobbies when they find EA” is something I care very much about. Both as damage to our movements’ potential, and as a symptom of a larger pattern of damage. I’ve heard this, again, from many, though not everyone.)

I’d be curious for y’all’s takes on any of the above!

• One substantive issue I didn’t manage to work into the OP, but am interested in, is a set of questions about memetics and whether memetics is one of the causes of how urgent so many people seem to find so many causes.

A section I cut from the OP, basically because it’s lower-epistemic-quality and I’m not sure how relevant it is or isn’t to the dynamics I kept in the OP, but that I’d like to throw into the comments section for discussion:
--

### Memetics sometimes leads to the amplification of false “emergencies”

Once upon a time, my former housemate Steve Rayhawk answered the door, found a Jehovah’s witness there to proselytize, and attempted to explain to the Jehovah’s witness about memetics. (“Okay, so, you know how internet chain letters often promise all kinds of goods? Well, suppose you find yourself going door-to-door with a message….”)

I’d like to make a similar point.

Basically: it seems to me that when I venture into less-filtered portions of Twitter, or ask my Lyft drivers what they think is up in the world, or otherwise encounter things from some portions of memes-at-large… I encounter a rather high proportion of “there is currently an Emergency” overtones (about all sorts of things, mostly not AI). I suspect these messages get passed on partly because the structure of “Emergency! Fighting this thing now is worth giving up some of your leisure time today, taxing your friends a bit if they’re annoyed by your passing it on, etc.” gets messages replicated some in our current context. I suspect they cause a fair amount of deadweight loss in aggregate, with a fair number of people choosing to try to respond to “emergencies” that are not really emergencies, with actions that don’t make much sense, instead of boosting their and others’ long-term capacities to understand and to act well from deep within.

An added point here is that people are often worse at decisions when rushed, which means that bad arguments can sometimes get forwarded if they keep their passers-on from taking full stock of them. (At least, I’m told this is a classic trick of con artists, and that the heuristics and biases literature found people are extra subject to most biases when rushed.) So it may be useful to ask whether the fact of a particular claim-to-urgency reaching you is based in significant part on people passing on the message without full digestion under the influence of fear/​urgency, or whether it is mostly reaching you via the slow conscious actions of people at their best.

Memetic arguments do not mean that any particular claim about a thing being an emergency is false. If you’re trying to figure out what’s true, there’s no substitute for hugging the query, remembering that argument screens off authority, and plunging into object-level questions about what the world is like. Direct engagement with the object-level questions is also often far more interesting/​productive.

But IMO, memetic arguments probably do mean you should be a bit on guard when evaluating arguments about emergencies, should be more hesitant to take a person’s or group’s word for what the “emergency” is (especially where the message claims that you should burn yourself out, or deplete your long-term resources, in such a way as to boost that message or its wielders), and should more insist on having an inside-view that makes sense. I.e., it seems to me that one should approach messages of the form “X is an emergency, requiring specific action Y from you” a bit more like the way most of us already approach messages of the form “you should give me money.”

Also, it does not hurt to remember that “X is in a bad way” is higher-prior than “X is in a bad way, and you should burn out some of your long-term resources taking action A that allegedly helps with X.”
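This is just the conjunction rule from probability. As a sketch, write $B$ for “X is in a bad way” and $R$ for “you should burn out some of your long-term resources taking the action that allegedly helps with X” (both labels are introduced here purely for illustration):

$$
P(B \wedge R) \;=\; P(B)\,P(R \mid B) \;\le\; P(B), \qquad \text{since } 0 \le P(R \mid B) \le 1 .
$$

So however credible the “bad way” part is, the combined claim can only be as credible or less, and how much less depends on how likely the recommended sacrifice is to be right given that the problem is real.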

A different main heuristic I’d like to recommend here is trying to boot all the way to slow, reflective consciousness as a way to check the substance of any claim that seems to be trying to get you into an urgent state of mind, or a state of mind from which it is less natural to allow things to evaluate slowly. I really like and agree with Raemon’s post about slack as a context in which you can actually notice what your mind should/shouldn’t be on. I really hope we all get this at least several times a year, if not all the time. (One could also try to use slow consciousness to “spot check” claimed urgencies after the fact, even if acting more-rapidly temporarily.)

• You managed to cut precisely the part of the post that was most useful for me to read :)

(To be clear, putting it in this comment was just as helpful, maybe even more-so.)

• I love this post a lot.

Also, I am excited about people trying to follow paths to all of their long-term goals/​flourishing, including their romantic and reproductive goals, and I am actively not excited about people deciding to shelve that because they think AI risk demands it.

I’m going to reply in a pedantic way, then say why I did so.

Pedantic response: Trying to solve AI risk is one way to increase the odds you’ll achieve goals like that. (E.g., if you want to support your kids going to college, you should consider putting aside money for them and keeping an eye out for ways to reduce the risk you and your kids die from rogue AI disassembling the inner solar system before they’re of college age.)

Explanation:

I make this point not to argue against finding love or starting a family, but to argue against a mindset that treats AGI and daily life as more or less two different magisteria.

I think there’s a pendulum-swing-y thing going on: people trying to counter burnout, freakouts, etc. heavily emphasize mundane, non-sci-fi-sounding reality, so as to push back a pendulum they see as having swung too far toward ‘my mental universe is entirely about Dyson swarms, and not at all about dental appointments’.

I find this mostly doesn’t work for me personally. I come away feeling like half the blog posts I read are disassociated from ordinary day-to-day life, and the other half are disassociated from ‘the portion of my life that will occur during and after the invention of this wild technology’, such that I end up wary of both kinds of post.

Your overall post is a whole lot better than most on this dimension. It doesn’t just treat ‘be psychologically happy and healthy this year’ as obviously lexicographically more important than preventing your and everyone else’s deaths in twenty-five years (or whatever). It gives specific arguments for why the happy-and-healthy stuff is super important even if you’re confident your personal CEV’s terminal values assign overwhelmingly more weight to x-risk stuff than to your happiness over the next fifty years. It acknowledges the “‘Emergencies’ as wake-up calls” thing.

But it still feels to me like it’s a post trying to push the pendulum in a particular direction, rather than trying to fully and openly embody the optimal-by-your-lights Balancing Point.

(Where de-prioritizing pendulum pushing indeed risks worsening the current problem we face, insofar as EAs today do over-extend more than they under-extend. Maybe I just selfishly prefer reading posts like that, even if they’re not optimal for the community.)

It still doesn’t feel to me like it’s fully speaking as though the two worlds are one world, or fully embracing that a lot of people probably need to hear something closer to the opposite advice. (‘You’re asking too little of yourself; there are great things you could achieve if you were more strategic and purposeful with your time, rather than just following the nearest hedonic gradient; you are not fragile, or a ticking time bomb that will surely flame out if you start doing qualitatively harder things; you can output good-by-maxipok-lights things without waiting to first resolve all your personal issues or get your life fully in order; the alignment problem may sound crazy and intense to you from a distance, and it could still be true that you could do good work without sacrificing a flourishing, well-rounded pre-AGI life, and perhaps even flourish more pre-AGI as a consequence.’)

That might just be because I’m not the target audience and this style works great for others, but I wanted to mention it.

• I make this point not to argue against finding love or starting a family, but to argue against a mindset that treats AGI and daily life as more or less two different magisteria….

It still doesn’t feel to me like it’s fully speaking as though the two worlds are one world

The situation is tricky, IMO. There is, of course, at the end of the day only one world. If we want to have kids who can grow up to adulthood, and who can have progeny of their own, this will require that there be a piece of universe hospitable to human life where they can do that growing up.

At the same time:

a) IMO, there is a fair amount of “belief in belief” about AI safety and adjacent things. In particular, I think many people believe they ought to believe that various “safety” efforts help, without really anticipating-as-if this sort of thing can help.

(The argument for “be worried about the future” is IMO simpler, more obvious, and more likely to make it to the animal, than particular beliefs that particular strategies have much shot. I’m not sure how much this is or isn’t about AI; many of my Uber drivers seem weirdly worried about the future.)

a2) Also, IMO, a fair number of people’s beliefs (or “belief in beliefs”) about AI safety are partly downstream of others’ political goals, e.g. of others’ social incentives that those people believe in particular narratives about AI safety and about how working at place X can help with AI safety. This can accent the “belief in belief” thing.

b) Also, even where people have honest/deep/authentic verbal-level beliefs about a thing, it often doesn’t percolate all the way down into the animal. For example, a friend reports having interviewed a number of people about some sex details, and coming out believing that some people do and some people don’t have a visceral animal-level understanding that birth control prevents pregnancy, and reports furthermore that such animal-level beliefs often backpropagate to people’s desires or lack of desires for different kinds of sex. I believe my friend here, although this falls under “hard to justify personal beliefs.”

c) As I mentioned in the OP, I am worried that when a person “gives up on” their romantic and/​or reproductive goals (or other goals that are as deeply felt and that close to the center of a person, noting that the details here vary by individual AFAICT), this can mess up their access to caring and consciousness in general (in Divia’s words, can risk “some super central sign error deep in their psychology”).

a-c, in combination, leave me nervous/​skeptical about people saying that their plan is to pursue romance/​children “after the Singularity,” especially if they’re already nearing the end of their biological window. And especially if it’s prompted by that being a standard social script in some local circle. I am worried that people may say this, intend it with some portion of themselves, but have their animal hear “I’m going to abandon these goals and live in belief-in-beliefs.”

I have a personal thing up for me about this one. Back in ~2016, I really wanted to try to have a kid, but thought that short timelines plus my own ability to contribute to AI safety efforts meant I probably shouldn’t. I dialoged with my system 1 using all the tools in the book. I consulted all the people near me who seemed like they might have insight into the relevant psychology. My system 1 / animal-level orientation, after dialog, seemed like it favored waiting, hoping for a kid on the other side of the singularity. I mostly passed other people’s sanity checks, both at the time and a few years later when I was like “hey, we were worried about this messing with my psyche, but it seems like it basically worked, right? what is your perception?” And even so, IMO, I landed in a weird no man’s land of a sort of bleached out depression and difficulty caring about anything after a while, that was kinda downstream of this but very hard for me to directly perceive, but made it harder to really mean anything and easier to follow the motions of looking like I was trying.

The take-away I’m recommending from this is something like: “be careful about planning on paths toward your deepest, animal-level goals that your animal doesn’t buy. And note that it can be hard to robustly know what your animal is or isn’t buying.” Also, while there’s only one magisterium, if people are reasoning at the animal level and the speech level as though there are several, that’s a real and confusing-to-me piece of context to how we’re living here.

• That makes sense to me, and it updates me toward your view on the kid-having thing. (Which wasn’t the focus of my comment, but is a thing I was less convinced of before.) I feel sad about that having happened. :( And curious about whether I (or other people I know) are making a similar mistake.

(My personal state re kids is that it feels a bit odd/​uncanny when I imagine myself having them, and I don’t currently viscerally feel like I’m giving something up by not reproducing. Though if I lived for centuries, I suspect I’d want kids eventually in the same way I’d want to have a lot of other cool experiences.)

I feel kinda confused about how “political” my AGI-beliefs are. The idea of dying to AGI feels very sensorily-real to me — I feel like my brain puts it in the same reference class as ‘dying from a gunshot wound’, which is something I worry about at least a little in my day-to-day life (even though I live in a pretty safe area by US megalopolis standards), have bad dreams about, semi-regularly idly imagine experiencing, etc. I don’t know how that relates to the “is this belief political?” question, or how to assess that.

Regardless, I like this:

I am worried that people may say this, intend it with some portion of themselves, but have their animal hear “I’m going to abandon these goals and live in belief-in-beliefs.”

I’d also assume by default that ‘the animal discounts more heavily than the philosopher does’ is a factor here...? And/​or ‘the animal is better modeled on this particular question as an adaptation-executer rather than a utility-maximizer, such that there’s no trade you can make that will satisfy the animal if it involves trading away having-kids-before-age-45’?

It could be wise to have kids, for the sake of harmony between your parts/​subgoals, even if you judge that having kids isn’t directly x-risk-useful and your animal seems to have a pretty visceral appreciation for x-risk—just because that is in fact what an important part of you wants/​needs/​expects/​etc.

• But it still feels to me like it’s a post trying to push the pendulum in a particular direction, rather than trying to fully and openly embody the optimal-by-your-lights Balancing Point.

AFAICT, I am trying to fully and openly embody the way of reasoning that actually makes sense to me in this domain, which… isn’t really a “balancing point.” It’s more like the anarchist saying “the means are the ends.” Or it’s more like Szilard’s “ten commandments” (which I highly recommend reading for anyone who hasn’t; they’re short). Or more like the quote from the novel The Dispossessed: “To reassert its validity and strength, he thought, one need only act, without fear of punishment and without hope of reward: act from the center of one’s soul.”

I don’t have the right concepts or articulation here. This is an example of the “hard to justify personal opinions” I warned about in my “epistemic status.” But IMO, thinking about tradeoffs and balancing points can be good when your map is good enough; at other times, it’s more like I want to try to home in on priors, on deep patternness, on where reasoning is before it’s reasoning. This is where the power of leisure comes from, and where the possibility of hobbies that end up giving you glimpses of new bits of the universe comes from. And it’s a thing I’m trying to show here. Not a particular balance-point between depleting your long-term resources and “being nice to yourself” by eating chocolates and watching cartoons. (Neither of those helps with getting to the tao, usually, AFAICT.)

We can be empirical about trying to see which actions, which mindsets, add to our and others’ long-term robust abilities.

In short-term crises for which you have decent-quality maps, balance-points and trading things off with local consequentialist reasoning make sense to me. But not the rest of everywhere.

I agree many people underestimate their own capacities, and too seldom try hard or scary things. I think this is often many of the same people who burn themselves out.

Sorry this reply, and my other one, are somewhat incoherent. I’m having trouble mapping both where you’re coming from, and why/​where I disagree.

• Yeah, that makes sense to me. I’m complaining about a larger class of posts, so maybe this one isn’t really an example and I’m just pattern-matching. I do still wish there existed more posts that were very obviously examples of the ‘both-and’ things I was pointing at. (Both dentist appointments and Dyson spheres; both embrace slack and embrace maxipok; etc.)

It might be that if my thinking were clearer here, I’d be able to recognize more posts as doing ‘both-and’ even if they don’t explicitly dwell on it as much as I want.

• I want to have a dialog about what’s true, at the level of piece-by-piece reasoning and piece-by-piece causes. I appreciate that you Rob are trying to do this; “pedantry” as you put it is great, and seems to me to be a huge chunk of why LW is a better place to sort some things out than is most of the internet.

I’m a bit confused that you call it “pedantry”, and that you talk of my post as trying to push the pendulum in a particular way, and “people trying to counter burnout,” and whether this style of post “works” for others. The guess I’m forming, as I read your (Rob’s) comment and to a lesser extent the other comments, is that a bunch of people took my post as a general rallying cry against burnout, and felt it necessary to upvote my post, or to endorse my post, because they personally wish to take a stand against burnout. Does something like that seem right/​wrong to anyone? (I want to know.)

I… don’t want that, although I may have done things in my post to encourage it anyhow, without consciously paying attention. But if we have rallying cries, we won’t have the kind of shared unfiltered reasoning that someone wanting truth can actually update on.

I’m in general pretty interested in strategies anyone has for having honest, gritty, mechanism-by-mechanism discussion near a Sacred Value. “Don’t burn people out” is arguably a Sacred Value, such that it’ll be hard to have open conversation near it in which all the pedantry is shared in all the directions. I’d love thoughts on how to do it anyhow.

• I want to have a dialog about what’s true, at the level of piece-by-piece reasoning and piece-by-piece causes. I appreciate that you Rob are trying to do this; “pedantry” as you put it is great, and seems to me to be a huge chunk of why LW is a better place to sort some things out than is most of the internet.

Yay! I basically agree. The reason I called it “pedantry” was because I said it even though (a) I thought you already believed it (and were just speaking imprecisely /​ momentarily focusing on other things), (b) it’s an obvious observation that a lot of LWers already have cached, and (c) it felt tangential to the point you were making. So I wanted to flag it as a change of topic inspired by your word choice, rather than as ordinary engagement with the argument I took you to be making.

and that you talk of my post as trying to push the pendulum in a particular way, and “people trying to counter burnout,” and whether this style of post “works” for others.

I think I came to the post with a long-term narrative (which may have nothing to do with the post):

• There are a bunch of (Berkeley-ish?) memes in the water related to radical self-acceptance, being kind to yourself, being very wary of working too hard, staying grounded in ordinary day-to-day life, being super skeptical and cautious around Things Claiming To Be Really Important and around moralizing, etc.

I think these are extremely important and valuable memes that LW would do well to explore, discuss, and absorb much more than it already has. I’ve found them personally extremely valuable, and a lot of my favorite blog posts to send to new EAs/​rats make points like ‘be cautious around things that make big moralistic demands of you’, etc.

But I also think that these kinds of posts are often presented in ways that compete against the high value I (genuinely, thoughtfully) place on the long-term future, and the high probability I (genuinely, thoughtfully) place on AI killing me and my loved ones, as though I need to choose between the “chill grounded happy self-loving unworried” aesthetic or the “working really hard to try to solve x-risk” aesthetic.

This makes me very wary, especially insofar as it isn’t making an explicit argument against x-risk stuff, but is just sort of vaguely associating not-worrying-so-much-about-human-extinction with nice-sounding words like ‘healthy’, ‘grounded’, ‘relaxed’, etc. If these posts spent more time explicitly arguing for their preferred virtues and for why those virtues imply policy X versus policy Y, rather than relying on connotation and implicature to give their arguments force, my current objection would basically go away.

If more of the “self-acceptance, be kind to yourself, be very wary of working too hard, etc.” posts were more explicit about making space for possibilities like ‘OK, but my best self really does care overwhelmingly more about x-risk stuff than everything else’ and/or ‘OK, but making huge life-changes to try to prevent human extinction really is the psychologically healthiest option for me’, I would feel less suspicious that some of these posts are doing the dance wrong, losing sight of the fact that both magisteria are real, are part of human life.

I may have been primed to interpret this post in those terms too much, because I perceived it to be a reaction to Eliezer’s recent doomy-sounding blog posts (and people worrying about AI more than usual recently because of that, plus ML news, plus various complicated social dynamics), trying to prevent the community from ‘going too far’ in certain directions.

I think the post is basically good and successful at achieving that goal, and I think it’s a very good goal. I expect to link to the OP post a lot in the coming months. But it sounds like I may be imposing context on the post that isn’t the way you were thinking about it while writing it.

• I may have been primed to interpret this post in those terms too much, because I perceived it to be a reaction to Eliezer’s recent doomy-sounding blog posts (and people worrying about AI more than usual recently because of that, plus ML news, plus various complicated social dynamics), trying to prevent the community from ‘going too far’ in certain directions. … But it sounds like I may be imposing context on the post that isn’t the way you were thinking about it while writing it.

Oh, yeah, maybe. I was not consciously responding to that. I was consciously responding to a thing that’s been bothering me quite a bit about EA for ~5 or more years, which is that there aren’t enough serious hobbies around here IMO, and also people often report losing the ability to enjoy hanging out with friends, especially friends who aren’t in these circles, and just enjoying one another’s company while doing nothing, e.g. at the beach with some beers on a Saturday. (Lots of people tell me they try allocating days to this; it isn’t about the time, it’s about an acquired inability to enter certain modes.)

Thanks for clarifying this though, that makes sense.

I have some other almost-written blog posts that’re also about trying to restore access to “hanging out with friends enjoying people” mode and “serious hobbies” mode, that I hope to maybe post in the next couple weeks.

Back in ~2008, I sat around with some others trying to figure out: if we’re successful in getting a lot of people involved in AI safety—what can we hope to see at different times? And now it’s 2022. In terms of “there’ll be a lot of dollars people are up for spending on safety”, we’re basically “hitting my highest 2008 hopes”. In terms of “there’ll be a lot of people who care”, we’re… less good than I was hoping for, but certainly better than I was expecting. “Hitting my 2008 ‘pretty good’ level.” In terms of “and those people who care will be broad and varied and trying their hands at making movies and doing varied kinds of science and engineering research and learning all about the world while keeping their eyes open for clues about the AI risk conundrum, and being ready to act when a hopeful possibility comes up” we’re doing less well compared to my 2008 hopes. I want to know why and how to unblock it.

• In terms of “and those people who care will be broad and varied and trying their hands at making movies and doing varied kinds of science and engineering research and learning all about the world while keeping their eyes open for clues about the AI risk conundrum, and being ready to act when a hopeful possibility comes up” we’re doing less well compared to my 2008 hopes. I want to know why and how to unblock it.

I think to the extent that people are failing to be interesting in all the ways you’d hoped they would be, it’s because being interesting in those ways seems to them to have greater costs than benefits. If you want people to see the benefits of being interesting as outweighing the costs, you should make arguments to help them improve their causal models of the costs, and to improve their causal models of the benefits, and to compare the latter to the former. (E.g., what’s the causal pathway by which an hour of thinking about Egyptology or repairing motorcycles or writing fanfic ends up having, not just positive expected usefulness, but higher expected usefulness at the margin than an hour of thinking about AI risk?) But you haven’t seemed very interested in explicitly building out this kind of argument, and I don’t understand why that isn’t at the top of your list of strategies to try.

• I think of this in terms of personal vs. civilization-scale value loci distinction. Personal-scale values, applying to individual modern human minds, speaking of those minds, might hold status quo anchoring sacred and dislike presence of excessive awareness of disruptive possible changes. While civilization-scale values, even as they are facilitated by individuals, do care about accurate understanding of reality regardless of what it says.

People shouldn’t move too far towards becoming decision theoretic agents, even if they could, other than for channeling civilization. The latter is currently a necessity (that’s very dangerous to neglect), but it’s not fundamentally a necessity. What people should move towards is a more complicated question with some different answer (which does probably include more clarity in thinking than is currently the norm or physiologically possible, but still). People are vessels of value, civilization is its custodian. These different roles call for different shapes of cognition.

In this model, it’s appropriate /​ morally-healthy /​ intrinsically-valuable for people to live more fictional lives (as they prefer) while civilization as a whole is awake, and both personal-scale values and civilization-scale values agree on this point.

• I think I feel a similar mix of love and frustration toward your comment as I read your comment as expressing toward the post.

Let me be a bit theoretical for a moment. It makes sense for me to think of utilities as a sum \(U = a U_a + b U_b\), where \(U_a\) is the utility of things after singularity/superintelligence/etc and \(U_b\) the utility for things before then (assuming both are scaled to have similar magnitudes so the relative importance is given by the scaling factors \(a\) and \(b\)). There’s no arguing about the shape of these or what factors people chose because there’s no arguing about utility functions (although people can be really bad at actually visualizing \(U_a\)).

Separately from this we have actions that look like optimizing for \(U_a\) (e.g. AI Safety research and raising awareness), and those that look like optimizing for \(U_b\) (e.g. having kids and investing in/for their education). The post argues that some things that look like optimizing for \(U_b\) are actually very useful for optimizing \(U_a\) (as I understand it, mostly because AI timelines are long enough and the optimization space muddled enough that most people contribute more in expectation from maintaining and improving their general capabilities in a sustainable way at the moment).

Your comment (the pedantic response part) talks about how optimizing for \(U_a\) (utility after the singularity) is actually very useful for optimizing \(U_b\) (utility before it). I’m much more sceptical of this claim. The reason is expected impact per unit of effort. Let’s consider sending your kids to college. It looks like top US colleges cost around $50k more per year than state schools, adding up to $200k for a four year programme. This is maybe not several times better as the price tag suggests, but if your child is interested and able to get into such a school it’s probably at least 10% better (to be quite conservative). A lot of people would be extremely excited for an opportunity to lower the existential risk from AI by 10% for $200k. Sure, sending your kids to college isn’t everything there is to \(U_b\), but it looks like the sign remains the same for a couple of orders of magnitude.

Your talk of a pendulum makes it sound like you want to create a social environment that incentivizes things that look like optimizing for \(U_a\) regardless of whether they’re actually in anyone’s best interest. I’m sceptical of trying to get anyone to act against their interests. Rather than make everyone signal that \(a \gg b\) (with \(a\) and \(b\) the weights on \(U_a\) and \(U_b\)), it makes more sense to have space for people with \(a \approx b\) or even \(a < b\) to optimize for their values and extract gains from trade. A successful AI Safety project probably looks a lot more like a network of very different people figuring out how to collaborate for mutual benefit than a cadre of self-sacrificing idealists.

• I chose the college example because it’s especially jarring / especially disrespectful of trying to separate the world into two “pre-AGI versus post-AGI” magisteria.

A more obvious way to see that x-risk matters for ordinary day-to-day goals is that parents want their kids to have long, happy lives (and nearly all of the variance in length and happiness is, in real life, dependent on whether the AGI transition goes well or poorly).
It’s not a separate goal; it’s the same goal, optimized without treating ‘AGI kills my kids’ as though it’s somehow better than ‘my kids die in a car accident’.

Your talk of a pendulum makes it sound like you want to create a social environment that incentivizes things that look like optimizing for \(U_a\) [the post-singularity utility] regardless of whether they’re actually in anyone’s best interest.

I and my kids not being killed by AGI is in my best interest!

A successful AI Safety project probably looks a lot more like a network of very different people figuring out how to collaborate for mutual benefit than a cadre of self-sacrificing idealists.

Not letting AGI kill me and everyone I love isn’t the “self-sacrificing” option! Allowing AGI to kill me is the “self-sacrificing” option: it is literally allowing myself to be sacrificed, albeit for ~zero gain. (Which is even worse than sacrificing yourself for a benefit!)

I’m not advocating for people to pretend they’re more altruistic than they are, and I don’t see myself as advocating against any of the concrete advice in the OP. I’m advocating for people to stop talking/thinking as though post-AGI life is a different magisterium from pre-AGI life, or as though AGI has no effect on their ability to realize the totally ordinary goals of their current life.

I think this would help with shrugging-at-xrisk psychological factors that aren’t ‘people aren’t altruistic enough’, but rather ‘people are more myopic than they wish they were’, ‘people don’t properly emotionally appreciate risks and opportunities that are novel and weird’, etc.

• I’m advocating for people to stop talking/thinking as though post-AGI life is a different magisterium from pre-AGI life

Seems undignified to pretend that it isn’t? The balance of forces that make up our world isn’t stable. One way or the other, it’s not going to last. It would certainly be nice, if someone knew how, to arrange for there to be something of human value on the other side.
But it’s not a coincidence that the college example is about delaying the phase transition to the other magisterium, rather than expecting as a matter of course that people in technologically mature civilizations will be going to college, even conditional on the somewhat dubious premise that technologically mature civilizations have “people” in them.

• The physical world has phase transitions, but it doesn’t have magisteria. ‘Non-overlapping magisteria’, as I’m using the term, is a question about literary genres; about which inferences are allowed to propagate or transfer; about whether a thing feels near-mode or far-mode; etc.

The idea of “going to college” post-AGI sounds silly for two distinct reasons:

1. The post-singularity world will genuinely be very different from today’s world, and institutions like college are likely to be erased or wildly transformed on relatively short timescales.

2. The post-singularity world feels like an inherently “far-mode world” where everything that happens is fantastic and large-scale; none of the humdrum minutiae of a single person’s life, ambitions, day-to-day routine, etc. This includes ‘personal goals are near, altruistic goals are far’.

1 is reasonable, but 2 is not.

The original example was about “romantic and reproductive goals”. If the AGI transition goes well, it’s true that romance and reproduction may work radically differently post-AGI, or may be replaced with something wild and weird and new.

But it doesn’t follow from this that we should think of post-AGI-ish goals as a separate magisterium from romantic and reproductive goals. Making the transition to AGI go well is still a good way to ensure romantic and reproductive success (especially qua “long-term goals/flourishing”, as described in the OP), or success on goals that end up mattering even more to you than those things, if circumstances change in such a way that there’s now some crazy, even better posthuman opportunity that you prefer even more.
(I’m assuming here that we shouldn’t optimize goals like “kids get to go to college if they want” in totally qualitatively different ways than we optimize “kids get to go to college if they want, modulo the fact that circumstances might change in ways that bring other values to the fore instead”. I’m deliberately choosing an adorably circa-2022 goal that seems especially unlikely to carry over to a crazy post-AGI world, “college”, because I think the best way to reason about a goal like that is similar to the best way to reason about other goals where it’s more uncertain whether the goal will transfer over to the new phase.)

• When you have a self-image as a productive, hardworking person, the usual Marshmallow Test gets kind of reversed. Normally, there’s some unpleasant task you have to do which is beneficial in the long run. But in the Reverse Marshmallow Test, forcing yourself to work too hard makes you feel Good and Virtuous in the short run but leads to burnout in the long run. I think conceptualizing it this way has been helpful for me.

• Yes! I am really interested in this sort of dynamic; for me things in this vicinity were a big deal I think. I have a couple half-written blog posts that relate to this that I may manage to post over the next week or two; I’d also be really curious for any detail about how this seemed to be working psychologically in you or others (what gears, etc.).

I have been using the term “narrative addiction” to describe the thing that in hindsight I think was going on with me here: I was running a whole lot of my actions off of a backchain from a particular story about how my actions mattered, in a way that weakened my access to full-on leisure, non-backchained friendship, and various other good ways to see and do stuff.

One of the tricky bits here IMO is that it’s hard to do non-backchained stuff on purpose, by backchaining. Some other trick is needed to get out of a given trying-to-link-everything-to-a-particular-narrative box.
Critch has two older blog posts I quite like on related subjects:

• My best guess at mechanism:

1. Before, I was a person who prided myself on succeeding at marshmallow tests. This caused me to frame work as a thing I want to succeed on, and work too hard.

2. Then, I read Meaningful Rest and Replacing Guilt, and realized that oftentimes I was working later to get more done that day, even though it would obviously be detrimental to the next day. This makes the reverse marshmallow test dynamic very intuitively obvious.

3. Now I am still a person who prides myself on my marshmallow prowess, but hopefully I’ve internalized an externality or something. Staying up late to work doesn’t feel Good and Virtuous, it feels Bad and like I’m knowingly Goodharting myself.

Note that this all still boils down to narrative-stuff. I’m nowhere near the level of zen that it takes to Just Pursue The Goal, with no intermediating narratives or drives based on self-image. I don’t think this patch has particularly moved me towards that either, it’s just helpful for where I’m currently at.

• Pain is Not The Unit of Effort as well as the “Believing in free will to earn merit” example under Beliefs as Emotional Strategies also seem relevant.

• 20 Jul 2022 19:13 UTC

(I’m very unconfident about my takes here)

IMO, I see burnout (people working hard at the expense of their long-term resources and capacities) more often than I expect is optimal for helping with AI risk.

How often do you think is optimal, if you have a quick take? I unconfidently think it seems plausible that there should be high levels of burnout. For example, I think there are a reasonable number of people who are above the hiring bar if they can consistently work obsessively for 60 hours a week, but aren’t if they only work 35 hours a week.
Such a person trying to work 60 hours a week (and then burning out and permanently giving up on working on alignment if it turns out that this is unsustainable for them) seems like plausibly the EV maximizing move, especially if you have steep discount rates on labor (e.g. because you think many more people will want to work on alignment in the future).

• I appreciate this comment a lot. Thank you. I appreciate that it’s sharing an inside view, and your actual best guess, despite these things being the sort of thing that might get social push-back!

My own take is that people depleting their long-term resources and capacities is rarely optimal in the present context around AI safety.

My attempt to share my reasoning is pretty long, sorry; I tried to use bolding to make it skimmable.

### In terms of my inside-view disagreement, if I try to reason about people as mere means to an end (e.g. “labor”):

0. A world where I’d agree with you. If all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, either because AI was in two years, or because there would be an army of new people in two years), I agree that people focusing obsessively for 60 hours/week would probably produce more than the same people capping their work at 35 hrs/week.

But (0) is not the world we’re in, at least right now. Specific differences between a world where I’d agree with you, and the world we seem to me to be in:

1. Having a steep discount rate on labor seems like a poor predictive bet to me.
I don’t think we’re within two years of the singularity; I do think labor is increasing but not at a crazy rate; and a person who keeps their wits and wisdom about them, who pays attention and cares and thinks and learns, and especially someone who is relatively new to the field and/or relatively young (which is the case for most such engineers I think), can reasonably hope to be more productive in 2 years than they are now, which can roughly counterbalance the increase (or more than counterbalance the increase) on my best guess. E.g., if they get hired at Redwood and then stay there, you’ll want veterans a couple years later who already know your processes and skills.

(In 2009, I told myself I needed only to work hard for ~5 years, maybe 10, because after that I’d be a negligible portion of the AI safety effort, so it was okay to cut corners. I still think I’m a non-negligible portion of the effort.)

1.1. Trying a thing to see if it works (e.g. 60 hrs/week of obsession, to see how that is) might still be sensible, but more like “try it and see if it works, especially if that risk and difficulty is appealing, since ‘appealingness’ is often an indicator that a thing will turn out to make sense / to yield useful info / to be the kind of thing one can deeply/sincerely try rather than forcing oneself to mimic, etc.”, not like “you are nothing and don’t matter much after two years, run yourself into the ground while trying to make a project go.” I suppose your question is about accepting a known probability of running yourself into the ground, but I’m having trouble booting that sim; to me the two mindsets are pretty different. I do think many people are too averse to risk and discomfort; but also that valuing oneself in the long-term is correct and important. Sorry if I’m dodging the question here.

2. There is no single project that is most of what matters in AI safety today, AFAICT. Also, such projects as exist are partly managerially bottlenecked.
And so it isn’t “have zero impact” vs “be above Redwood’s/project such-and-such’s hiring line,” it is “be slightly above a given hiring line” (and contribute the difference between that spot and the person who would fill it next, or between that project having one just-above-margin person and having one fewer but more managerial slack) vs “be alive and alert and curious as you take an interest in the world from some other location”, which is more continuous-ish.

3. We are confused still, and the work is often subtle, such that we need people to notice subtle mismatches between what they’re doing and what makes sense to do, and subtle adjustments to specific projects, to which projects make sense at all, and subtle updates from how the work is going that can be propagated to some larger set of things, etc. We need people who care and don’t just want to signal that they kinda look like they care. We need people who become smarter and wiser and more oriented over time and who have deep scientific aesthetics, and other aesthetics. We need people who can go for what matters even when it means backtracking or losing face. We don’t mainly need people as something like fully needing-to-be-directed subjugated labor, who try for the appearances while lacking an internal compass.

I expect more of this from folks who average 35 hrs/week than 60 hrs/week in most cases (not counting brief sprints, trying things for awhile to test and stretch one’s capacities, etc., all of which seems healthy and part of fully inhabiting this world to me). Basically because of the things pointed out in Raemon’s post about slack, or Ben’s post about the Sabbath. Also because often 60 hrs/week for long periods of time means unconsciously writing off important personal goals (cf Critch’s post about addiction to work), and IMO writing off deep goals for the long-term makes it hard to sincerely care about things.

(4.
I do agree there’s something useful about being able to work on other peoples’ projects, or on mundane non-glamorous projects, that many don’t have, and that naive readings of my #3 might tend to pull away from. I think the deeper readings of #3 don’t, but it could be discussed.)

### If I instead try to share my actual views, despite these being kinda wooey and inarticulate and hard to justify, instead of trying to reason about people as means to an end:

A. I still agree that in a world where all that would/could impact AI safety was a particular engineering project (e.g., Redwood’s ML experiments, for concreteness), and if the time-frame of a person’s relevance to that effort was relatively short (e.g., a year or two, or probably even five years), people focusing obsessively for 60 hours/week would be in many ways saner-feeling, more grounding, and more likely to produce the right kind of work in the right timeframe than the same people capping their work at 35 hrs/week. (Although even here, vacations, sabbaths, or otherwise carefully maintaining enough of the right kinds of slack and leisure that deep new things can bubble up seems really valuable to me; otherwise I expect a lot of people working hard at dumb subtasks).

A2. I’m talking about “saner-feeling” and “more grounding” here, because I’m imagining that if people are somehow capping their work at 35 hrs/week, this might be via dissociating from how things matter, and dissociation sucks and has bad side-effects on the quality of work and of team conversation and such IMO. This is really the main thing I’m optimizing for ~in general; I think sane grounded contexts where people can see what causes will have what effects and can acknowledge what matters will mostly cause a lot of the right actions, and that the main question is how to cause such contexts, whether that means 60 hrs/week or 35 hrs/week or what.

A3.
In this alternate world, I expect people will kinda naturally reason about themselves and one another as means to an end (to the end of us all surviving), in a way that won’t be disoriented and won’t be made out of fear and belief-in-belief and weird dissociation.

B. In the world we seem to actually be in, I think all of this is pretty different:

B1. It’s hard to know what safety strategies will or won’t help how much.

B2. Lots of people have “belief in belief” about safety strategies working. Often this is partly politically motivated/manipulated, e.g. people wanting to work at an organization and to rise there via buying into that organization’s narrative; an organization wanting its staff and potential hires to buy its narrative so they’ll work hard and organize their work in particular ways and be loyal.

B3. There are large “unknown unknowns,” large gaps in the total set of strategies being done, maybe none of this makes sense, etc.

B4. AI timelines are probably more than two years, probably also more than five years, although it’s hard to know.

C. In a context like the hypothetical one in A, people talking about how some people are worth much more than others, about what tradeoffs will have what effects, etc. will for many cash out in mechanistic reasoning and so be basically sane-making and grounding. (Likewise, I suspect battlefield triage or mechanistic reasoning from a group of firefighters considering rescuing people from a burning building is pretty sane-making.)

In a context like the one in B (which is the one I think we’re in), people talking about themselves and other people as mere means to an end, and about how much more some people are worth than others such that those other people are a waste for the first people to talk to, and so on, will tend to increase social fear, decrease sharing of actual views, and increase weird status stuff and the feeling that one ought not question current social narratives, I think.
It will tend to erode trust, erode freedom to be oneself or to share data about how one is actually thinking and feeling, and increase the extent to which people cut off their own and others’ perceptual faculties. The opposite of sane-ifying/grounding.

To gesture a bit at what I mean: a friend of mine, after attending a gathering of EA elites for the first time, complained that it was like: “So, which of the 30 organizations that we all agree has no more than a 0.1% chance of saving the world do you work for?”, followed by talking shop about the specifics within that plan, with almost no attention to the rest of the probability mass.

So I think we ought mostly not to reason about ourselves and other “labor” as though we’re in simple microecon world, given the world we’re in, and given that it encourages writing off a bunch of peoples’ perceptual abilities etc. Though I also think that you, Buck (or others) speaking your mind, including when you’re reasoning this way, is extremely helpful! We of course can’t stop wrong views by taking my best guess at which views are right and doing belief-in-belief about it; we have to converse freely and see what comes out.

(Thanks to Justis for saying some of this to me in the comments prior to me posting.)

• Thanks for all these comments. I agree with a bunch of this. I might try later to explain more precisely where I agree and disagree.

• Taking on a 60-hour/week job to see if you burn out seems unwise to me. Some better plans:

• Try lots of jobs on lots of teams, to see if there is a job you can work 60 hours/week at.
• Pay attention to what features of your job are energizing vs. costly. Notice any bad habits that might cause burnout.
• Become more productive per hour.

• Insight volume/quality doesn’t seem meaningfully correlated with hours worked (see June Huh for an extreme example); high-insight people tend to have work schedules optimized for their mental comfort.
I don’t think encouraging someone who’s producing insights at 35 hours per week to work 60 hours per week will result in more alignment progress, and I also doubt that the only people producing insight are those working 60 hours per week.

EDIT: this of course relies on the prior belief that more insights are what we need for alignment right now.

• 23 Jul 2022 6:11 UTC

Much of what we do in daily life – especially when we’re feeling “free” and “unstuck,” and as though there is nothing in particular that we “have to” do – has the effect of making us well-resourced and capable in general. For example, by default, a lot of us would spend a lot of time reading interesting books of varied sorts, nerding out about interesting topics, trying our hands at new skills and crafts, etc. Also, we often like making our living spaces nicer (and more functional), forming new friendships, and so on.

Hm, much of what we do in daily life has the effect of making us locally happy and increasing our local status. Spending time with friends, hosting events for our social group, texting with friends, watching youtube.

...in “emergencies” where you do not have an accurate map of how to solve the emergency, your behavior should probably be more like normal than in better-mapped emergencies.

This section presumes two buckets for how we spend time, “general resource maximizing” and “narrow resource maximizing”. An alternative model here is one big bucket called “fucking around like you’re a monkey” and then two much smaller buckets called “general resource maximizing” and “narrow resource maximizing”.

• Yeah, this might be the comment I would have written if I were better at articulating my thoughts.

• An alternative model here is one big bucket called “fucking around like you’re a monkey” and then two much smaller buckets called “general resource maximizing” and “narrow resource maximizing”.
Just want to mention that this was a very funny comment and I laughed at it.

• 23 Jul 2022 20:01 UTC

Curated. I think this is a post that is beneficial for many people to hear right now. In the last six months, there has been a shift of beliefs across people both that powerful AI stuff is going to happen soon and that very probably it will be bad. This seems to have been a wake-up call for many, and many of those feel that they suddenly have to do something, now. And it is good to do things, but we have to stay sane. I like this post for pushing in that direction.

I think it goes well with a post I made on the topic a while back.

• 23 Jul 2022 6:18 UTC

IMO, I see burnout (people working hard at the expense of their long-term resources and capacities) more often than I expect is optimal for helping with AI risk. I’m not sure why.

FTR when I currently think of people burning out, the reasons are due to poor working conditions (lots of pressure and expectations, little power or management or resources, etc) or because people were acting on a promise of support and community that they later found wasn’t there.

• I like this observation. As a random note, I’ve sometimes heard people justifying “leave poor working conditions in place for others, rather than spending managerial time improving them” based on how AI risk is an emergency, though whether this checks out on a local consequentialist level is not actually analyzed by the model above, since it partly involves tradeoffs between people and I didn’t try to get into that.

I sorta also think that “people acting on a promise of community and support that they later [find] [isn’t] there” is sometimes done semi-deliberately by the individuals in question, who are trying to get as much work out of their system one’s as possible, by hoping a thing works out without really desiring accurate answers.
Or by others who value getting particular work done (via those individuals working hard) and think things are urgent and so are reasoning short-term and locally consequentialist-ly. Again partly because people are reasoning near an “emergency.”

But this claim seems harder to check/verify. I hope people put more time into “really generating community” rather than “causing newcomers to have an expectation of community,” though.

• I sorta also think that “people acting on a promise of community and support that they later [find] [isn’t] there” is sometimes done semi-deliberately by the individuals in question, who are trying to get as much work out of their system one’s as possible, by hoping a thing works out without really desiring accurate answers.

Personally I think of people as more acting out their dream, because reality seems empty.

Like the cargo culters, praying to a magical skyplane that will never arrive. Sure, you can argue to them that they’re wasting their time. But they don’t have any other idea about how to get skyplanes, and the world is a lot less… magical without them. So they keep manning their towers and waving their lights.

• 18 Jul 2022 4:02 UTC
17 points
2 ∶ 0

Some other posts that feel related (not sure you should cite them here, I might just reply with comments once the post is up) are the “Sabbath as Alarm” section of Ben Hoffman’s Sabbath Hard and Go Home.

One more useful attribute of the Jewish Sabbath is the extent to which its rigid rules generate friction in emergency situations. If your community center is not within walking distance, if there is not enough slack in your schedule to prep things a day in advance, or you are too poor to go a day without work, or too locally isolated to last a day without broadcast entertainment, then things are not okay.

In our commercialized society, there will be many opportunities to purchase palliatives, and these palliatives are often worth purchasing.
If living close to your place of employment would be ruinously expensive, you drive or take public transit. If you don’t have time to feed yourself, you can buy some fast food. If you’re not up for talking with a friend in person, or don’t have the time, there’s Facebook. But this is palliative care for a chronic problem.

In Jewish law, it is permissible to break the Sabbath in an emergency situation, when lives are at stake. If something like the Orthodox Sabbath seems impossibly hard, or if you try to keep it but end up breaking it every week, as my Reform Jewish family did, then you should consider that perhaps, despite the propaganda of the palliatives, you are in a permanent state of emergency. This is not okay. You are not doing okay.

So, how are you?

And maybe also his comment on this post.

Something about the tone of this post seems like it’s missing an important distinction. Targeted alarm is for finding the occasional, rare bad actor. As Romeo pointed out in his comment, we suffer from alarm fatigue. The kind of alarm that needs raising for self-propagating patterns of motivated reasoning is procedural or conceptual. People are mistakenly behaving (in some contexts) as though certain information sources were reliable. This is often part of a compartmentalized pattern; in other contexts, the same people act as though, not only do they personally know, but everybody knows, that those sources are not trustworthy.

• As someone currently practicing Orthodox Judaism (though I am coming at it from an agnostic perspective, which is somewhat unusual), I find that Shabbos is often the most “productive” day of the week for me, even though I’m not online and can’t write anything down. This Saturday I ended up reading a random paperback book that gave me an idea for a potentially important neglected cause area, for instance (will probably post more details later).
It definitely gives some perspective on how much of an emergency most perceived “emergencies” are, that’s for sure.

• Just out of curiosity, how much of the burnout you mention is because of:

1. Working too hard, focusing too much on narrow plans and sacrificing other areas of your life

Versus:

2. A world where you suddenly see high AI x-risk and s-risk, and nobody working on it, is just a fairly depressing world if you haven’t adequately calibrated to it.

Your post mostly aims at 1, but I wonder how much of it is 2.

Calibration is hard imo. People can (and should) celebrate even 0.1% changes in x-risk, it’s disorienting to suddenly update your whole world model from like 1% x-risk to 40%.

Random suggestion: A way to test whether a person is facing 1 or 2 might be for them to take a short break from work. If it’s 1 they may be more likely to feel better* than if it’s 2. I wonder if that works, and would be keen on people’s thoughts!

*(Assuming of course, that they accept the reasoning that a short break is good for them, and they don’t stress about the utilons they may feel they’re sacrificing while they’re away.)

• If you have an emergency where you don’t know what to do about it and you have time pressure, then you might have to get better at philosophy because philosophy is what clarifies channels of influence and opens up new channels of influence. Getting back to “concrete” things is good for many reasons, but versions of that which give up on philosophy are throwing out the most likely hope; the philosophy that happens by itself might be too slow. Like how sometimes in software projects, roughly no amount of programming energy will help because you have the wrong theory of the program, and everything you currently know how to do is the equivalent of writing a bunch of special case rules like “5+8=13” and “(\d)+(\d*)0 =2\$1”.

• Justis and Ruby made a bunch of good substantive comments on my draft, and also Justis made a bunch of very helpful typo-fixes/​copy-editing comments on my draft.

I fixed the copy-editing ones but mostly did not respond to the substantive ones, though I liked them; I am hoping some of that discussion makes its way here, where it can happen in public.

• We are now already in the midst of an AI emergency that has not fully emerged yet even though it is not the AGI/​ASI emergency yet. The large language models that have been released into the wilds put power into the hands of humans that use them that has never been trivially accessible to this large a segment of the population before.

If I tell a language model that doing crimes is a good thing as part of its context then it will happily help me design a raid on a bank or figure out how to construct a large scale chemical weapons capacity and those are just cases of trivially scratching the surface of what’s possible without getting into using the models to enhance the nitty gritty aspects of life.

The advent of these models (a horse that can no longer be put back in a barn now that there are multiple of them out in the wilds) is going to be a bigger thing than the invention and deployment of the internet because rather than democratizing information the way the internet did, this democratizes raw intellectual capacity and knowledge processing. Or at least it would be if it had time to fully mature and deploy prior to the even bigger tsunami that’s right behind it.

What we didn’t account for with all the worries about AGI/​ASIs was the idea that we could skip right to the ASI part by having humans as the generalists at the bottom of an ANI pyramid.

The ship has already become unmoored and the storm is already here. It’s not as stormy as it’s going to be later, but from here on out, things only get wilder at an exponential scale without ever letting up.

There is now a relatively clear path from where we are to an AGI/ASI outcome and anyone tapping the brakes is only predetermining that they personally won’t be the ones to determine the nature of the outcome rather than having an impact on the arrival of the outcome itself.

• To what extent are people burning themselves out, vs using what they’re doing as an excuse not to perform effortful and sometimes unpleasant mental and physical hygiene? My understanding is this is a crowd prone to depression anyway, and failing to engage in self-care is pretty common. I.e., if these people were working on something else, would we expect them to burn long-term resources anyway?

• A bunch of people have told me they got worse at having serious/​effortful intellectual hobbies, and at “hanging out”, after getting worried about AI. I did, for many long years. Doesn’t mean it’s not an “excuse”; I agree it would be good to try to get detailed pictures of the causal structure if we can.

• The problem is basically that our machinery for emergencies is entirely broken. More specifically, our emergency scale is now years or decades into a 10x-100x industrial revolution, while our fight or flight reactions are at best only useful for days scale emergencies like stress.

• That’s not really a problem. It’s a parameter. What this means is that you can’t functionally use the fight-or-flight system to orient to long-term emergencies, except on the (rare) occasions when you in fact see some pivotal thing on the timescale of seconds or minutes that actually makes a big difference.

…which, corollary, means that if you’re freaking out about something on timescales longer than that, you haven’t really integrated an understanding of the whole context deeply enough, and/​or you’re projecting an immediate threat on a situation that probably isn’t actually threatening (i.e., trauma triggers).

• I was talking about how we naturally deal with an emergency, and why our evolved ways to deal with it are so off distribution that it entirely harms us. The basic problem is that in the EEA, the longest emergencies lasted days; you were dead or you more or less completely recovered by then, and often emergencies resolved in minutes or seconds.

Now we have to deal with an abstract emergency that is only doable by experimenting, because the abstract, first principles way is entirely evolved for tribalism, not truth, and given the likely takeoff is slow, years-long emergencies have to be dealt with continuously in AI X-risk.

This is entirely off-distribution for our evolved machinery for emergencies, so that’s why I made that comment.

• Yep, I think I understood. I thought your comment made sense and was worth saying.

I think my phrasing came across funny over text. Reading it back, it sounds more dismissive than I meant it to be.

I do suspect we disagree about a subtle point. I don’t think our evolved toolkit works against us here. The problem (as I see it) is upstream of that. We’re trying to treat things that last for more than days as “emergencies”, thus inappropriately applying the evolved emergency toolkit in situations it doesn’t work well for.

I mean, if I want to visit a friend who’s half a mile away, I might just walk. If I want to visit a friend who’s across the country, I’m not even tempted to think the answer is to start walking in their direction. This is a case where understanding the full context means that my evolved instincts (in this case for traveling everywhere via walking) help instead of creating a problem: I walk to my computer to buy a plane ticket, then to my car to drive to the airport, etc.

We haven’t worked out the same cultural machinery for long timescale emergencies yet.

And you’re quite right, given this situation our instincts are really terribly set up for handling it.

But that’s what psychotechnologies are for. Things like language and mathematics help to extend our instinctual basis so we can work with things way, way beyond the scope of what our evolved capacities ever had to handle.

We just haven’t developed adequate psychotech here just yet.

(And in this particular case I think current unaligned superintelligences want us to stay scared and confused, so inventing that psychotech is currently an adversarial process — but that’s not a crux for my point here.)

• Yeah, that’s a crux for me. Essentially, evolution suffered an extremal Goodhart problem, where taking the naturally evolved mechanisms for emergencies out of its EEA distribution leads to weird and bad outcomes.

My point is genetics and evolution matter a lot, much more than a lot of self-help and blank slate views tend to give, which is why I give primacy to genetic issues. So psychotechnologies and new moral systems are facing up against a very powerful optimizer, genes and evolution and usually the latter wins.

• The basic problem is that in the EEA, the longest emergencies lasted days;

Is that actually the case? Something like a famine seems like it could last longer and be easily caused by unfavorable seasonal conditions. More social kinds of emergencies like “a competing faction within your tribe starting to gain power” also seem like they would be more long-lasting. Also e.g. if a significant part of your tribesmen happened to get sick or badly wounded around the same time.

• I’ll somewhat concede here that such things can be dealt with. On the tribal example, well our big brains are basically geared to tribal politics for this exact reason.

• Hegemony How-To by Jonathan Smucker talks about ‘hardcore’ as something people want to do in activist movements, and the need to channel this into something productive. Some people want to work hard, and make sacrifices for something they believe in, and do not like being told ‘take care of yourself, work like 30 good hours a week, and try to be nice to people’.

This happens in all activist movements, and in my opinion, can happen anywhere where intrinsic motivation rather than extrinsic motivation is the main driver, and that the more a leader makes appeals for ‘emotional motivation’ rather than offering say, money, the more likely a few ‘hardcores’ emerge.

I’d say this is a risk in AI safety, it’s not too profitable to join, people who are really active usually feel really strongly, and status is earned by perceived contribution. So of course some people will want to ‘go hardcore for AI safety’.

Based on some of the scandals in EA/​rationalist communities, I wouldn’t be surprised if ‘hardcore’ has been channeled into ‘sex stuff with someone in a position of perceived authority’, which I’d guess is probably actively harmful, or in the absolute best case, totally unproductive.

Tldr, to use a dog training analogy, a ‘working dog’ that isn’t put to work will find something to do, and you probably won’t like it.

• Reminds me of this:

A long-time environmental activist was speaking to an enthusiastic group of young environmentalists at a rally. He warned of the precarious situation the environment was in, the toll that corporate greed had taken on forests, and the dire consequences that lay ahead if serious changes were not made. He then shouted out to the crowd,

“Are you ready to get out there and fight for the environment?” To which they answered an enthusiastic,

“Yeah!”

“Are you ready to get arrested and go to jail for the environment?”

“Yeah!!”

“Are you ready to give your life for the environment?”

“Yeah!!!”

“Are you willing to cut your hair and put on a suit for the environment?”

The crowd fell silent.

Whether this is a true story or just a colorful fable, the lesson is one we should all take to heart. How we look and dress is intimately tied up with our self-identity. How we look and dress also has a significant impact on how persuasive we will be and therefore how effective we will be at creating change. Abandoning an aspect of self-identity in order to be more effective at protecting the environment (or animals or people) can be a lot harder than it seems for those who’ve never had to make such a decision.

(from Nick Cooney’s Change of Heart: What Psychology Can Teach Us About Spreading Social Change)

• In fairness, a lot of these things (clothes, hairstyles, how “hard core” we can think we are based on working hours and such) have effects on our future self-image, and on any future actions that’re mediated by our future self-image. Maybe they’re protecting their psyches from getting eaten by corporate memes, by refusing to cut their hair and go work there.

I suspect we need to somehow have things less based in self-image if we are to do things that’re rooted in fresh perceptions etc. in the way e.g. science needs, but it’s a terrifying transition.

• I wrote:
>… Also, I am excited about people trying to follow paths to all of their long-term goals/​flourishing, including their romantic and reproductive goals, and I am actively not excited about people deciding to shelve that because they think AI risk demands it.

Justis (amid a bunch of useful copy-editing comments) said he does not know what “actively not excited” is supposed to mean, and suggested that maybe I meant “worried.” I do not mean “worried”, and do mean “actively not excited”: when people do this, it makes me less excited by and hopeful about the AI risk movement; it makes me think we’re eating our seedcorn and have less here to be excited about.

• A friend emailed me a comment I found helpful, which I am copying here with their permission:

“To me [your post] sounded a bit like a lot of people are experiencing symptoms similar to ADHD: both becoming hyperfocused on a specific thing and having a lot of habits falling apart. Makes sense conceptually if things labeled as emergencies damage our attention systems. I think it might have to do with a more general state of stress/​future-shock where people have to go into exception-handling mode more often. As exceptions become normalized the systems of normal information processing fall apart and people become unable to orient. Anyway, a lot of this is probably really familiar stuff to you, but the ADHD-framing seems new to me, and it’s plausibly helpful to treat it with some similar interventions (eg things like exercise and robust systems/​habits)”

• 18 Jul 2022 5:16 UTC
6 points
3 ∶ 1

Totally agree with everything said here. Also want to add that if you’re really having trouble with hyperfocus/​burnout to the point it’s negatively impacting you, it may be worth speaking with a good therapist/​psychiatrist if you can afford to do so. The system is far from perfect and doesn’t work well for everyone, but it’s absolutely a worthwhile investment of time/​energy imo.

• Thanks for writing this. I found it useful and have shared it with others.

I’ve also heard a number of people tell me that EA or AI safety efforts caused them to lose the ability to have serious hobbies, or serious intellectual interests, and I would guess this was harmful to long-term AI safety potential in most cases.

If you’d be up for sharing, I’d be pretty interested in a rough estimate of how many specific people you know of who have had this experience (and maybe also how many were “people who (IMO) have a better shot and a better plan than most for reducing AI risk.”)

To be clear, I totally buy that this happens and there’s a problem here. I just find it useful to get some sense of the prevalence of this kind of thing.

• I can think of five easily who spontaneously said something like this to me and who I recall specific names and details about. And like 20 more who I’m inclined to project it onto but there was some guesswork involved on my part (e.g., they told me about trouble having hobbies and about feeling kinda haunted by whether it’s okay to be “wasting” their time, and it seemed to me these factors were connected, but they didn’t connect them aloud for me; or I said I thought there was a pattern like this and they nodded and discussed experiences of theirs but in a way that left some construal to me and might’ve been primed. Also I did not name the 20, might be wrong in my notion of how many).

In terms of the five: two “yes better shot IMO,” three not. For the 20, maybe 1/4th “better shot IMO”.

• In terms of time-scales, I am pretty ignorant, but I personally will not be too surprised if the highest risk period is in only a couple years, nor if it is in more than thirty years.

That sounds like a good estimate of the uncertainty, but is it communicated well to those who decide to drop everything and work on AI safety?

• I agree that many of those who decide to drop everything to work on AI expect AI sooner than that. (Though far from all.)

It seems to me though that even if AI is in fact coming fairly soon, e.g. in 5 years, this is probably still not-helpful for reducing AI risk in most cases, compared to continuing to have hobbies and to not eat one’s long-term deep interests and spiritual health and ability to make new sense of things.

Am I missing what you’re saying?

• I agree that the time frame of 5-30 years is more like a marathon than a sprint, but those you are talking about treat it like a sprint. It would make sense if there was a clear low-uncertainty estimate of “we have to finish in 5 years, and we have 10 years’ worth of work to do” to better get cracking, everything else is on hold. But it seems like a more realistic estimate is “the TAI timeline is between a few years and a few decades, and we have no clue how much work AI Safety entails, or if it is even an achievable goal. Worse, we cannot even estimate the effort required to figure out if the goal is achievable, or even meaningful.” In this latter case, it’s a marathon of unknown length, and one has to pace themselves. I wonder if this message is intentionally minimized to keep the sense of urgency going.

• This post and the comments are a very interesting read. There is one thing that I find confusing, however. My impression is that in the text and the comments, children are only discussed as means to fulfilling the parents’ “reproductive goals” and traded off against the opportunity cost of saving humanity (though it is also discussed that this dichotomy is false because by saving humanity you also save your children). Probably I am overlooking something, but what I don’t see is a mention of whether the expectations of AI timelines, to the extent that you cannot influence them, affect (or should affect) people’s decisions about having children. A relevant number of people seem to expect AGI in something like 8 years and low probability of alignment. I am a bit confused about the “animal” arguments, but it sounds a bit like saying “Okay even if you believe the world will end in 8 years, but you are in the age span where your hormones tell you that you want children, you should do that”. As someone who is just an interested (and worried) reader with regards to this topic, I wonder whether people in AI alignment just postpone or give up on having children because they expect disaster.