When fooming, uphold the option to live in an AGI-free world.
There are people who think (imo correctly) that there will be at least one vastly superhuman AI in the next 100 years by default and (imo incorrectly) that proceeding along the AI path does not lead to human extinction or disempowerment by default. My anecdotal impression is that a significant fraction (maybe most) of such people think (imo incorrectly) that letting Anthropic/Claude do recursive self-improvement and be a forever-sovereign would probably go really well for humanity. The point of this note is to make the following proposal and request: if you ever let an AI self-improve, or more generally if you have AIs creating successor AIs, or even more generally if you let the AI world develop and outpace humans in some other way, or if you try to run some process where boxed AIs are supposed to create an initial ASI sovereign, or if you try to have AIs “solve alignment” [1] (in one of the ways already listed, or in some other way), or if you are an AI (or human mind upload) involved in some such scheme, [2] try to make it so the following property is upheld:
It should be possible for each current human to decide to go live in an AGI-free world. In more detail:
There is to be (let’s say) a galaxy such that AGI is to be banned in this galaxy forever, except for AGI which does some minimal stuff sufficient to enforce this AI ban. [3]
There should somehow be no way for anything from the rest of the universe to affect what happens in this galaxy. In particular, there should probably not be any way for people in this galaxy to observe what happened elsewhere.
If a person chooses to move to this galaxy, they should wake up on a planet that is as much like pre-AGI Earth as possible given the constraints that AGI is banned and that probably many people are missing (because they didn’t choose to move to this galaxy). Some setup should be found which keeps institutions as close as possible to current institutions, and as functional as possible, in this world despite most of the people who used to play roles in them potentially being missing.
For example, it might be possible to set this up by having the galaxy be far enough from all other intelligent activity that because of the expansion of the universe, no outside intelligent activity could be seen from this galaxy. In that case, the humans who choose to go live there would maybe be in cryosleep for a long time, and the formation of this galaxy could be started at an appropriate future time.
One should try to minimize the influence of any existing AGI on a human’s thinking before they are asked if they want to make this decision. Obviously, manipulation is very much not allowed. If some manipulation has already happened, it should probably be reversed as much as possible. Ideally, one would ask the version of the person from the world before AGI.
Here are some further clarifications about the setup:
Of course, if the resources in this galaxy are used in a way considered highly wasteful by the rest of the universe, then nothing is to be done about that by the rest of the universe.
If the people in this galaxy are about to kill themselves (e.g. with engineered pathogens), then nothing is to be done about that. (Of course, except that: the AI ban is supposed to make it so they don’t kill themselves with AI.)
Yes, if the humanity in this galaxy becomes fascist or runs a vast torturing operation (like some consider factory farming to be), then nothing is to be done about that either.
We might want to decide more precisely what we mean by the world being “AGI-free”. Is it fine to slowly augment humans more and more with novel technological components, until the technological components are eventually together doing more of the thinking-work than currently existing human thinking-components? Is it fine to make a human mind upload?
I think I would prefer a world in which it is possible for humans to grow vastly more intelligent than we are now, if we do it extremely slowly+carefully+thoughtfully. It seems difficult/impossible to concretely spell out an AI ban ahead of time that allows this. But maybe it’s fine to keep this not spelled out — maybe it’s fine to just say something like what I’ve just said. After all, the AI banning AI for us will have to make very many subtle interpretation decisions well in any case.
We can consider alternative requests. Here are some parameters that could be changed:
Instead of AI being banned in this galaxy forever, AI could be banned only for 1000 or 100 years.
Maybe you’d want this because you would want to remain open to eventual AI involvement in human life/history, just if this comes after a lot more thinking and at a time when humanity is better able to make decisions thoughtfully.
Another reason to like this variant is that it alleviates the problem with precisifying what “ban AI” means — now one can try to spell this out in a way that “only” has to continue making sense over 100 or 1000 years of development.
Instead of giving each human the option to move to this galaxy, you could give each human the option to branch into two copies, with one moving to this galaxy and one staying in the AI-affected world.
The total amount of resources in this AI-free world should maybe scale with the number of people that decide to move there. Naively, there should be on the order of 10 reachable galaxies per person alive, so the main proposal, which just allocates a single galaxy to all the people who make this decision, asks for much less than what an even-division heuristic suggests (see the rough back-of-the-envelope sketch after this list).
We could ask AIs to do some limited stuff in this galaxy (in addition to banning AGI).
Some example requests:
We might say that AIs are also allowed to make it so death is largely optional for each person. This could look like unwanted death being prevented, or it could look like you getting revived in this galaxy after you “die”, or it could look like you “going to heaven” (i.e. getting revived in some non-interacting other place).
We might ask for some starting pack of new technologies.
We might ask for some starting pack of understanding, e.g. for textbooks providing better scientific and mathematical understanding and teaching us to create various technologies.
We might say that AIs are supposed to act as wardens to some sort of democratic system. (Hmm, but what should be done if the people in this galaxy want to change that system?)
We might ask AIs to maintain some system humans in this galaxy can use to jointly request new services from the AIs.
However, letting AIs do a lot of stuff is scary — it’s scary to depart from how human life would unfold without AI influence. Each of the things in the list just provided would constitute/cause a big change to human life. Before/when we change something major, we should take time to ponder how our life is supposed to proceed in the new context (and what version of the change is to be made (if any)), so we don’t [lose ourselves]/[break our valuing].
There could be many different AI-free galaxies with various different parameter settings, with each person getting to choose which one(s) to live in. At some point this runs into a resource limit, but it could be fine to ask that each person minimally gets to design the initial state of one galaxy and send their friends and others invites to have a clone come live in it.
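To make the resource arithmetic above concrete, here is a rough back-of-the-envelope sketch (a Python toy, not anything from the post itself). It takes the “order 10 reachable galaxies per person alive” figure at face value and assumes a 2025 world population of about 8.2 billion, to compare the single-shared-galaxy proposal against a naive even division of galaxies.

```python
# Rough back-of-the-envelope comparison. Figures: the post's "order 10"
# reachable galaxies per person alive, plus an assumed 2025 world
# population of ~8.2 billion (my assumption, not the post's).

GALAXIES_PER_PERSON = 10      # post's naive "order 10" estimate
POPULATION_2025 = 8.2e9       # assumed 2025 world population

total_galaxies = GALAXIES_PER_PERSON * POPULATION_2025  # ~8e10 galaxies

def even_division_galaxies(num_movers: int) -> float:
    """Galaxies an even-division heuristic would hand to the movers."""
    return GALAXIES_PER_PERSON * num_movers

# The main proposal asks for 1 galaxy total, however many people move.
for movers in (1_000, 1_000_000, 1_000_000_000):
    even = even_division_galaxies(movers)
    print(f"{movers:>13,} movers: even division ~{even:.1e} galaxies; proposal asks for 1")

# Even the variant where every living person designs one galaxy's initial
# state would use ~8.2e9 galaxies, about a tenth of the ~8e10 naive total.
print(f"one galaxy per living person: {POPULATION_2025:.1e} of ~{total_galaxies:.1e} galaxies")
```

Under these assumptions, the single-galaxy version of the proposal asks for a tiny fraction of what even division would allocate, which is the point being made above.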
Here are some remarks about the feasibility and naturality of this scheme:
If you think letting Anthropic/Claude RSI would be really great, you should probably think that you could do an RSI with this property.
In fact, in an RSI process which is going well, I think it is close to necessary that something like this property is upheld. Like, if an RSI process would not lead to each current person[4] being assigned at least (say) 1/(1000 × [the 2025 world population]) of all accessible resources, then I think that roughly speaking constitutes a way in which the RSI process has massively failed. And, if each person gets to really decide how to use at least 1/(1000 × [the 2025 world population]) of all accessible resources[5], then even a group of 100 people should be able to decide to go live in their own AGI-free galaxy (a back-of-the-envelope check of this is sketched after these remarks).
I guess one could disagree that upholding something like this property is feasible conditional on a good foom being feasible, or that it is pretty much necessary for a good foom.
One could think that it’s basically fine to replace humans with random other intelligent beings (e.g. Jürgen Schmidhuber and Richard Sutton seem to think something like this), especially if these beings are “happy” or if their “preferences” are satisfied (e.g. Matthew Barnett seems to think this). One could be somewhat more attached to something human in particular, but still think that it’s basically fine to have no deep respect for existing humans and to instead make some new humans who live really happy lives or something (e.g. some utilitarians think this). One could even think that good reflection from a human starting point leads to thinking this. I think this is all tragically wrong. I’m not going to argue against it here though.
Maybe you could think the proposal would actually be extremely costly to implement for the rest of the universe, because it’s somehow really costly to make everyone else keep their hands off this galaxy? I think this sort of belief is in a lot of tension with thinking a fooming Anthropic/Claude would be really great (except maybe if you somehow really have the moral views just mentioned).
similarly: You could think that the proposal doesn’t make sense because the AIs in this galaxy that are supposed to be only enforcing an AI ban will have/develop lots of other interests and then claim most of the resources in this galaxy. I think this is again in a lot of tension with thinking a fooming Anthropic/Claude would be really great.
One could say that even if it would be good if each person were assigned all these resources, it is weird to call it a “massive failure” if this doesn’t happen, because future history will be a huge mess by default and the thing I’m envisaging has very nearly 0 probability and it’s weird to call the default outcome a “massive failure”. My response is that while I agree this good thing has on my inside view <1% probability of happening (because AGI researchers/companies will create random AI aliens who disempower humanity) and I also agree future history will be a mess, I think we probably have at least the following genuinely live path to making this happen: we ban AI (and keep making the implementation of the ban stronger as needed), figure out how to make development more thoughtful so we aren’t killing ourselves in other ways either, grow ever more superintelligent together, and basically maintain the property that each person alive now (who gets cryofrozen) controls more than 1/(1000 × [the 2025 world population]) of all accessible resources.[6][7]
This isn’t an exhaustive list of reasons to think it [is not pretty much necessary to uphold this property for a foom to be good] or [is not feasible to have a foom with this property despite it being feasible to have a good foom]. Maybe there are some reasonable other reasons to disagree?
This property being upheld is also sufficient for the future to be like at least meaningfully good. Like, humans would minimally be able to decide to continue human life and development in this other galaxy, and that’s at least meaningfully good, and under certain imo-non-crazy assumptions really quite good (specifically, if: humanity doesn’t commit suicide for a long time if AI is banned AND many different humane developmental paths are ex ante fine AND utility scales quite sublinearly in the amount of resources).
So, this property being upheld is arguably necessary for the future to be really good, and it is sufficient for the future to be at least meaningfully good.
Also, it is natural to request that whoever is subjecting everyone to the end of the human era preserve the option for each person to continue their AI-free life.
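As a sanity check on the “even a group of 100 people” remark above, here is a minimal sketch of the arithmetic (again a Python toy using the post’s figures, with ~8.2 billion as my assumed value for [the 2025 world population]).

```python
# Minimal sanity check of the claim that ~100 people pooling their shares
# could claim a whole galaxy. The 1/(1000 x population) share and the
# "order 10 galaxies per person" figure are the post's; the 8.2e9
# population value is my assumption.

POPULATION_2025 = 8.2e9
GALAXIES_PER_PERSON = 10

per_person_share = 1 / (1000 * POPULATION_2025)          # fraction of all accessible resources
total_galaxies = GALAXIES_PER_PERSON * POPULATION_2025   # ~8.2e10 galaxies

galaxies_per_person = per_person_share * total_galaxies  # the population cancels: 10/1000 = 0.01
people_per_galaxy = 1 / galaxies_per_person              # 100 people

print(f"per-person share of all resources: {per_person_share:.1e}")    # ~1.2e-13
print(f"per-person share in galaxies:      {galaxies_per_person:.2f}") # 0.01
print(f"people needed to pool one galaxy:  {people_per_galaxy:.0f}")   # 100
```

Note that the population cancels out: under these figures each person’s share is worth about a hundredth of a galaxy regardless of the exact population, so a group of roughly 100 people covers one galaxy.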
Here are some reasons why we should adopt this goal, i.e. consider each person to have the right to live in an AGI-free world:
Most importantly: I think it helps you think better about whether an RSI process would go well if you are actually tracking that the fooming AI will have to do some concrete big difficult extremely long-term precise humane thing, across more than trillions of years of development. It helps you remember that it is grossly insufficient to just have your AI behave nicely in familiar circumstances and to write nice essays on ethical questions. There’s a massive storm that a fragile humane thing has to weather forever.[8] The main reason I want you to keep these things in mind and so think better about development processes is this: I think there is roughly a logical fact that running an RSI process will not lead to the fragile humane thing happening[9], and I think you might be able to see this if you think more concretely and seriously about this question.
Adopting this goal entails a rejection of all-consuming forms of successionism. Like, we are saying: No, it is not fine if humans get replaced by random other guys! Not even if these random other guys are smarter than us! Not even if they are faster at burning through negentropy! Not even if there is a lot of preference-satisfaction going on! Not even if they are “sentient”! I think it would be good for all actors relevant to AI to explicitly say and track that they strongly intend to steer away from this.
That said, I think we should in principle remain open to us humans reflecting together and coming to the conclusion that this sort of thing is right and then turning the universe into a vast number of tiny structures whose preferences are really satisfied or who are feeling a lot of “happiness” or whatever. But provisionally, we should think: if it ever looks like a choice would lead to the future not having a lot of room for anything human, then that choice is probably catastrophically bad.
Adopting this also entails a rejection of more humane forms of utilitarianism that nevertheless still see humans only as cows to be milked for utility. Like, no, it is not fine if the actual current humans get killed and replaced. Not even if you create a lot of “really cool” human-like beings and have them experiencing bliss! Not even if you create a lot of new humans and have them have a bunch of fun! Not even if you create a lot of new humans from basically the 2025 distribution and give them space to live happy free lives! In general, I want us to think something like:
There is an entire particular infinite universe of {projects, activities, traditions, ways of thinking, virtues, attitudes, principles, goals, decisions} waiting to grow out of each existing person. [10] Each of these moral universes should be treated with a lot of respect, and with much more respect than the hypothetical moral universes that could grow out of other merely possible humans. Each of us would want to be respected in this way, and we should make a pact to respect each other this way, and we should seek to create a world in which we are enduringly respected this way.
These reasons apply mostly if, when thinking of RSI-ing, you are not already tracking that this fooming AI will have to do some concrete big difficult extremely long-term precise thing that deeply respects existing humans. If you are already tracking something like this, then plausibly you shouldn’t also track the property I’m suggesting.
E.g., it would be fine to instead track that each person should get a galaxy they can do whatever they want with/in. I guess I’m saying the AI-free part because it is natural to want something like that in your galaxy (so you get to live your own life in a way properly determined by you, without the immediate massive context shift coming from the presence of even “benign” highly capable AI, that could easily break everything imo), because it makes sense for many people to coordinate to move to this galaxy together (it’s just better in many mundane ways to have people around, but also your thinking and specifically valuing probably ≈need other people to work as they should), and because it is natural to ask that whoever is subjecting everyone to the end of the human era preserves the option for each person to continue an AI-free life in particular.
Here are a few criticisms of the suggestion:
a criticism: “If you run a foom, this property just isn’t going to be upheld, even if you try to uphold it. And, if you run a foom, then having the goal of upholding this property in mind when you put the foom in motion will not even make much of a relative difference in the probability the foom goes well.”
my response: I think this. I still suggest that we have this goal in mind, for the reasons given earlier.
a criticism: “If we imagine a foom being such that this option is upheld, then probably we should imagine it being such that better options are available to people as well.”
my response: I probably think this. I still suggest that we have this goal in mind, for the reasons given earlier.
[1] I think this is probably a bad term that should be deprecated
[2] well, at least if the year is <2500 and we’re not dealing with a foom of extremely philosophically competent and careful mind uploads or whatever, firstly, you shouldn’t be running a foom (except for the grand human foom we’re already in). secondly, please think more. thirdly, please try to shut down all other AGI attempts and also your lab and maybe yourself, idk in which order. but fourthly, …
[3] This will plausibly require staying ahead of humanity in capabilities in this galaxy forever, so this will be extremely capable AI. So, when I say the galaxy is AGI-free, I don’t mean that artificial generally intelligent systems are not present in the galaxy. I mean that these AIs are supposed to have no involvement in human life except for enforcing an AI ban.
[4] or like at least “their values”
[5] and assuming we aren’t currently massively overestimating the amount of resources accessible to Earth-originating creatures
[6] or maybe we do some joint control thing about which this is technically false but about which it is still pretty fair to say that each person got more of a say than if they merely controlled 1/(1000 × [the 2025 world population]) of all the resources
[7] an intuition pump: as an individual human, it seems possible to keep carefully developing for a long time without accidentally killing oneself; we just need to make society have analogues of whatever properties/structures make this possible in an individual human
[8] Btw, a pro tip for weathering the storm of crazymessactivitythoughtdevelopmenthistory: be the (generator of the) storm. I.e., continue acting and thinking and developing as humanity. Also, pulling ourselves up by our own bootstraps is based imo. Wanting to have a mommy AI think for us is pretty cringe imo.
[9] Among currently accessible RSI processes, there is one exception: it is in fact fine to have normal human development continue.
[10] Ok, really humans (should) probably importantly have lives and values together, so it would be more correct to say: there is a particular infinite contribution to human life/valuing waiting to grow out of each person. Or: when a person is lost, an important aspect of God is lost. But the simpler picture is fine for making my current point.
I’m imagining humanity fracturing into a million or billion different galaxies depending upon their exact level of desire for interacting with AI. I think the human value of the unity of humanity would be lost.
I think we need to buffer people from having to interact with AI if they don’t want to. But I value having other humans around. So something in between everyone living in perfect isolation and everyone being dragged kicking and screaming into the future is where I think we should aim.