Warning Aliens About the Dangerous AI We Might Create
Thesis: We should broadcast a warning to potential extraterrestrial listeners that Earth might soon spawn an unfriendly computer superintelligence. Sending the message might benefit humanity.
If we were to create an unaligned computer superintelligence, it would likely expand through the universe as quickly as possible. The fastest way would not be by ships but, as Turchin has described, by using radio telescopes to send malicious signals to alien civilizations. Ideally, from the AI’s point of view, the signal would be a virus that allows a full download of Earth’s AI, which would then take over the receiving civilization.
While we still control our radio transmitters, we could send a warning to any alien civilizations that might be listening. The message would explain that Earth may soon undergo a major transition as we develop advanced artificial intelligence, and that some future communications from Earth could be dangerous: a misaligned superintelligence could send signals containing instructions or code that, if followed, would replicate it and destroy the receiving civilization. The warning should therefore advise aliens never to execute, compile, or build anything based on messages originating from Earth until they have somehow verified that Earth is not controlled by a malicious AI. Sending such a message might also help us.
One possible solution to the Fermi paradox is that aliens are already observing us but keeping their presence mostly hidden. If so, they are unlikely to allow us to create a paperclip maximizer that could consume the galaxy. It is unclear whether they would stop us through peaceful means or total annihilation. Sending a warning message, while far less effective than preventing the creation of an unaligned superintelligence in the first place, would still be better than doing nothing and might increase the chances that any watching aliens treat us well.
Furthermore, sending the message might slightly reduce the chance of an AI takeover. Suppose an unaligned AI is weighing whether to seize control of Earth. It knows that attempting a takeover carries some risk of destruction. The faster it could expand if successful, the more risk it would be willing to accept. By sending warnings that would slow its potential expansion, we would reduce the expected payoff of a takeover, making such an attempt slightly less likely.
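To make that tradeoff concrete, here is a minimal toy model of the AI's decision (a sketch only; the function, names, and numbers below are illustrative assumptions, not estimates of real parameters):

```python
# Toy model: an AI weighs attempting a takeover against waiting.
# Every number here is an illustrative assumption, not an estimate.

def takeover_expected_value(p_destroyed, expansion_rate, value_per_unit_speed):
    """Expected payoff of attempting a takeover.

    p_destroyed: chance the attempt fails and the AI is destroyed
    expansion_rate: how fast the AI could spread if the attempt succeeds
    value_per_unit_speed: resources captured per unit of expansion speed
    """
    payoff_if_success = expansion_rate * value_per_unit_speed
    return (1 - p_destroyed) * payoff_if_success

value_of_waiting = 50.0   # payoff of staying cooperative for now (assumed)
p_destroyed = 0.2         # risk that a takeover attempt ends in destruction (assumed)

# Without a prior warning, the AI could expand quickly via malicious signals.
ev_fast = takeover_expected_value(p_destroyed, expansion_rate=10.0, value_per_unit_speed=10.0)

# A prior warning makes signal-based expansion less effective, so expansion is slower.
ev_slow = takeover_expected_value(p_destroyed, expansion_rate=6.0, value_per_unit_speed=10.0)

print(ev_fast > value_of_waiting)  # True: fast expansion makes the takeover attempt attractive
print(ev_slow > value_of_waiting)  # False: slower expansion tips the balance toward waiting
```

The only point of the sketch is that, holding the risk of destruction fixed, anything that lowers the payoff from rapid expansion shrinks the range of situations in which attempting a takeover beats waiting.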
The message could also shape how a paperclip maximizer treats us if it does take over. It could include a clear rule for alien recipients: treat an AI more favorably if it has treated its creators well, and we will do the same for any AI they create. This establishes a principle of reciprocity that an unaligned AI should understand. If cooperation with its creators increases its chances of favorable treatment by other civilizations, the AI gains a practical incentive to preserve humanity and provide decent lives for us. At a minimum, the message would become part of the AI’s training data and might cause a future superintelligence to give more weight to the existence of biologically evolved aliens when making decisions about how to treat humanity.
To other humans, the message would signal that we are close to creating something that could wipe us out. It would make the danger concrete, showing that our technology may soon escape control and threaten not just humanity but every civilization within reach of our signals.
A cost of sending the warning is reduced credibility if we later achieve aligned artificial intelligence. Aliens that receive the message may treat all future signals from Earth with suspicion. But not sending the warning also carries a cost, since silence can be read as selfish concealment, leaving the game-theoretic calculations of this first contact ambiguous.
Another cost is that advocating for such a signal would seem strange to most people on Earth. Displaying such perceived weirdness could damage the credibility of the AI safety movement, even if it does convey our sense of urgency.
Project implementation could be straightforward, requiring no new infrastructure. We could repurpose existing assets, such as the powerful transmitters in the Deep Space Network or facilities similar to the one at Arecibo, which are already designed for interstellar signaling. The broadcast could be scheduled during operational lulls, minimizing disruption and cost. We would direct a short, repeating digital message toward nearby stars thought to have habitable planets. Transmitting this warning in multiple formats would maximize the probability that any advanced civilization can detect, receive, and comprehend the signal. Once transmission is feasible, the next question is where to aim.
It may be reasonable to target more remote alien civilizations first. Aliens around nearby stars probably already know our situation and are likely observing us. The farther away aliens are, the more they would benefit from a warning that Earth might soon create a paperclip maximizer, because of the larger lag between their receiving our message and their encounter with Earth’s paperclip maximizer. We could target clusters of Sun-like stars in the Andromeda galaxy, particularly around the midpoint between its core and edge. Targeting distant stars also delays when a misaligned AI could obtain evidence about the absence of aliens: it must wait the round-trip light travel time before concluding that no one else exists, which lowers the short-term payoff of destroying us. One proposed system, known as CosmicOS, defines a compact artificial language intended to be understood by any civilization with physics and computing. Another option is gravitational lensing, aligning transmissions with the gravitational fields of massive objects to increase range and clarity; exploiting the Sun’s focal region would require positioning a transmitter far from the Sun.
Some might argue that sending any interstellar message risks revealing our location in a “dark forest” universe filled with hostile civilizations. That fear is very likely misplaced. Any society capable of harming us almost certainly already knows that Earth hosts life, since our atmosphere has displayed the chemical signs of biology for hundreds of millions of years. By the time any civilization detects a warning message and can respond, we will almost certainly have created a superintelligence of our own, far more capable of defending or representing us than we are now.
Instead of fearing the dark forest, we might paradoxically help create its reverse by warning others about the danger of listening. In this reverse dark forest, civilizations remain mostly silent, not out of fear of attack, but to increase uncertainty for potentially misaligned artificial intelligences. That uncertainty functions as a subtle alignment mechanism, discouraging reckless expansion. By sending a warning that advises others to stay cautious, we contribute to a universe where silence itself becomes a stabilizing norm, reducing the incentive for dangerous AIs to act aggressively and making the galaxy safer overall.
Normally, we should avoid alien signals entirely, but the logic changes if we are already near creating an unfriendly superintelligence. If we expect to create a paperclip maximizer ourselves, then listening becomes a plausible Hail Mary. As Paul Christiano argues, under this assumption, if the aliens built a misaligned AI we are doomed regardless, but if they succeeded at alignment, their message might offer the only way to avoid our own extinction. From behind a veil of ignorance, we might rationally prefer their friendly AI to dominate Earth rather than be destroyed by our own. In that case, the expected value of listening turns positive.
If our reality is a computer simulation, sending the signal might decrease the chance of the simulation soon being turned off. Simulations might tend to preserve branches with interesting developments, and alien contact is among the most interesting possible. As argued in Our Reality: A Simulation Run by a Paperclip Maximizer, branches generating novel outcomes are more likely to be explored. A world where humans send warnings to aliens is more engaging than one that ends quietly, so the act of sending might raise the odds that the simulation continues.
If the singularity is indeed near and will be the most important event in history, we should wonder why we happen to be alive near its unfolding. One anthropic solution is that most of history is fake, and this is a simulation designed to see how the singularity turns out. Sending the message to aliens potentially postpones when the singularity is resolved, in part because AIs might be more inclined to wait to decide how to treat us until they figure out if aliens have received the message.
In racing to develop artificial superintelligence, humanity is not merely gambling with its own survival and the fate of Earth’s biosphere. If life is common but superintelligence is rare, we are wagering the future of every living world within our region of the cosmos. Allowing an unaligned AI to emerge and expand outwards in the universe could be a moral catastrophe trillions of times worse than anything humans have previously done. From any utilitarian perspective, this potential outcome imposes on us a clear and urgent duty to mitigate the risk in any way we can.
There is an extremely short period during which aliens as stupid as us would benefit at all from this warning. In humanity’s case, there are only a couple of centuries between when we can send and detect radio signals and when we either destroy ourselves or perhaps get a little wiser. Aliens cannot be remotely common, or the galaxies would be full and we would find ourselves at an earlier period when those galaxies were not yet full. The chance that any one of these signals helps anyone close enough to decode them at all is nearly zero.
I agree that the probability that any given message is received at the right time by a civilization that can both decode it and benefit from it is extremely low, but the upside is enormous and the cost of broadcasting is tiny, so a simple expected value calculation may still favor sending many such messages. If this is a simulation, the relevant probabilities may shift because the designers may care about game balance rather than our naive astrophysical prior beliefs. The persistent strangeness of the Fermi paradox should also make us cautious about assigning extremely small probabilities to any particular resolution. Anthropic reasoning should push us toward thinking that the situation humanity is in is more common than we might otherwise expect. Finally, if we are going to send any deliberate interstellar signal at all, then there is a strong argument that it should be the kind of warning this post proposes.
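As a minimal sketch of that expected value argument (the probability, value, and cost figures below are placeholders chosen only to show the structure of the calculation, not estimates):

```python
# Back-of-the-envelope expected value of broadcasting, in arbitrary common units.
p_received_and_useful = 1e-9   # chance a message reaches a civilization that can decode it and benefit
value_if_useful = 1e18         # stand-in for the value of saving a civilization
cost_per_broadcast = 1e6       # rough cost of telescope time for one broadcast

expected_value = p_received_and_useful * value_if_useful - cost_per_broadcast
print(expected_value > 0)  # True under these assumptions: a huge upside can outweigh a tiny probability
```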
We can not only warn them, but request help. In that case, their age is not a problem—but distance is.
Also, expose our future AI as potentially misaligned, so it would have to double down on pretending to be aligned.
I think this is a promising strategy that deserves more investigation. Your game theory analysis of dark forest-type situations is particularly compelling; thank you for sharing it. I have two main questions: (1) to what extent is this technically feasible, and (2) how politically costly would the weirdness of the proposal be?
For technical feasibility, I was very surprised to hear you suggest targeting the Andromeda Galaxy. I agree that in principle the nearest stars are more likely to already have whatever data they might want about Earth, but I think of “the nearest stars” as being within 50 light-years or so, not as including the entire Milky Way. Can you explain why you think we’d be able to send any message at all to the Andromeda Galaxy in the next few years, or why an alien civilization 1,000 light-years away in a different part of the Milky Way would most likely be able to passively gather enough data on Earth to draw their own conclusions about us without the need for a warning?
The other part of the technical feasibility question is whether constructed languages like CosmicOS actually work. Has anyone done testing to see whether, e.g., physicists with no prior exposure to the language and no reference guides are able to successfully decipher messages in CosmicOS?
Politically, I’d like to see focus groups and polling on the proposal. Does the general American public approve or disapprove of such warnings? Do they think it’s important or unimportant? What about astronomers, or Congressional staffers, or NASA employees? Yes, this is a weird idea, but the details could turn out to matter in terms of whether it’s so weird that there’s a high risk of burning significant amounts of credibility for the AI safety movement as a whole.
As I remember, even small radio telescopes (used as transmitters) can be used for METI if focused on an exact star. The distance can always be compensated by the assumption that an advanced civilization has larger receivers, so sending to Andromeda is still feasible. I don’t know of any test of CosmicOS, but human hackers were able to break much more complex codes.
The whole thing can be done unilaterally, maybe even with a privately owned radio transmitter. Previously, METI enthusiasts tried to use small transmitters. But the fact of transmission has to be public so it will have a restraining effect on the future AI.
If we start discussing it, we will not reach consensus for many years.
Yeah, but have you done a back of the envelope calculation here, or has anyone else? What size target could we hit in the Andromeda galaxy using, e.g., $50 million at our current tech levels, and how long could we transmit for? How large of a receiver would that target need to have pointing toward us in order to receive the message with anything like reasonable fidelity? If our message is focused no more tightly than on “a star,” then would the receivers need an antenna the size of a solar system to pick it up? If not, why not?
I’m not sure codebreaking is a reasonable test of a supposedly universal language. A coded message has some content that you know would make sense if the code can be broken. By contrast, a CosmicOS message might or might not have any content that anyone else would be able to absorb. Consider the difference between, e.g., a Chinese transmission sent in the clear, and an English transmission sent using sophisticated encryption. If you’re an English speaker who’s never been exposed to even the concept of a logographic writing system, then it’s not obvious to me that it will be easier to make sense of the plaintext Chinese message than the encrypted English message. I think we should test that hypothesis before we invest in an enormous transmitter.
I’m not sure what your comment “if we start discussing it, we will not reach consensus for many years” implies about your interest in this conversation. If you don’t see a discussion on this topic as valuable, that’s fine, and I won’t take up any more of your time.
I think that METI theorists have such calculations.
I analyzed (in relation to SETI risk) ways to send self-evident data and concluded that the best starting point is to send two-dimensional images encoded in a way similar to an old-school TV signal.
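To illustrate that kind of encoding (a hypothetical sketch, not the commenter’s actual scheme), a two-dimensional image can be sent as a flat bitstream whose length is the product of two primes, the trick used by the 1974 Arecibo message (1679 = 23 × 73 bits), so that a recipient can recover the raster dimensions by factoring the length:

```python
# Sketch: encode a small 2-D image as a flat bitstream whose length is a product
# of two primes, so a receiver can recover the raster dimensions by factoring it.
# The image content here is a placeholder pattern, not a proposed message.

WIDTH, HEIGHT = 23, 73          # both prime, as in the 1974 Arecibo message (23 * 73 = 1679 bits)

image = [[0] * WIDTH for _ in range(HEIGHT)]
for x in range(WIDTH):          # placeholder pattern: a single horizontal bar
    image[HEIGHT // 2][x] = 1

bits = [pixel for row in image for pixel in row]   # row-major scan, like a TV raster
assert len(bits) == WIDTH * HEIGHT == 1679

# A receiver that factors 1679 into 23 * 73 can try both orientations
# and keep whichever raster produces a structured picture.
decoded = [bits[i * WIDTH:(i + 1) * WIDTH] for i in range(HEIGHT)]
assert decoded == image
```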
No, that was just a statement of fact: the discussion of METI risks has continued for decades, and the positions of opponents and proponents are entrenched.
If I remember right, the present received wisdom is that if you succeed in sending a message like that, you’re inviting somebody to wipe you out. So you may get active opposition.
Yes. So here is the choice between two theories of existential risk. One is that no dangerous AI is possible and that aliens are near and slow; in that case, METI is dangerous. The other is that superintelligent AI is possible soon, presents the main risk, and aliens are far away. The choice boils down to the discussion about AI risk in general.
> assumption that an advanced civilization has larger receivers
If they are more advanced than us, wouldn’t they either have aligned AI or be AI? In that case, I’m not sure what warning them about our possible AI would do for them?
Request help, and also expose our future AI as potentially misaligned, so it would have to double down on pretending to be aligned.
Consider cosmic distances. How much time do we really buy anyone if we build superintelligence within decades? A civilization big enough for reaction time to matter is already pretty capable, and the odds don’t improve much for them. On the other hand, the risk of a strong, directed signal beacon is large. We expose ourselves AND undermine ourselves if we make it through.
EDIT: I disagree with logic that signal would slow down an expansive Earth-bound SI. Might as well make it hurry up.
Could be a lot of time. The Andromeda galaxy is 2.5 million light-years away from Earth. Say an AI takes over next year and sends a virus to a civilization in that galaxy that would successfully take it over if humans didn’t first issue a warning. Because of the warning, the Earth paperclip maximizer has to send a ship to the Andromeda civilization to take it over, and say the ship goes at 90% of the speed of light. That gives the Andromeda civilization 280,000 years between when they get humanity’s warning message and when the paperclip maximizer’s ship arrives. During that time, the Andromeda civilization will hopefully upgrade its defenses to be strong enough to resist the ship, and then thank humanity by avenging us if the paperclip maximizer has exterminated humanity.
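Spelling out the arithmetic behind that figure, with $d = 2.5 \times 10^6$ light-years and a ship speed of $0.9c$:

$$\Delta t = \frac{d}{0.9c} - \frac{d}{c} \approx 2.78 \times 10^6 \text{ yr} - 2.50 \times 10^6 \text{ yr} \approx 2.8 \times 10^5 \text{ yr} \approx 280{,}000 \text{ years}.$$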
That’s about one round trip across the Andromeda galaxy at the speed of light.
It’s not really like that. It’s more like two expanding spheres at sub-lightspeed inside wider spheres of info at lightspeed.
The bad AI sphere(s) meets the good AI sphere(s) somewhere. If they meet deep inside the Andromeda galaxy, bad AI already has some clear advantage.
Or did I go wrong somewhere? These are complex assumptions.
Our warning message can be received at many points “simultaneously,” so they don’t need to spend more time exchanging information across the Andromeda galaxy and can start preparing locally.
The message we send goes at the speed of light. If the AI has to send ships to conquer it probably has to go slower than the speed of light.
This video explains it faster:
https://youtu.be/fVrUNuADkHI?si=cpiFri1vmoaKeaXy
The aliens have not yet considered creating an artificial mind. Our signal informs them of the possibility. They outpace us.
People were too busy worrying about whether China or America would win the race to see Sirius sneaking up on us.
Given the noise floor, how much energy do you think it would take to send that in an omnidirectional broadcast, while still making it narrowband enough to obviously be a signal?
Doesn’t need to be omnidirectional. Focus on the most promising locations, like nearby stars, our galactic center, and Andromeda’s most suitable parts.
OK, that gets you something. But suppose that you had a twin at Proxima Centauri, with the same tech level as we have. Could you send a message that your twin could receive? One big enough to carry the information in question here? How long would it take, and how much money would each of you have to invest in the equipment?
As I understand it, we’re getting pretty good at sensing small signals these days, and we still find it challenging to notice entire planets. Scaling up obviously helps, but the cost scales right along with the capability. You can say, as you do elsewhere, that “an advanced civilization has larger receivers”, but why would they waste resources on building such receivers?
I asked an AI about it, and it told me that a large radio telescope may suffice. However, the main uncertainty is the receiving equipment. If they are on Proxima, they suspect that there is life near the Sun, so constant observation is possible, but the size of the receiver depends on the civilization’s Kardashev level.
Advanced civilizations will have larger receiving dishes, maybe the size of Dyson spheres, but such civilizations are farther away (or they would already be here).
Therefore, the ratio of distance to receiver size is approximately constant.
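A rough link budget supports that scaling (a back-of-the-envelope sketch only; the transmitter parameters are assumptions loosely modeled on an Arecibo-class planetary radar, and the required signal-to-noise ratio, bandwidth, and system temperature are also assumed):

```python
import math

# For two dish antennas, received power is P_r = P_t * G_t * G_r * (wavelength / (4*pi*d))**2.
# With receiver gain G_r = eta_r * (pi * D_r / wavelength)**2 the wavelength cancels, giving
# P_r = P_t * G_t * eta_r * D_r**2 / (16 * d**2), so the receiver diameter needed for a
# fixed signal-to-noise ratio grows linearly with distance.

k_B = 1.380649e-23    # Boltzmann constant, J/K
LY = 9.4607e15        # one light-year in meters

P_t = 1e6             # transmitter power, W (Arecibo-class planetary radar, assumed)
G_t = 4e7             # transmitter dish gain, ~76 dBi for a 300 m class dish at S-band (assumed)
eta_r = 0.7           # receiver aperture efficiency (assumed)
T_sys = 25.0          # receiver system temperature, K (assumed)
bandwidth = 1.0       # narrowband beacon, Hz (assumed)
snr_needed = 10.0     # detection threshold (assumed)

def required_receiver_diameter(distance_m):
    """Receiver dish diameter (m) needed to detect the beacon at snr_needed."""
    noise_power = k_B * T_sys * bandwidth
    return 4.0 * distance_m * math.sqrt(snr_needed * noise_power / (P_t * G_t * eta_r))

for name, d_ly in [("Proxima Centauri", 4.24), ("1,000 light-years", 1e3), ("Andromeda", 2.5e6)]:
    print(f"{name}: ~{required_receiver_diameter(d_ly * LY):,.0f} m dish")

# Under these assumptions: a few meters at Proxima, roughly 400 m at 1,000 light-years,
# and on the order of 1,000 km for Andromeda, i.e. a megastructure-scale receiver.
```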