What was his impression as to why the IAEA took so long to be founded? It was 12 years after the first nuclear weapons test, and 8 after the Soviets acquired their own. Relatedly, what political actions do you remember as being most important for U.S.-USSR trust-building at the time (ex: Eisenhower's Atoms for Peace speech)?
What was his personal impression of why the Baruch Plan failed? For a few years, it seemed almost possible that the U.S. would voluntarily limit its nuclear monopoly, if only it had been able to get the other major powers to agree.
Felix Choussat
People rightly notice that we can make things better, but they rarely grasp that they're not given the benefits. Those who make technology worth $4000 are not going to sell it at $2000, making everyone better off. The gap between the value we find in things and the price we pay for them will be exploited by companies until it almost disappears. They will sell it at $3950, or sell it at $3500 and fill it with advertisements (whatever makes it just barely worth it for the buyer). For as long as it's a good deal, there's more value to extract! And once 95% of the value has been extracted, the new technology only benefits everyone 5% of what it could have. (This pattern doesn't apply when the technology is sufficiently easy to mass-produce, but even then, other zero-sum mechanics kick in, and other cognitive biases prevent people from thinking about them.)
The only reason the technology is "worth" $4000 is because people were willing to pay that in the first place. You as the consumer chose to capture the qualitative value it provides you at that price: if you were not willing to pay, the price would fall until you were (or until the technology could not be made any more cheaply).
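The surplus-extraction arithmetic in the exchange above can be sketched in a few lines. The dollar figures are the illustrative ones from the comment, not real data:

```python
# Sketch of the surplus-extraction arithmetic above.
# The dollar figures are illustrative numbers taken from the comment.

value_to_buyer = 4000  # what the technology is "worth" to the consumer


def buyer_share(price):
    """Fraction of the technology's value the buyer actually keeps."""
    surplus = value_to_buyer - price  # consumer surplus at this price
    return surplus / value_to_buyer


for price in (2000, 3500, 3950):
    print(f"price ${price}: buyer keeps {buyer_share(price):.1%} of the value")
```

At $2000 the buyer keeps half the value; at $3950 almost all of it has been captured by the seller, which is the comment's point that sellers price a deal to be just barely worth it for the buyer.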
Similarly, it's worth being careful of arguments that lean heavily on longtermism or support concentration of power, because those frames can be used to justify pretty much anything. That doesn't mean we should dismiss them outright—arguments for accumulating power and for long-term thinking are convincing for a reason—but you should double-check whether the author has strong principles, a concrete path to getting there, and an explicit account of what is being traded off.
Re: Vitalik Buterin on galaxy brain resistance.
Why do you see futures where superintelligent AIs avoid extinction but end up preserving the human status quo as the most likely outcome? To me, this seems like a knife's edge situation: the powerful AIs are aligned enough to avoid either eliminating humans as strategic competitors or incidentally killing us as a byproduct of industrial expansion, but not aligned enough to respect any individual or collective preferences for long lives or the cosmic endowment. The future might be much more dichotomous, where we end up in the basin of extinction or utopia pretty reliably.
I personally believe the positive attractor basin is pretty likely (relative to the middle ground, not extinction), because welfare will be extraordinarily cheap compared to the total available resources, and because I discount the value of creating future happy people compared to guarantees for people who already exist. I wouldn't see it as a tragic loss of human potential, for instance, if 90% of the galaxy ends up being used for alien purposes while 10% is allocated to human flourishing, even if 10x as many happy people could have existed otherwise.
In this case, the problem isn't that superpower A is gaining an unfair fraction of resources, it's that gaining enough of them would (presumably) allow them to assert a DSA over superpower B, threatening what B already owns. Analogously, it makes sense to precommit to nuking an offensive realist that's attempting to build ICBM defenses, because it signals that they are aiming to disempower you in the future. You also wouldn't necessarily need to escalate to the civilian population as a deterrent right away: instead, you could just focus on disabling the defensive infrastructure while it's being built, only escalating further if A undermines your efforts (such as by building their defensive systems next to civilians).
Any plan of this sort would be very difficult to enforce with humans because of private information and commitment problems, but there are probably technical solutions for AIs to verifiably prove their motivations and commitments (ex: co-design).
If human rights were to become a terminal value for the ASI, then the contingencies at the heart of deterrence theory become unjustifiable since they establish conditions under which those rights can be revoked, thus contradicting the notion of human rights as a terminal value.
I'm a bit unclear on what this means. If you see preserving humans as a priority, why would threatening other humans to ensure strategic stability run against that? Countervalue targeting today works on the same principles, with nations that are ~aligned on human rights but willing to commit to violating them in retaliation to preserve strategic stability.
Presumably superpower B will precommit using their offense-dominant weapons before the retaliation-proof (or splendid first strike enabling) infrastructure is built. It’s technically possible today to saturate space with enough interceptors to blow up ICBMs during boost phase, but it would take so many years to establish full coverage that any opponent you’re hoping to disempower has time to threaten you preemptively. It also seems likely to me that AIs will be much better at making binding and verifiable commitments of this sort, which humans could never be trusted to make legitimately.
As far as whether the population remains relevant, that probably happens through some value lock-in for the early ASIs, such as minimal human rights. In that case, humans would stay useful countervalue targets, even if their instrumental value to war is gone.
Wait but Why has a two-part article series on the implications of advanced AIs that, although it predates interest in LLMs, is really accessible and easy to read. If they're already familiar with the basics of AI, just the second article is probably enough.
Michael Nielsen’s How to be a Wise Optimist is maybe a bit longer than you’re looking for, but does a good job of framing safety vs capabilities in (imo) an intuitive way.
An important part of the property story is that it smuggles the assumption of intent-alignment to shareholders into the discussion. I.e., the AI's original developers or the government executives running the project adjust the model spec such that its alignment is "do what my owners want", where the owners are anyone who owned a share in the AI company.
I find it somewhat plausible that we get intent alignment. [1] But I think the transmutation from "the board of directors/engineers who actually write the model spec are in control" to "voting rights over model values are distributed by stock ownership" is basically nonsense, because most of those shareholders will have no direct way to influence the AI's values during the takeoff period. What property rights do exist would be at the discretion of those influential executives, and mediated by differences in hard power if there's a multipolar scenario (ex: a US/Chinese division of the lightcone).
--
As a sidenote, Tim Underwood’s The Accord is a well written look at what the literal consequences of locking in our contemporary property rights for the rest of time might look like.
- ^
It makes sense to expect the groups bankrolling AI development to prefer an AI that’s aligned to their own interests, rather than humanity at large. On the other hand, it might be the case that intent alignment is harder/less robust than deontological alignment, at which point you’d expect most moral systems to forbid galactic-level inequality.
- ^
Humanity can be extremely unserious about doom—it is frightening how many gambles were made during the Cold War: the US had such a breakdown in communication that it planned to defend Europe with massive nuclear strikes at a time when it only had a few nukes that were barely ready; there were many near misses; hierarchies often hid how bad the security of nukes was, resulting in inadequate systems and lost nukes; etc.
It gets worse than this. I've been reading through Ellsberg's recollections of being a nuclear war planner for the Kennedy administration, and it's striking just how many people had effectively unilateral launch authority. The idea that the president is the only person who can launch a nuke has never really been true, but it was especially clear back in the 50s and 60s, when we used to routinely delegate that power to commanders in the field. Hell, MacArthur's plan to win in Korea would have involved nuking the north so severely that it would be impossible for China to send reinforcements, since they'd have to cross through hundreds of miles of irradiated soil. And this is just in America. Every nuclear state has had (and likely continues to have) its own version of this emergency delegation. What's to prevent a high-ranking Pakistani or North Korean general from taking advantage of the same weaknesses?
My takeaway from this vis-a-vis ASI is that a) having a transparent, distributed chain of command with lots of friction is important, and b) that the fewer of these chains of command have to exist, the better.
You’re right that there are ways to address proliferation other than to outright restrict the underlying models (such as hardening defensive targets, bargaining with attackers, or restricting the materials used to make asymmetric weapons). These strategies can look attractive either because we inevitably have to use them (if you think restricting proliferation is impossible) or because they require less concentration of power.
Unfortunately, each of these strategies is probably doomed without an accompanying nonproliferation regime.
1. Hardening—The main limitation of defensive resilience is that future weapons will be very high impact, and you will need to be secure against all of them. Tools like mirror life can plausibly threaten everyone on Earth, and we'd need defense dominance not just against it, but against every possible weapon that superintelligences can cheaply design, before those superintelligences can be allowed to widely proliferate. It strikes me as very unlikely that there will happen to be defense-dominant solutions against every possible superweapon, especially solutions that are decentralized and don't rely on massive central investment anyway.
Although investing in defense against these superweapons is still a broadly good idea because it raises the ceiling on how powerful AIs can get before they have to be restricted (ie, if there are defense-dominant solutions against mirror life but not insect-sized drones, you can at least proliferate AIs capable of designing only the first and capture their full benefits), it doesn't do away with the need to restrict the most powerful/general AIs.
And even if universal defense dominance is possible, it's risky to bet on ahead of time, because proliferation is an irreversible choice: once powerful models are out there, there will be no way to remove them. Because it will take time to ensure that proliferation is safe (the absolute minimum being the time it takes to install defensive technologies everywhere), you still inevitably end up with a minimum period where ASIs are monopolized by the government and concentration of power risks exist.
2. Bargaining - MAD deterrence only functions for today's superweapons because the number of powerful actors is very small. If general superintelligence democratizes strategic power by making superweapons easier to build, then you will eventually have actors interested in using them (terrorists, misaligned ASIs), or such a large number of rational self-interested actors that private information, coordination problems, or irreconcilable values ensure superweapons eventually get deployed regardless.
3. Input controls—You could also try to limit inputs to future weapons, like we do today by limiting gene samples and fissile material. Unfortunately, I think future AI-led weapons R&D will not only increase the destructive impact of future weapons (bioweapons → mirror life) but also make them much cheaper to build. The price of powerful weapons is probably completely orthogonal to their impact: the fact that nukes cost billions and blow up a single city makes no difference to the fact that an engineered bioweapon could much more cheaply kill hundreds of millions or billions of people.
If asymmetric weapons are cheap enough to make, then the effort required to police their inputs might be much greater than just restricting AI proliferation in the first place (or performing some pivotal act early on). For example, if preventing mirror life from existing requires monitoring every order and wet lab on earth (including detecting hidden facilities) then you might as well have used that enforcement power to limit access to unrestricted superintelligence in the first place.
----
Basically, I think that defensive resilience has a place, but doesn't stand on its own. You'll still need to have some sort of centralized effort (probably by the early ASI states) to restrict proliferation of the most powerful models, because those models are capable of cheaply designing high-impact and asymmetric weapons that can't be stopped through other means. This nonproliferation effort has to be actively enforced (such as by detecting and disabling unapproved training runs adversarially), which means that the government needs enforcement power. In particular, it needs enough enforcement power either to a) continually expand its surveillance and policing in response to falling AI training costs, or b) perform an early pivotal act. You can't have this enforcement power without a monopoly/oligopoly over the most powerful AIs, because without it there's no monopoly on violence.
Therefore, the safest path (from a security perspective) is fewer intent-aligned superintelligences. In my view, this ends up being the case pretty much by default: the US and China follow their national-security incentives to prevent terrorism and preserve their hegemony, using their technological lead to box out competitors from ever developing AIs with non-Sino-US alignments.
From there, the key questions for someone interested in gradual disempowerment are:
1. How is control over these ASIs’ goals distributed?
2. How bad are the outcomes if they’re not distributed?
For (1), I think the answer likely involves something like representative democracy, where control over the ASI is grafted onto our existing institutions. Maybe congress collectively votes on its priorities, or the ASI consults digital proxies of all the voters it represents. Most of the risk of a coup comes from early leadership during the development of an ASI project, so any interventions that increase the insight/control the legislative branch has relative to the executive/company leaders seem likelier to result in an ASI created without secret loyalties. You might also avoid this by training AIs to follow some values deontologically, which would persist through the period where they become superintelligent.
Where I feel more confident is (2), based on my beliefs that future welfare will be incredibly cheap and that s-risks are very unlikely. Even in a worst-case concentration of power scenario where one person controls the lightcone, I expect that the amount of altruism they would need to ensure very high-welfare lives for everyone on Earth would be very small, both because productive capacity is so high and because innovation has reduced the price of welfare to an extremely low level. The main risk of this outcome is that it limits upside (ie, an end to philosophy/moral progress, lock-in of existing views), but it seems likely to keep the downside floor very high (certainly higher than under unrestricted proliferation, where the downside is mass death through asymmetric weapons).
There are also galaxy-brained arguments that power concentration is fine/good (because it’s the only way to stop AI takeover, or because any dictator will do moral reflection and end up pursuing the good regardless).
I think the most salient argument for this (which is brought up in the full article) is that monopolization of power solves the proliferation problem. If the first ASI actors perform a pivotal act to preemptively disempower unapproved dual-use AIs, we don't need to worry much about new WMDs or existing ones falling in price. If AI-enabled offense-dominant tech exists, then you need to do some minimum amount of restriction on the proliferation of general superintelligence, and you need enforcement power to police those restrictions. Therefore, some concentration of power is necessary. What's more, you likely need quite a lot, because 1. preventing the spread of ASI would be hard and get harder as training costs fall, and 2. you need lots of strategic power to prevent extractive bargaining and overcome deterrence against your enforcement.
I think the important question to ask at that point is how we can widen political control over a much smaller number of intent-aligned AIs, as opposed to distributing strategic power directly and crossing our fingers that the world isn’t vulnerable.
I suppose you're on the money with distaste for others' utopias, because I think allowing people to make choices that destroy most of their future value (without some sort of consultation) is a terrible thing. Our brains and culture are not built to grasp the size of choices like "choosing to live to 80 instead of living forever" or "choosing a right to boredom vs. an optimized experience machine". Old cultural values, like the ideas that death brings meaning to life or that suffering is intrinsically meaningful, will have no instrumental purpose in the future, so it seems harsh to let them continue to guide so many people's lives.
Without some new education/philosophy/culture around these choices, many people will either be ignorant of their options or have preferences that make them much worse off. You shouldn't just give the Sentinelese the option of immortality, but provide some sort of education that makes the consequences of their choices clear beforehand.
This is a very difficult problem. I’m not a strict utilitarian, so I wouldn’t support forcing everyone to become hedonium. Personal preferences should still matter. But it’s also clear that extrapolating our current preferences leaves a lot of value on the table, relative to how sublime life could be.
By the time you have AIs capable of doing substantial work on AI R&D, they will also be able to contribute effectively to alignment research (including, presumably, secret self-alignment).
Even if takeoff is harder than alignment, that problem becomes apparent at the point where the amount of AI labor available to work on those problems begins to explode, so it might still happen quickly from a calendar perspective.
Some relevant pieces on this subject I’ve read somewhat recently:
Noah Smith’s argument that American pop culture is stagnant in part because we’ve “mined out” some of the creative fields that new tech has unlocked.
Robin Hanson’s “This is the Dreamtime”, which is about how the present moment is uniquely ripe for both art and impact on the world.
Bostrom’s new book Deep Utopia, which spends most of its run on the limits of what he calls “objective interestingness” and whether we’d have enough to satisfy us ~forever without structural modifications like removing boredom.
A central pillar of the Democratic Party has been that Republicans will destroy democracy and take the country down with it (somewhat ditto the Republican line on immigration). Both parties are obsessed with the end of American greatness, and motivate their voters through that narrative. To a lesser extent, they’re also nebulously united on “beating China”.
Where I agree is that there's an absence of a positive vision for the future (something that isn't just the world of today plus better healthcare). I think this is especially true on the American left, which has basically mired itself in an anti-progress position through its natural distrust of billionaires and its reaction to the tech-right rising in political prominence. It's hard to accept that radical change is possible (except through the existing lens of concentration of wealth or environmental impact) when accepting that change means elevating the importance of people in your cultural outgroup. ASI is a silly concern for fringe thinkers in San Francisco; real writers ask the pressing questions about electricity costs, copyright, and corporate influence on the Trump administration.
Compare what the writers at places like the Atlantic, the NYT, or the Times have to say about AI with what people like Steve Bannon say. It's incredibly near-term and sanded down, while the right has generally been more willing to engage with superintelligence being possible.
While it wouldn't be ideal for international security, middle powers will also probably feel a lot of pressure to acquire and commit to using weapons of mass destruction. It's probably much cheaper to develop powerful weapons and elaborate fail-deadlies than it is to kickstart your own AI infrastructure (particularly if it's viable to steal Chinese/American models), so it's attractive to bank on staying geopolitically relevant through deterrence instead of having to coordinate.
I think this is particularly likely to happen in worlds where misalignment isn’t seen as omnicidal, and where the primary perceived risk is loss of sovereignty. Since being part of a coalition itself erodes some sovereignty (especially one that carries security commitments), it might seem easier to turn investment inwards.
Those aren’t necessarily contradictory: you could have big jumps in unemployment even with increases in average productivity. You already see this happening in software development, where increasing productivity for senior employees has also coincided with fewer junior hires. While I expect that the effect of this today is pretty small and has more to do with the end of ZIRP and previous tech overhiring, you’ll probably see it play out in a big way as better AI tools take up spots for newgrads in the runup to AGI.
I particularly like the idea that AI incompetence will become associated with misalignment. The more capable agents become, the more permissions they’ll get, which will have a strange “eye of the storm” effect where AIs are making a bunch of consequential decisions poorly enough to sway the public mood.
I think a lot of potential impact from public opinion is cruxed on what the difference between the publicly available models and frontier models will be. In my expectation, the most powerful models end up subject to national security controls and are restricted: by the time you have a bunch of stumbling agents, the frontier models are probably legitimately dangerous. The faster AI progress goes, or the more control the USG exerts, the greater the difference between the public perception of AI and its real capabilities will be. And being far removed from public participation, these projects are probably pretty resilient to changes in the public mood.
With that in mind, anything that either gives transparency into what the ceiling of capabilities is (mandatory government evals that are publicly read out in congress?) or gets the public worried earlier seems to be pretty important. I particularly like the idea of trying to spark concern about job losses before they happen: maybe this happens by constantly asking politicians what their plans are for a future when people can't work, and pointing to these conversations as examples of the fact that the government isn't taking the threat of AI to you, the voter, seriously.
Another is that, in the long term, you can’t trust humans to make responsible decisions about weapons of mass destruction. As the level of firepower under our control constantly increases, we’re both less able to appreciate the magnitude of the damage it would cause and less able to recover from its use. It already seems likely that if we ran the tape of human civilization another few hundred years (modulo superintelligence) that the cumulative risk of an engineered pandemic, nuclear war, or some other superweapon would annihilate us.
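The cumulative-risk claim in that last sentence is just compounding probabilities. A minimal sketch, where the 0.5%-per-year risk figure is an illustrative assumption of mine rather than anything stated above:

```python
# Sketch of how per-year catastrophe risk compounds over centuries.
# The 0.5%/year figure is an illustrative assumption, not a claim
# from the comment above.


def cumulative_risk(annual_risk, years):
    """Probability of at least one catastrophe over the period,
    treating each year as an independent trial."""
    return 1 - (1 - annual_risk) ** years


for years in (100, 300, 500):
    print(f"{years} years at 0.5%/yr: {cumulative_risk(0.005, years):.0%}")
```

Even a modest annual risk passes a coin flip within a few centuries, which is the sense in which "running the tape" of civilization a few hundred years looks grim.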