Adam Scholl, Researcher at Missing Measures
I’m glad to hear it; it seems to me very relevant for thinking sensibly about strategy here. I mean just that I personally have noticed little mention of this possibility in the discourse about pausing/halting AI that I have encountered—e.g., that surrounding Nate and Eliezer’s book (though the book itself does mention it!). Perhaps this suggests merely that I have been sampling badly, e.g. from dumb corners of this discourse.
I think the case for banning AI data centers seems strong—it would likely buy us additional years, perhaps even decades, to solve the problem! I expect this would greatly increase our odds of survival/flourishing, even if such bans seem unlikely to prevent the creation of superintelligence indefinitely.
I think it is conceivable that humanity could learn to somehow screen enough of the computations occurring on each laptop, enough of the thoughts occurring in each brain, etc., for danger in a way that is not obviously more dystopian/rights-violating than the IAEA. I do not, however, currently consider it obvious that this is possible, nor that humanity will try to (and succeed at) figuring out how in time; and on my models this is roughly the degree of screening that seems likely necessary to prevent the creation of superintelligence for centuries, rather than years or low decades.
I think it is probably possible in principle to train superintelligence on a laptop, and I worry that this inconvenient fact is often elided in discourse about halting AI. It is extremely helpful that for now, AI training is so absurdly inefficient that non-proliferation strategies roughly as light-touch as the IAEA—e.g., bans on AI data centers, or powerful GPUs—might suffice to seriously slow AI progress. And I think humanity would be foolish not to take advantage of this relatively cheap temporary opportunity to slow AI progress, so that we can buy as much time as possible to figure out how to improve our chances of surviving the creation of superintelligence. But I do think superintelligence is likely to be created eventually regardless, at least absent non-proliferation regimes drastically more costly/invasive than the IAEA; relatedly, I do expect that the long-term survival of life will still probably require solving the alignment problem eventually.
It seems to me a meaningfully open question whether automating all human labor will end up net benefiting humans, even assuming we survive; of course it might, but I think much more dystopian outcomes also seem plausible. Markets tend to benefit humans because the price signals we send tend to correlate with our relative needs, and hence with our welfare; I think it is not obvious that this correlation will persist once humans become unable to generate economic value.
Senator Bernie Sanders is planning to introduce legislation that would ban the construction of new AI data centers. You can find his video announcement here, and here is the transcript:
Thanks very much for joining me. I will soon be introducing legislation calling for a moratorium on the construction of new data centers.
Now, as a result, I’ve been called a Luddite, anti-innovation, anti-progress, pro-Chinese, among many other things. So why am I doing that? Why am I calling for a moratorium on the construction of new data centers?
Bottom line: We are at the beginning of the most profound technological revolution in world history. That’s the truth. This is a revolution which will bring unimaginable changes to our world. This is a revolution which will impact our economy with massive job displacement. It will threaten our democratic institutions. It will impact our emotional well-being, and what it even means to be a human being. It will impact how we educate and raise our kids. It will impact the nature of warfare, something we are seeing right now in Iran.
Further, and frighteningly, some very knowledgeable people fear that what was once seen as science fiction could soon become a reality—and that is that superintelligent AI could become smarter than human beings, could become independent of human control, and pose an existential threat to the entire human race. In other words, human beings could actually lose control over the planet.
And in the midst of all of that, all of this transformative change, what I have to tell you is that the United States Congress hasn’t a clue, not a clue, as to how to respond to these revolutionary technologies and protect the American people. And it’s not only not having a clue, they’re out busy raising money all day long from AI and their super PACs, which is a whole other problem.
As many of you know, the AI revolution is being pushed by the wealthiest people in our country, including Elon Musk, Jeff Bezos, Larry Ellison, Mark Zuckerberg, Peter Thiel, and others. All of these people are multi-billionaires who, if they are successful at AI, will become even richer and more powerful than they are today.
What I want to do now is not tell you my fears regarding AI and robotics. I want you to actually hear from them, the billionaires who are pushing these technologies. Listen carefully to what they are saying.
Elon Musk, wealthiest person alive, stated that quote, “AI and robots will replace all jobs.” All jobs. “Working will be optional.” End of quote.
Dario Amodei, the CEO of Anthropic, predicted that quote, “AI could displace half of all entry-level white collar jobs in the next 1 to 5 years.” And that quote, “Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.” End quote. That’s Amodei.
According to Demis Hassabis, the head of Google’s DeepMind—this is Google’s DeepMind—the AI revolution will be 10 times bigger than the industrial revolution, and 10 times faster. All right, you got that? That means it will have a 100 times greater impact on society than the industrial revolution had.
Jeff Bezos, the fourth richest person in the world, has been pushing his staff for years to think big and envision what it would take for Amazon, which he owns, to fully automate its operations and replace at least 600,000 warehouse workers with robots. 600,000 jobs gone. Robots doing the work.
Bill Gates, also one of the wealthiest people on Earth, predicted that humans, quote, “won’t be needed for most things,” end quote, such as manufacturing products, delivering packages, or growing food over the next decade, due to artificial intelligence.
Mustafa Suleyman, the CEO of Microsoft AI, said most white-collar work quote, “will be fully automated by an AI within the next 12 to 18 months” end quote.
Jim Farley, the CEO of Ford, predicted that AI will eliminate quote, “nearly half, literally half, of all white-collar jobs in the US” end quote, within the next decade.
I want you to hear this one. Larry Ellison—also one of the richest people on Earth, and a major investor in AI—said that there will be an artificial intelligence-powered surveillance state where, quote, “citizens will be on their best behavior because we’re constantly recording and reporting everything that is going on.” End quote.
Dr. Geoffrey Hinton, considered to be the “godfather of AI,” believes there is a quote “10% to 20% chance for AI to wipe us out.” End quote.
Mark Zuckerberg, the fifth richest person in the world, is building a data center in the state of Louisiana—a data center that is the size of Manhattan, and will use three times the quantity of electricity that the entire city of New Orleans uses every year.
All right. Now, for many years now, leading experts have called for regulation and reasonable pauses to the development of artificial intelligence, to ensure the safety—the very safety—of humanity. Let’s go back to our good friend Elon Musk. He said back in 2018, quote—this is Elon Musk—“Mark my words, AI is far more dangerous than nukes. So why do we have no regulatory oversight? This is insane.” End quote, Elon Musk.
In March of 2023, over 1,000 business leaders in the big tech industry, prominent scientists, AI researchers, and academics co-signed an open letter entitled, quote, “Pause Giant AI Experiments” end quote, stating,
“We must ask ourselves: should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete, and replace us? Should we risk loss of control of our civilization?

“Such decisions must not be delegated to unelected tech leaders. Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4; this pause should be public and verifiable and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.” End of quote.
That is what some of the leaders in the AI industry have said. And clearly, where we are right now is that there has not been any pause. There have been massive amounts of competition between one company and another, and between the United States and China. So the bottom line is that, in my view, to protect our workers from losing their jobs, to protect human beings from attacks on their mental health, to protect our kids, to protect the safety of human life: yeah, we need a moratorium on data centers. We need to take a deep breath. We need to make sure that AI and robotics work for all of us, not just a handful of billionaires. Thanks very much.
I believe deeply in the existential importance of using AI to defend the United States and other democracies, and to defeat our autocratic adversaries.
Anthropic has therefore worked proactively to deploy our models to the Department of War
In my view this seems like a terrible idea, since we don’t understand nearly enough about what might end up causing misalignment in practice to be confident that explicitly training AI to aid in the killing of humans might not somehow increase the probability that it ends up interested in the killing of humans more generally.[1] I suppose I do think it seems even worse to train AI in both killing and mass surveillance, but I expect the badness of the former likely swamps the EV of refraining from the latter here.
[1] I don’t put much stock in the notion of AI personas, personally, but insofar as you do I think you should probably be especially worried about this.
Curious how bright it was, i.e. what the lux was in the room?
the standard for what a blog is like
Note though that the reference class “blog” is only partially apt. For example, some authors publish on LessWrong in the course of attempting to make or propagate serious intellectual progress, which is a rare aim among bloggers. It seems to me LessWrong’s design has historically been unusually conducive to this rare aim, and personally, this is the main reason I hope and plan to publish more here in the future (and why I’d feel far less excited about publishing on Medium or Substack or other platforms formatted like standard blogs).
my gut says pretty strongly that you and Adam are erring way too far in the not-publishing direction, and like, I would pay money for you to publish more.
I am interested in debating the principle here (e.g. whether it sometimes makes sense to write books, whether/why most scientific progress so far has involved writing books, etc), but I feel less interested in debating your gut take on the tradeoffs Aysja and I are making personally, since I expect you know nearly nothing about what those are? Most obviously, the dominant term has been illness rather than choices, but I expect you also have near-zero context on the choices, which we have spent really a lot of time and effort considering. I would… I guess be up for describing those in person, if you want.
Yep, definitely! The reason why these are big tomes is IMO largely downstream of the distribution methods at the time.
What distribution differences do you mean? Kepler and Bacon lived before academic journals, but I think all the others could easily have published papers; indeed Newton, Darwin and Maxwell published many, and while Carnot didn’t, many around him did, so he would have known it was an option.
It seems more likely to me that they chose to write up these ideas as books rather than papers simply because the ideas were more “book-sized” than “paper-sized,” i.e. because they were trying to discover and describe a complicated cluster of related ideas that was inferentially far from existing understanding, and this tends to be hard to do briefly.
I think that is, for most forms of intellectual progress, a better way of developing both ideas and pedagogical content knowledge
It sounds like you’re imagining that the process of writing such books tends to involve a bunch of waterfall-style batching, analogous to e.g. finishing the framing in each room of a house before moving on to the flooring, or something like that? If so, I’m confused why; at least my own experience with large writing projects has involved little of this, I think, though I’m sure writing processes vary widely.
I was pretty with you until this paragraph:
In many ways Inkhaven is an application of single piece flow to the act of writing. I do not believe intellectual progress must consist of long tomes that take months or years to write. Intellectual labor should aggregate minute-by-minute with revolutionary insights aggregating from hundreds of small changes. Publishing daily moves intellectual progress much closer to single piece flow.
Of course intellectual progress doesn’t always require tomes, but I think in many fields of science, important conceptual progress has historically occurred so dominantly via tomes that they can almost be considered its unit. Take for example well-regarded tomes like Astronomia Nova, Instauratio Magna, Principia, Reflections on the Motive Power of Fire, On the Origin of Species, or A Treatise on Electricity and Magnetism—would you guess the discovery or propagation of these ideas would have been more efficient if undertaken somehow more in single piece flow-style? My sense is that tomes are just a pretty natural byproduct of ambitious, large inferential distance-crossing investigations like these.
I do think I’d feel very alarmed by the 27% figure in your position—much more alarmed than e.g. I am about what happened with AIRCS, which seems to me to have failed more in the direction of low impact than of actively bad impact—but to be clear I didn’t really mean to express a claim here about the overall sign of MATS; I know little about the program.
Rather, my point is just that multiplier effects are scary for much the same reason they are exciting—they are in effect low-information, high-leverage bets. Sometimes single conversations can change the course of highly effective people’s whole careers, which is wild; I think it’s easy to underestimate how valuable this can be. But I think it’s similarly easy to underestimate their risk, given that the source of this leverage—that you’re investing relatively little time getting to know them, etc, relative to the time they’ll spend doing… something as a result—also means you have unusually limited visibility into what the effects will be.
Given this, I think it’s worth taking unusual care, when pursuing multiplier effect strategies, to model the overall relative symmetry of available risks/rewards in the domain. For example: A) whether there might be lemons-market problems, such that those who are easiest to influence (especially quickly) might tend, all else equal, to be more strategically confused/confusable; or B) whether there might in fact currently be more easy ways to make AI risk worse than better, etc.
That may be, but personally I am unpersuaded that the observed paradoxical impacts should update us that the world would have been better off if we hadn’t made the problem known, since I roughly can’t imagine worlds where we do survive where the problem wasn’t made known, and I think it should be pretty expected with a problem this confusing that initially people will have little idea how to help, and so many initial attempts won’t. In my imagination, at least, basically all surviving worlds look like that at first, but then eventually people who were persuaded to worry about the problem do figure out how to solve it.
(Maybe this isn’t what you mean exactly, and there are ways we could have made the problem known that seemed less like “freaking out”? But to me this seems hard to achieve, when the problem in question is the plausibly relatively imminent death of everyone).
Great founders and field-builders have multiplier effects on recruiting, training, and deploying talent to work on AI safety [...] If we want to 10-100x the AI safety field in the next 8 years, we need multiplicative capacity, not just marginal hires
I spent much of 2018-2020 trying to help MIRI with recruiting at AIRCS workshops. At the time, I think AIRCS workshops and 80k were probably the most similar things the field had to MATS, and I decided to help with them largely because I was excited about the possibility of multiplier effects like these.
The single most obvious effect I had on a participant—i.e., where at the beginning of our conversations they seemed quite uninterested in working on AI safety, but by the end reported deciding to—was that a few months later they quit their (non-ML) job to work on capabilities at OpenAI, which they have been doing ever since.
Multiplier effects are real, and can be great; I think AIRCS probably had helpful multiplier effects too, and I’d guess the workshops were net positive overall. But much as pharmaceuticals often have paradoxical effects—i.e., impacting the intended system in roughly the intended way, except with the sign of the key effect flipped—it seems disturbingly common for such efforts to have “paradoxical impact.”
I suspect the risk of paradoxical impact—even from your own work—is often substantial, especially in poorly understood domains. My favorite example of this is the career of Fritz Haber, who by discovering how to efficiently mass-produce fertilizer, explosives, and chemical weapons, seems plausibly to have both counterfactually killed and saved millions of lives.
But it’s even harder to predict the sign when the impact in question is on other people—e.g., on their choice of career—since you have limited visibility into their reasoning or goals, and nearly zero control over what actions they choose to take as a result. So I do think it’s worth being fairly paranoid about this in high-stakes, poorly-understood domains, and perhaps especially so in AI safety, where numerous such skulls have already appeared.
Yeah, certainly there are other possible forms of bias besides financial conflicts of interest; as you say, I think it’s worth trying to avoid those too.
Sure, but humanity currently has so little ability to measure or mitigate AI risk that I doubt it will be obvious in any given case that the survival of the human race is at stake, or that any given action would help. And I think even honorable humans tend to be vulnerable to rationalization amidst such ambiguity, which (as I model it) is why society generally prefers that people in positions of substantial power not have extreme conflicts of interest.
I’m going to try to make sure that my lifestyle and financial commitments continue to make me very financially comfortable both with leaving Anthropic, and with Anthropic’s equity (and also: the AI industry more broadly – I already hold various public AI-correlated stocks) losing value, but I recognize some ongoing risk of distorting incentives, here.
Why do you feel comfortable taking equity? It seems to me that one of the most basic precautions one ought ideally take when accepting a job like this (e.g. evaluating Claude’s character/constitution/spec), is to ensure you won’t personally stand to lose huge sums of money should your evaluation suggest further training or deployment is unsafe.
(You mention already holding AI-correlated stocks—I do also think it would be ideal if folks with influence over risk assessment at AGI companies divested from these generally, though I realize this is difficult given how entangled they are with the market as a whole. But I’d expect AGI company staff typically have much more influence over their own company’s value than that of others, so the COI seems much more extreme).
They typically explain where the room is located right after giving you the number, which is almost like making a memory palace entry for you. Perhaps the memory is more robust when it includes a location along with the number?
I think it is far easier to mistakenly/motivatedly conclude that superintelligence is actually a grand thing to aim at, and that the risks are minimal, etc., than it is to conclude the same about creating global pandemics. As such, I think the latter requires people who are deluded and/or evil to a quite unusual degree, whereas the former strikes me as requiring only people deluded and/or evil to an exceedingly common degree. Which makes me more worried that social pressure will not suffice to prevent it.