Adam Scholl
Researcher at Missing Measures
Senator Bernie Sanders is planning to introduce legislation that would ban the construction of new AI data centers. You can find his video announcement here, and here is the transcript:
Thanks very much for joining me. I will soon be introducing legislation calling for a moratorium on the construction of new data centers.
Now, as a result, I’ve been called a luddite, anti-innovation, anti-progress, pro-Chinese, among many other things. So why am I doing that? Why am I calling for a moratorium on the construction of new data centers?
Bottom line: We are at the beginning of the most profound technological revolution in world history. That’s the truth. This is a revolution which will bring unimaginable changes to our world. This is a revolution which will impact our economy with massive job displacement. It will threaten our democratic institutions. It will impact our emotional well-being, and what it even means to be a human being. It will impact how we educate and raise our kids. It will impact the nature of warfare, something we are seeing right now in Iran.
Further, and frighteningly, some very knowledgeable people fear that what was once seen as science fiction could soon become a reality—and that is that superintelligent AI could become smarter than human beings, could become independent of human control, and pose an existential threat to the entire human race. In other words, human beings could actually lose control over the planet.
And in the midst of all of that, all of this transformative change, what I have to tell you is that the United States Congress hasn’t a clue, not a clue, as to how to respond to these revolutionary technologies and protect the American people. And it’s not only not having a clue, they’re out busy raising money all day long from AI and their super PACs, which is a whole other problem.
As many of you know, the AI revolution is being pushed by the wealthiest people in our country, including Elon Musk, Jeff Bezos, Larry Ellison, Mark Zuckerberg, Peter Thiel, and others. All of these people are multi-billionaires who, if they are successful at AI, will become even richer and more powerful than they are today.
What I want to do now is not tell you my fears regarding AI and robotics. I want you to actually hear from them, the billionaires who are pushing these technologies. Listen carefully to what they are saying.
Elon Musk, wealthiest person alive, stated that quote, “AI and robots will replace all jobs.” All jobs. “Working will be optional.” End of quote.
Dario Amodei, the CEO of Anthropic, predicted that quote, “AI could displace half of all entry-level white collar jobs in the next 1 to 5 years.” And that quote, “Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.” End quote. That’s Amodei.
According to Demis Hassabis, the head of Google’s DeepMind—this is Google’s DeepMind—the AI revolution will be 10 times bigger than the industrial revolution, and 10 times faster. All right, you got that? That means it will have a 100 times greater impact on society than the industrial revolution had.
Jeff Bezos, the fourth richest person in the world, has been pushing his staff for years to think big and envision what it would take for Amazon, which he owns, to fully automate its operations and replace at least 600,000 warehouse workers with robots. 600,000 jobs gone. Robots doing the work.
Bill Gates, also one of the wealthiest people on Earth, predicted that humans, quote, “won’t be needed for most things,” end quote, such as manufacturing products, delivering packages, or growing food over the next decade, due to artificial intelligence.
Mustafa Suleyman, the CEO of Microsoft AI, said most white-collar work quote, “will be fully automated by an AI within the next 12 to 18 months” end quote.
Jim Farley, the CEO of Ford, predicted that AI will eliminate quote, “nearly half, literally half, of all white-collar jobs in the US” end quote, within the next decade.
I want you to hear this one. Larry Ellison—also one of the richest people on Earth, and a major investor in AI—said that there will be an artificial intelligence-powered surveillance state where, quote, “citizens will be on their best behavior because we’re constantly recording and reporting everything that is going on.” End quote.
Dr. Geoffrey Hinton, considered to be the “godfather of AI,” believes there is a quote “10% to 20% chance for AI to wipe us out.” End quote.
Mark Zuckerberg, the fifth richest person in the world, is building a data center in the state of Louisiana—a data center that is the size of Manhattan, and will use three times the quantity of electricity that the entire city of New Orleans uses every year.
All right. Now, for many years now, leading experts have called for regulation and reasonable pauses to the development of artificial intelligence, to ensure the safety—the very safety—of humanity. Let’s go back to our good friend Elon Musk. He said back in 2018, quote—this is Elon Musk—“Mark my words, AI is far more dangerous than nukes. So why do we have no regulatory oversight? This is insane.” End quote, Elon Musk.
In March of 2023, over 1,000 business leaders in the big tech industry, prominent scientists, AI researchers, and academics co-signed an open letter entitled, quote, “Pause Giant AI Experiments” end quote, stating,
“We must ask ourselves: should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete, and replace us?
“Should we risk loss of control of our civilization? Such decisions must not be delegated to unelected tech leaders. Therefore, we call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4; this pause should be public and verifiable and include all key actors. If such a pause cannot be enacted quickly, governments should step in and institute a moratorium.” End of quote.
That is what some of the leaders in the AI industry have said. And clearly where we are right now, is that there has not been any pause. There has been massive amounts of competition between one company and the other, between the United States and China. So: bottom line is that, in my view, to protect our workers from losing their jobs, to protect human beings from attacks on their mental health, to protect our kids, to protect the safety of human life: yeah, we need a moratorium on data centers. We need to take a deep breath. We need to make sure that AI and robotics work for all of us, not just a handful of billionaires. Thanks very much.
I believe deeply in the existential importance of using AI to defend the United States and other democracies, and to defeat our autocratic adversaries.
Anthropic has therefore worked proactively to deploy our models to the Department of War
In my view this seems like a terrible idea, since we don’t understand nearly enough about what might end up causing misalignment in practice to be confident that explicitly training AI to aid in the killing of humans might not somehow increase the probability that it ends up interested in the killing of humans more generally.[1] I suppose I do think it seems even worse to train AI in both killing and mass surveillance, but I expect the badness of the former likely swamps the EV of refraining from the latter here.
- ^
I don’t put much stock in the notion of AI personas, personally, but insofar as you do I think you should probably be especially worried about this.
Curious how bright it was, i.e. what the lux was in the room?
the standard for what a blog is like
Note though that the reference class “blog” is only partially apt. For example, some authors publish on LessWrong in the course of attempting to make or propagate serious intellectual progress, which is a rare aim among bloggers. It seems to me LessWrong’s design has historically been unusually conducive to this rare aim, and personally, this is the main reason I hope and plan to publish more here in the future (and why I’d feel far less excited about publishing on Medium or Substack or other platforms formatted like standard blogs).
my gut says pretty strongly that you and Adam are erring way too far in the not-publishing direction, and like, I would pay money for you to publish more.
I am interested in debating the principle here (e.g. whether it sometimes makes sense to write books, whether/why most scientific progress so far has involved writing books, etc), but I feel less interested in debating your gut take on the tradeoffs Aysja and I are making personally, since I expect you know nearly nothing about what those are? Most obviously, the dominant term has been illness rather than choices, but I expect you also have near-zero context on the choices, which we have spent really a lot of time and effort considering. I would… I guess be up for describing those in person, if you want.
Yep, definitely! The reason why these are big tomes is IMO largely downstream of the distribution methods at the time.
What distribution differences do you mean? Kepler and Bacon lived before academic journals, but I think all the others could easily have published papers; indeed Newton, Darwin and Maxwell published many, and while Carnot didn’t, many around him did, so he would have known it was an option.
It seems more likely to me that they chose to write up these ideas as books rather than papers simply because the ideas were more “book-sized” than “paper-sized,” i.e. because they were trying to discover and describe a complicated cluster of related ideas that was inferentially far from existing understanding, and this tends to be hard to do briefly.
I think that is, for most forms of intellectual progress, a better way of developing both ideas and pedagogical content knowledge
It sounds like you’re imagining that the process of writing such books tends to involve a bunch of waterfall-style batching, analogous to e.g. finishing the framing in each room of a house before moving on to the flooring, or something like that? If so, I’m confused why; at least my own experience with large writing projects has involved little of this, I think, though I’m sure writing processes vary widely.
I was pretty with you until this paragraph:
In many ways Inkhaven is an application of single piece flow to the act of writing. I do not believe intellectual progress must consist of long tomes that take months or years to write. Intellectual labor should aggregate minute-by-minute with revolutionary insights aggregating from hundreds of small changes. Publishing daily moves intellectual progress much closer to single piece flow.
Of course intellectual progress doesn’t always require tomes, but I think in many fields of science, important conceptual progress has historically occurred so dominantly via tomes that they can almost be considered its unit. Take for example well-regarded tomes like Astronomia Nova, Instauratio Magna, Principia, Reflections on the Motive Power of Fire, On the Origin of Species, or A Treatise on Electricity and Magnetism—would you guess the discovery or propagation of these ideas would have been more efficient if undertaken somehow more in single piece flow-style? My sense is that tomes are just a pretty natural byproduct of ambitious, large inferential distance-crossing investigations like these.
I do think I’d feel very alarmed by the 27% figure in your position—much more alarmed than e.g. I am about what happened with AIRCS, which seems to me to have failed more in the direction of low than actively bad impact—but to be clear I didn’t really mean to express a claim here about the overall sign of MATS; I know little about the program.
Rather, my point is just that multiplier effects are scary for much the same reason they are exciting—they are in effect low-information, high-leverage bets. Sometimes single conversations can change the course of highly effective people’s whole careers, which is wild; I think it’s easy to underestimate how valuable this can be. But I think it’s similarly easy to underestimate their risk, given that the source of this leverage—that you’re investing relatively little time getting to know them, etc, relative to the time they’ll spend doing… something as a result—also means you have unusually limited visibility into what the effects will be.
Given this, I think it’s worth taking unusual care, when pursuing multiplier effect strategies, to model the overall relative symmetry of available risks/rewards in the domain. For example, whether A) there might be lemons market problems, such that those who are easiest to influence (especially quickly) might tend all else equal to be more strategically confused/confusable, or B) whether there might in fact currently be more easy ways to make AI risk worse than better, etc.
That may be, but personally I am unpersuaded that the observed paradoxical impacts should update us that the world would have been better off if we hadn’t made the problem known, since I roughly can’t imagine worlds where we do survive where the problem wasn’t made known, and I think it should be pretty expected with a problem this confusing that initially people will have little idea how to help, and so many initial attempts won’t. In my imagination, at least, basically all surviving worlds look like that at first, but then eventually people who were persuaded to worry about the problem do figure out how to solve it.
(Maybe this isn’t what you mean exactly, and there are ways we could have made the problem known that seemed less like “freaking out”? But to me this seems hard to achieve, when the problem in question is the plausibly relatively imminent death of everyone).
Great founders and field-builders have multiplier effects on recruiting, training, and deploying talent to work on AI safety [...] If we want to 10-100x the AI safety field in the next 8 years, we need multiplicative capacity, not just marginal hires
I spent much of 2018-2020 trying to help MIRI with recruiting at AIRCS workshops. At the time, I think AIRCS workshops and 80k were probably the most similar things the field had to MATS, and I decided to help with them largely because I was excited about the possibility of multiplier effects like these.
The single most obvious effect I had on a participant—i.e., where at the beginning of our conversations they seemed quite uninterested in working on AI safety, but by the end reported deciding to—was that a few months later they quit their (non-ML) job to work on capabilities at OpenAI, which they have been doing ever since.
Multiplier effects are real, and can be great; I think AIRCS probably had helpful multiplier effects too, and I’d guess the workshops were net positive overall. But much as pharmaceuticals often have paradoxical effects—i.e., impacting the intended system in roughly the intended way, except with the sign of the key effect flipped—it seems disturbingly common for interventions to have “paradoxical impact.”
I suspect the risk of paradoxical impact—even from your own work—is often substantial, especially in poorly understood domains. My favorite example of this is the career of Fritz Haber, who by discovering how to efficiently mass-produce fertilizer, explosives, and chemical weapons, seems plausibly to have both counterfactually killed and saved millions of lives.
But it’s even harder to predict the sign when the impact in question is on other people—e.g., on their choice of career—since you have limited visibility into their reasoning or goals, and nearly zero control over what actions they choose to take as a result. So I do think it’s worth being fairly paranoid about this in high-stakes, poorly-understood domains, and perhaps especially so in AI safety, where numerous such skulls have already appeared.
Yeah, certainly there are other possible forms of bias besides financial conflicts of interest; as you say, I think it’s worth trying to avoid those too.
Sure, but humanity currently has so little ability to measure or mitigate AI risk that I doubt it will be obvious in any given case that the survival of the human race is at stake, or that any given action would help. And I think even honorable humans tend to be vulnerable to rationalization amidst such ambiguity, which (as I model it) is why society generally prefers that people in positions of substantial power not have extreme conflicts of interest.
I’m going to try to make sure that my lifestyle and financial commitments continue to make me very financially comfortable both with leaving Anthropic, and with Anthropic’s equity (and also: the AI industry more broadly – I already hold various public AI-correlated stocks) losing value, but I recognize some ongoing risk of distorting incentives, here.
Why do you feel comfortable taking equity? It seems to me that one of the most basic precautions one ought ideally take when accepting a job like this (e.g. evaluating Claude’s character/constitution/spec), is to ensure you won’t personally stand to lose huge sums of money should your evaluation suggest further training or deployment is unsafe.
(You mention already holding AI-correlated stocks—I do also think it would be ideal if folks with influence over risk assessment at AGI companies divested from these generally, though I realize this is difficult given how entangled they are with the market as a whole. But I’d expect AGI company staff typically have much more influence over their own company’s value than that of others, so the COI seems much more extreme).
They typically explain where the room is located right after giving you the number, which is almost like making a memory palace entry for you. Perhaps the memory is more robust when it includes a location along with the number?
I agree AI minds might be very different, and best described with different measures. But I think we currently have little clue what those differences are, and so for now humans remain the main source of evidence we have about agents. Certainly human-applicability isn’t a necessary condition for measures of AI agency; it just seems useful as a sanity check to me, given the context that nearly all of our evidence about (non-trivial) agents so far comes from humans.
Sorry, looking again at the messiness factors fewer are about brute force than I remembered; will edit.
But they do indeed all strike me as quite narrow external validity checks, given that the validity in question is whether the benchmark predicts when AI will gain world-transforming capabilities.
“messiness” factors—factors that we expect to (1) be representative of how real-world tasks may systematically differ from our tasks
I felt very confused reading this claim in the paper. Why do you think they are representative? It seems to me that real-world problems obviously differ systematically from these factors, too—e.g., solving them often requires having novel thoughts.
I think there is more empirical evidence of robust scaling laws than of robust horizon length trends, but broadly I agree—I think it’s also quite unclear how scaling laws should constrain our expectations about timelines.
(Not sure I understand what you mean about the statistical analyses, but fwiw they focused only on very narrow checks for external validity—mostly just on whether solutions were possible to brute force).
I agree it seems plausible that AI could accelerate progress by freeing up researcher time, but I think the case for horizon length predicting AI timelines is even weaker in such worlds. Overall I expect the benchmark would still mostly have the same problems—e.g., that the difficulty of tasks (even simple ones) is poorly described as a function of time cost; that benchmarkable proxies differ critically from their non-benchmarkable targets; that labs probably often use these benchmarks as explicit training targets, etc.—but also the additional (imo major) source of uncertainty about how much freeing up researcher time would accelerate progress.
It seems to me a meaningfully open question whether automating all human labor will end up net benefiting humans, even assuming we survive; of course it might, but I think much more dystopian outcomes also seem plausible. Markets tend to benefit humans because the price signals we send tend to correlate with our relative needs, and hence with our welfare; I think it is not obvious that this correlation will persist once humans become unable to generate economic value.