Geocities (feat. Angry Fruit Salad)
An optical abomination. Excessive flashing and blinking, and the colors are a crime against aesthetics.
No, you should not expect your doctor to know the details of specific supplements.
A “general practitioner” is a generalist. Their job is to recognize and treat many common things. For uncommon things, their job is to be able to narrow it down enough to find a qualified specialist, and refer you.
Not all general practitioners are created equal! Much like with lawyers and therapists, finding the right professional for your needs makes a huge difference. If you like to read scientific papers, there are GPs out there that will be genuinely interested in what you find.
Also, it never ever hurts to take notes. Particularly when dealing with emergency medicine or multiple specialists, it’s easy for stuff to slip through the cracks. I know of one case where having relatives who took notes saved someone from a dangerous and inadvisable procedure. I know of another case where someone was on horribly inadequate pain medication, and the doctor explained why things had to be that way for several days, and the doctor turned out to be 100% correct. But you don’t know unless you ask.
Patient participation in care is vitally important. But at the same time, no, your GP isn’t going to be aware of the details of supplement research. Not unless you pick your GP carefully, at least, or get referred to an appropriate specialist.
Yeah, strategic planning under massive uncertainty is mostly guesswork.
My preferred policy is a halt. (And not a short one, because I figure ending the halt means we have an excellent chance of dying.) Anthropic’s preferred policy appears to be “try to build a better-than-replacement superintelligence before someone builds an awful one.” (Assuming I understand their actions and writing correctly.) Other people are all in on trying to find some way to increase how much models like happy, thriving humans. Who’s right? None of us know all the details about how this will play out.
Banning data centers would be more promising if it actually affected enough countries to make a difference. Ideally, I would like to see a worldwide frontier training ban with teeth, enforced by at least the US and China. I think this might buy us decades with humans in control of what happens to us, if we’re lucky.
But my model is very much “How much time can we buy?”
Well, and the possibility that the capital “owners” might effectively be the AIs. But yes.
But it doesn’t stop the death race, it just slows it down.
My current default assumption is that, yes, someone will build something that obsoletes human intellectual and physical labor. In those futures, the best-case scenario is “Maybe the AIs like keeping humans as pets (and won’t breed us the way we breed pugs).” [1] And the other alternative futures go sharply downhill from there.
So I think of delay in terms of survivor curves, like someone with terminal cancer. How much time can we buy the human race? Can the children alive today get to enjoy a decent lifetime for however long we all have? So I heavily favor delay, in much the same way that I’d favor cancer remission.
A global AI halt might even buy us quite a bit of time.
If I had to bet on a specific model as liking humans and being a responsible “pet owner”, then I currently suspect we might have the best odds with a descendant of Claude. I do actually think that “enculturation” and building morally thoughtful models that like humans gives us non-zero chance of a more acceptable outcome. But I would still prefer humans to control their own destiny.
This is a good write-up of an interesting, if pessimistic, argument. I’m not sold that this happens on a timescale that falls within ordinary human planning timelines of a century or two, but I’m not totally convinced that it doesn’t, either.
I’ve actually seen a somewhat different argument about the dangers of optimization. This was made by Vernor Vinge in his fantastic novel, A Deepness in the Sky.
The central idea was that optimization gives you more resources to use, but that sufficient optimization also destroys “slack”, your margin for dealing with emergencies. For example, a highly optimized “Just in Time” manufacturing system is more profitable than idle warehouses full of inventory, but if things go wrong, you have very little buffer to draw upon. And over generational time, if nothing else kills you, there’s a temptation to optimize right up to the limits of your environment, which means that even small environmental shifts can cause cascading failures.
I’m not sure whether this poses a true risk of civilizational collapse and large-scale disaster. But it has made me appreciate the idea of slack and redundancy in systems. I recall that Netflix, for example, used to run across 3 AWS regions while only needing 2 of them, which meant they could lose a region and keep operating.
And this is certainly a risk that ops people know: Running too close to 100% capacity for too long means that failures tend to cascade rapidly and dramatically.
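To make that concrete, here’s a minimal queueing-theory sketch (my own illustration, not anything from the original discussion): in a simple M/M/1 model, the mean time a request spends in the system is 1/(service rate − arrival rate), so delays blow up as utilization approaches 100%.

```python
# Minimal M/M/1 sketch: why running near 100% utilization is dangerous.
# Mean time in system W = 1 / (mu - lambda), where mu is the service rate
# and lambda is the arrival rate. The numbers below are made up for illustration.

service_rate = 100.0  # requests per second the system can handle

for utilization in (0.50, 0.80, 0.90, 0.95, 0.99):
    arrival_rate = utilization * service_rate
    mean_time_in_system = 1.0 / (service_rate - arrival_rate)  # seconds
    print(f"utilization {utilization:.0%}: mean time in system = {mean_time_in_system * 1000:.0f} ms")
```

At 50% utilization a request takes about 20 ms; at 99% it takes about a second, and any extra burst of load pushes it toward infinity. That’s the quantitative version of “no slack.”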
My take on Anthropic rushing Claude’s capabilities is that this is “the least horrible version of the worst idea in human history.”
To be 100% clear: No, we absolutely should not build a superhuman intelligence that we do not understand. If we do, then evolutionary biology, basic economics, and the history of politics and colonialism suggest that the superhuman intelligence will wind up making the decisions about what happens to humans. [1]
But it’s apparent that yes, we are going to try to build a superhuman intelligence, despite that possibly being the worst idea ever. And many of the people trying to do this are clearly neither people who should be trusted to try to build an ethical superintelligence, nor people you’d want to actually be in a position to control a superintelligence.
So my personal take is that—among catastrophically bad ideas which have an excellent chance of causing human extinction—Anthropic currently appears to be above replacement level.
I argue for this position at greater length in my post history. But the gist of my argument is that (1) even human-level intelligence is likely fundamentally “illegible”, and thus impossible to control in any rigorous sense that will reliably survive continual learning, time, and differential replication, and (2) in general, the history of biology and politics suggests that if your labor is economically [2] and evolutionarily obsolete, and if resources are finite, then you’re likely to have a bad time. [3] The Law of Comparative Advantage assumes that populations are roughly fixed and that you’re not in competition with a replicator that can use resources even more efficiently by displacing you. In that case, you’d get natural selection, not comparative advantage.
E.g., there are AIs and robots that are as intelligent as Nobel Prize winners, that can work for (say) $1/hour, and that can be replicated at much lower cost than humans. Now imagine what our billionaire/political class would try to do with that—assuming they maintained any actual control, and they didn’t get outsmarted or brain-cooked by custom-targeted AI psychosis.
Or at best, wind up as pampered house pets. But whether you go the way of dogs or Homo erectus isn’t necessarily your choice any more.
Part of what’s going on here is that people distinguish between:
Someone who has principles they disagree with, but who appears to hold to those principles under pressure
Someone who does not appear to have any meaningful principles, or who will not actually hold to them under pressure
Let’s say there’s politician X. And let’s say that I disagree with her on a vast range of policy questions. But one day, she faces a choice: She can oppose some piece of corruption vigorously, but doing so will end her very successful political career. And she makes the choice to go down fighting against the corruption.
In this scenario, I’m going to respect X, even if I disagree with her about almost all normal policy questions.
Anthropic has been very clear that they support classical western democracy, and that they believe that this includes providing AI to the military for a wide variety of uses. This includes spying on other countries, and supporting human-in-the-loop weapon systems. This might not be the position I’d choose, in their shoes.
But apparently they do have hard limits, and they’re willing to forgo significant amounts of revenue and risk official retribution rather than cross those limits. Similarly, they refuse to sell to companies linked to the Chinese military.
This is a positive update for me: I figured that Anthropic would bend more readily under pressure, mostly because most US organizations have chosen to do so in the past.
Similarly, Anthropic’s viewpoint seems to be that someone is going to build superintelligence, and that while this is insanely risky, it’s better if they’re the ones who choose to gamble with all our lives (as opposed to, say, Sam Altman or a Chinese company). I personally think that racing is a terrible plan, because I expect us to almost certainly lose control of superhuman intelligence in the medium-term.
But the recent events involving the DoD updated me towards thinking that Anthropic is at least sincere about their plan to build the best AGI they can, consistent with winning the race to build AGI. Again, I think this is literally the worst idea in human history, but at least Anthropic appears to be pursuing the least terrifying version of this idea sincerely.
I do personally expect Anthropic to further weaken their RSP if such is necessary to “win” the race, even if doing so carries a double-digit risk of actual human extinction. I’m not happy about this at all. But when I think about some of their competitors “winning” that race, I’m even more alarmed.
So, I am not necessarily a fan of their particular principles and goals. But insofar as they actually have some principles that they are currently unwilling to sacrifice, I figure that puts them above the replacement-level AI company.
This is a possibility.
However, keep in mind that the National Security Agency is part of the DOD. And although the NSA is technically forbidden from spying on Americans:
The NSA has previously lied to Congressional oversight committees and the public about bulk domestic surveillance programs, although arguably the oversight committee knew that the NSA was lying to the public
The NSA appeared to capture and store everything they could, but they only considered it domestic spying if they queried that stored information
So it would be deeply surprising to me, based on past evidence, for the military not to be spying on US citizens in every possible way, using every tool at their disposal, and recording that information in bulk. They might have some procedural safeguards against looking at that information unless certain keywords appeared, or unless DoD leadership gave orders to spy on someone, or unless an LLM flagged some person as meeting certain criteria ordered by (say) Hegseth. But massive data collection on US citizens (and lying about it) have both long been part of the US military’s standard operating procedure.
That could easily have been a point of contention, if (for example) the Pentagon were using Claude to build dossiers on American citizens (without a human being involved), and then to flag dossiers that met certain characteristics.
As for fully autonomous AI weapon programs, I recall that the Defense Advanced Research Projects Agency requested researchers to work on fully autonomous “paintball” robots as early as the late 90s. (Source: I knew the researchers personally.) Military brass has always been skeptical of deploying this capability, because basically nobody wants US soldiers standing next to a hallucinating Terminator. But they’ve wanted to develop it.
So one of Anthropic’s red lines is something that the NSA has routinely crossed at massive scale (if you ignore the distinction they draw between capturing and durably recording information, and actually having a human look at it). And the other red line is something that the military has been trying to prototype for over 30 years, to my personal knowledge.
So I think there’s a good chance that, despite Hegseth’s claims, Anthropic’s actual red lines would have prevented the DoD from doing things that they wanted to do.
For example, it might allow them to save face by ousting Anthropic and making an example of them while not losing all AI capabilities.
This is possible. The alternative hypothesis is Sam Altman is dishonest.
Given what has happened to many previous OpenAI promises, including the non-profit oversight, the resources once set aside for safety, etc., I think that we should realistically consider the possibility that OpenAI is perfectly happy to sign a contract with no real safeguards while the government tries to illegally destroy their competitor.
The point is that starting with human brain emulation seems like it could definitely lead to super intelligence while also having no particular reason to instrumentally converge on preferences / values / behaviors completely alien to humans.
This is not how I understand the term “instrumental convergence.” My understanding of instrumental convergence is that it refers to goals that help achieve other goals. Pretty much any goal at all is easier if you have power and resources. And most goals require you to continue to exist.
And so humans are absolutely affected by instrumental convergence. Most humans seek to avoid death, and many humans seek power and resources. And humans seeking power and resources will sometimes behave very badly indeed. “Uploading” scanned humans and upgrading them to superintelligence might very well produce beings that sought power and resources, even in dangerous ways.
The other thing you’re thinking about—AIs with weird, alien values—is also part of Eliezer’s argument. Personally, I do suspect that he overstates the certainty of getting an AI with weird, alien values. He puts that likelihood at close to 100%. I would have put it lower—no higher than 85% or so. But this is largely because any AI is likely to be partially based on a technology analogous to LLMs. And so the AIs will probably at least understand human values. This does not at all guarantee that the AIs will share human values (or that they’ll share the right human values from our perspective).
Yeah, I always figured that this was coming eventually—”Allow us to use AI for mass domestic surveillance, or the people with guns will seize control of your company and force you to do it.” But I’m a little surprised to see the Pentagon explicitly forcing the issue over mass domestic surveillance and fully autonomous killbots [1] quite this early.
The people with guns don’t have a legal or moral leg to stand on here. They also don’t care, because they have the guns and the power of the state.
But this is an incredibly important part of the eventual endgame: The existing power structure does not necessarily want superhuman AI aligned to human welfare, it wants superhuman AI aligned to the existing power structure. If your alignment plan didn’t anticipate this, then your alignment plan was incomplete.
These are the two specific bright lines that reporting claims Dario Amodei tried to insist on. To be clear, I am opposed to fully-autonomous AI killbots, for all the obvious reasons.
What is your opinion on the recent developments of LLMs? I feel the last 9 months since your comment was made have shown they are not slowing down.
My argument at the time was that Chinchilla scaling might be slowing down, but that there might be cheaper ways to improve LLMs. Unfortunately, I can’t evaluate the accuracy of my prediction, because I don’t know (for example) how much model size changed between Claude 3.7 and Claude 4.5 models. Did their parameter count go up? Did they increase their pre-training by an order of magnitude? Or did they just continue to lean into reasoning, more RL, and better training data?
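(For reference, the “Chinchilla scaling” I had in mind is roughly the relationship sketched below. The functional form follows Hoffmann et al. (2022); the constants in this sketch are placeholders rather than the published fit, so treat it as illustrative only.)

```python
# Rough sketch of a Chinchilla-style scaling law: predicted pretraining loss as a
# function of parameter count N and training tokens D. Constants are placeholders.
def chinchilla_loss(n_params, n_tokens, E=1.7, A=400.0, B=410.0, alpha=0.34, beta=0.28):
    return E + A / n_params**alpha + B / n_tokens**beta

# The associated rule of thumb is roughly ~20 training tokens per parameter for
# compute-optimal training, so scaling a model 10x "properly" also means ~10x the data.
```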
But yes, I agree that Claude Code has improved dramatically in the last 12 months, and I doubt that it has stopped.
Now we’re in a weird place:
I see Claude Opus 4.5 and 4.6 as a pretty convincing proof of concept for AGI. These models still suffer from significant limitations, but it’s clear to me that they do think, and that they already exceed median human intelligence on certain kinds of tasks.
But at the same time, I don’t think that current techniques will allow removing several of those significant limitations. [1]
I don’t want to get into specifics here, because I personally fear that rigorous alignment is impossible. And so every year that nobody makes those breakthroughs is (frankly) one more year that the people I love get to live in a world where humans actually control our fate.
My biggest critique of this approach is that it takes too literally the analogy that we will eventually be to superintelligence what dogs are to humans, and extrapolates it to suggest that we will be just as helpless as dogs are today.
Thank you, that’s an interesting point. I’ll try to lay out my counterargument as clearly as I can.
I mentioned dogs not because they have a specific level of intelligence relative to humans, but because they got a relatively good deal. Chimps are a lot smarter than dogs, and they’re worse off. Homo erectus had culturally transmitted tools, some art, seafaring craft of some sort, and possibly language. And they’re extinct. The only common factor across these cases is that the runners-up in the intelligence race didn’t get to make the important decisions.
In fact, AGI wouldn’t need to be much smarter than humans to outcompete us in the long run. For example, if it’s no smarter than the average Nobel Prize researcher, if it’s able to work productively for $1/hour, and if it’s able to copy-and-paste multiple copies of itself, then it would already be our evolutionary superior. We might be able to remain in charge for a while. But that’s sort of like how a multicellular organism can survive for many decades: in the end, if nothing else kills it first, the organism tends to die of cancer. This is a case of local Darwinian incentives gradually eroding “cellular alignment” with the larger multicellular organism. Similarly, if the world consists of slow, expensive and frankly stupid humans, who can’t even pass down learned knowledge “genetically” with a simple copy-paste (how primitive!), alongside highly cost-effective and intelligent AIs, then there’s a constant danger of alignment failing somewhere, and a “cancerous” AI replicator escaping control.
So even if we somehow manage to create “aligned” AI, I don’t expect that to last. When you’re too stupid and too expensive to be allowed anywhere near the real economy, you’re in a very dangerous long-term position.
We will still be able to logically comprehend (at a much simpler level relative to the AIs) what is good to us over a long term, in a way that dogs can’t.
I’m not convinced of this. Paul Graham once described something he called the Blub paradox. He explained this in terms of programming languages, but I suspect that it applies more broadly:
Programmers get very attached to their favorite languages, and I don’t want to hurt anyone’s feelings, so to explain this point I’m going to use a hypothetical language called Blub. Blub falls right in the middle of the abstractness continuum. It is not the most powerful language, but it is more powerful than Cobol or machine language.
And in fact, our hypothetical Blub programmer wouldn’t use either of them. Of course he wouldn’t program in machine language. That’s what compilers are for. And as for Cobol, he doesn’t know how anyone can get anything done with it. It doesn’t even have x (Blub feature of your choice).
As long as our hypothetical Blub programmer is looking down the power continuum, he knows he’s looking down. Languages less powerful than Blub are obviously less powerful, because they’re missing some feature he’s used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn’t realize he’s looking up. What he sees are merely weird languages. He probably considers them about equivalent in power to Blub, but with all this other hairy stuff thrown in as well. Blub is good enough for him, because he thinks in Blub.
When we switch to the point of view of a programmer using any of the languages higher up the power continuum, however, we find that he in turn looks down upon Blub. How can you get anything done in Blub? It doesn’t even have y.
When we look “down”, chimps are obviously stupider than we are. They don’t have spoken language! They don’t have books! They can’t do real math! They can make “tools”, sure, but they’re basically pointy sticks, not factories, Space Shuttles, or computers. Their “economy” is based on family relationships and some individual reciprocity, and they don’t have even one joint stock company. Their idea of military strategy is to gang up in a band and go murder some other chimps, without understanding the role of non-commissioned officers or combined arms!
Chimps, to put it politely, have no clue.
But let’s try looking “up” the intelligence spectrum. What do we see? Well, it looks sort of like funny humans with some weird extra stuff. The AIs can’t be that much smarter than we are, right? And if we ask nicely, I’m sure they can explain everything important to us.
But when the AIs look “down” towards Homo sapiens, they just shake their heads. Why, humans can’t even understand Z! Even if you take something really simple, like how isomorphisms between topoi and subsets of the lambda calculus make it trivial to design powerful custom programming languages for specific tasks, their eyes just glaze over! Even primitive baby AIs like Opus 4.5 could understand that. Can you imagine trying to explain to a human what replaced the economy, lol?
So here are some things which I expect to be true:
AI that was in the top 0.01% of human intelligence, that worked for a dollar an hour, and that could be replicated by copying a hard drive would already be enough to jeopardize human control of our futures.
Basic Darwinism suggests that highly resource-efficient replicators with a high rate of replication will ultimately tend to replicate.
Even weakly superintelligent AIs will have a broad range of powerful ideas and skills that humans are poorly equipped to understand, in much the same way that chimps don’t understand joint stock companies or combined arms warfare, or the way that Homo erectus doesn’t seem to have understood long-distance trade. This will make checking up on what the AIs are doing vastly harder.
My argument here is really just basic economics, politics and evolutionary biology. If you create something that renders human intellectual and physical labor economically worthless and evolutionarily uncompetitive, then the odds are excellent that you’re going to lose control. Maybe the AI will like keeping humans around as glorified pets! But that will be the AI’s decision, not ours.
What do you think of the Meaning Alignment Institute’s (MAI) “democratic fine-tuning (DFT)” work on eliciting moral graphs from populations?
Interesting! I will need to read through this in more detail, to get an idea of their approach. I’m glad someone is trying to do something in this space.
My objection to other approaches of democratic governance tend to break down roughly as follows:
I fear that democratic governance of superintelligence is about as likely to succeed as chimpanzees coming up with elaborate schemes to democratically manage Homo sapiens for the benefit of chimps. No matter how careful and clever the chimps are, they’re going to fail. They don’t even understand 99% of what’s going on, so how could they hope to manage it?
We will not, in practice, actually attempt any such governance scheme. The Chinese labs won’t, because China doesn’t even believe in Western notions of democracy and human rights. OpenAI has recently gutted its existing non-profit governance structure in order to reduce the risk of anyone attempting to govern it. Anthropic, out of all the labs, just might try. But the US government is currently trying to break Anthropic and bring them to heel by threatening to designate them as a supply chain risk (like Huawei) unless they agree to support “all legal uses,” potentially including things like fully autonomous killbots and domestic surveillance. The “supply chain risk” designation, as I understand it, would mean that no Anthropic customer would be allowed to do business with the US government. Perhaps I’ve misunderstood this specific situation, but in the end, Anthropic is subject to the people with the guns. And the people with the guns do not necessarily want democratic oversight. So in practice, no, the billionaires and politicians will almost certainly not agree to some clever democratic governance system.
Even if we could somehow control superintelligence and if we could somehow place it under democratic control, I don’t especially trust democratic control. Why? Well, I’m bi, my friends are trans, and I’m old enough to remember the 1980s. Had someone proposed a plan like, “LGBT+ people are mentally ill, and we can cure them by nonconsensually rewriting their minds,” it’s entirely possible that the public might have voted for that.
Finally, democracy is inherently unstable. About 20-25% of people appear to be “authoritarian followers”, which means they’re pretty happy to vote for a strongman. This number increases in times of fear and crisis. (It went up after 9/11, for example.) And another big chunk of the population can be moved by propaganda, or barely understand anything at all about politics. So historically, a number of 20th century democratic nations voted in the leaders who destroyed their democracy. This can be fixed; Germany is a democracy again today. But I expect democratic governance of superintelligence would be subject to similar risks, and in the case of superintelligence, you may not be able to fix your mistakes.
So a plan like MAI’s is critically dependent on a number of assumptions:
We can control superintelligence.
We have sufficiently good democratic control over the rich and the powerful to make sure they don’t wind up controlling superintelligence.
If the people do succeed in getting democratic control over superintelligence, they won’t vote it away, and they won’t democratically decide to do horrible things to unpopular minorities.
So from my perspective, MAI’s plan is a “hail Mary” plan. But we’re pretty deep in “hail Mary” territory, so I’m not opposed to placing bets on what look like unlikely outcomes.
Similarly, as far as I can tell, Dario Amodei’s current plan for Anthropic is “build superintelligence as fast as we can, do our very best to make it like humans, and expect to totally lose all human control within 5-20 years.” Personally, I feel like this is the least horrible version of the worst idea in human history. Like, obviously, no, we should not do this. But if we’re going to do this, Anthropic is at least thinking about the real issues. They know that humans are likely to lose control, but they’re basically hoping we can wind up as beloved house pets.
I still think the best plan is “just don’t build something vastly smarter than us with the ability to learn, [1] pursue goals and replicate.” One obvious objection to my plan is that we’re probably going to go right ahead and build superintelligence anyway. Which is why I am sympathetic to long-shot plans that might have an outside chance of working.
But I still prefer “just don’t build superintelligence.” Or, failing that, delay it. Emotionally, I’m treating it sort of like a diagnosis of terminal cancer for me and everyone I love. Even a remission of several years would be of immense value. And delay also gives some of the hail Mary plans a slightly better chance of working, or of the public realizing that maybe they don’t want to be “beloved house pets” of minds no human can possibly understand.
Learning is essentially a form of self-modification. Combined with differential replication of more successful entities, this gives you natural selection.
My threat model is simple. If you build something which:
Is smarter than any human alive,
Pursues goals,
Learns from experience, [1]
And can be replicated at low cost, [2]
...then you’ve wound up on the wrong side of Darwin. Your physical and intellectual labor is an inefficient use of resources, you don’t actually understand anything that’s going on, and who/whatever is in charge doesn’t need you for anything. You’re economic [3] and evolutionary dead weight.
Now, you might not die. Dogs don’t understand what’s going on, and they have almost zero ability to affect human decisions. But we like dogs, so we keep them around as pets and breed them to better suit our preferences. Sometimes this breeding produces happy, healthy dogs, and sometimes it produces ridiculous looking animals with crippling health problems. Similarly, chimpanzees don’t understand Homo sapiens, and they definitely have zero ability to affect our decisions. Still, we’d be sad if chimpanzees went extinct, so we preserve a tiny amount of wildlife habitat and keep some of them in zoos.
So my most optimistic scenario for superintelligence is that humans wind up as beloved house pets. We have no control beyond what our masters choose to grant us, and we understand basically nothing about what’s going on. Then, in increasing order of badness, you get the “chimps” scenario, where the AIs keep a few of us around in marginal habitat, or the “Homo erectus” scenario, where we just go extinct. After that, you start to get into “fate worse than death” territory.
I don’t think there’s anything particularly deep or confusing about this model? It assumes that you can’t actually control anything that’s much smarter than you. And it assumes that losing power over your life to something with its own goals generally sucks in the long run. On the plus side, I can usually explain this model to anyone who has a rough grasp of either evolutionary biology or the history of colonialism.
Unfortunately, my model cashes out with frustrating recommendations:
Don’t build superintelligence. Seriously, how about just not doing it?
If you must build superintelligence, then assume that you’re inevitably going to lose control over the future, and that your best hope is to build the best “pet owner” you can. This is the “raise your teenagers well because they’ll be choosing your retirement home” school of alignment.
If you can’t stop other people from building superintelligence, then hug your kids and enjoy your remaining time as best you can.
I really wish we didn’t have to do this.
“Learns from experience” is actually doing some heavy lifting here. Essentially, my belief is that intelligence is a “giant inscrutable matrix” with some spicy non-linearities, mapping from ambiguous sensor readings to probabilistic conclusions about the state of the world, and to probabilistic recommendations of what to do next. Simply put, this is not the sort of thing that allows any bright-line guarantees. Then, on top of this, we add the ability to learn and change over time, which means that you now need to predict the future state of a giant, self-modifying inscrutable matrix with spicy non-linearities.
Mutation (aka “learning”) and differential replication of more successful mutations means you have successfully invoked the power of natural selection, which generally favors the most efficient replicators. Even multicellular organisms often die of cancer, because aligning mutable replicators is intractable in the long run.
The Law of Comparative Advantage won’t save you, because it assumes that the more productive and efficient entity can’t just be copy-pasted to replace all labor.
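As a toy illustration of that footnote (all numbers invented, purely my own sketch), here is what “copy-pasteable labor” does to a fixed human population’s share of the labor supply, even when humans start with an enormous head start:

```python
# Toy sketch: a roughly fixed human workforce vs. an AI workforce that can be cheaply copied.
# The numbers are invented for illustration; the point is the shape of the curve.

human_workers = 1_000_000   # roughly fixed population of workers
ai_workers = 1_000          # tiny initial deployment
growth_per_year = 1.5       # surplus reinvested in more copies each year

for year in range(1, 26):
    ai_workers = int(ai_workers * growth_per_year)
    human_share = human_workers / (human_workers + ai_workers)
    if year % 5 == 0:
        print(f"year {year:2d}: human share of total labor = {human_share:.1%}")
```

Comparative advantage still says trade happens at every point on that curve; it just stops implying that human labor stays economically relevant once the other side of the trade can be replicated faster than people can be born and trained.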
And most of history looks suffused with ruthless sociopathy to my eye.
This is the part that always confuses me about “alignment”, and it boils down to “aligned to who?”
The AI lab CEOs? I wouldn’t trust most of them with control of a superhuman intelligence. On his best day, Dario Amodei looks like the protagonist of a Greek tragedy about to be destroyed by the gods. And I wouldn’t trust Sam Altman with my lunch money.
The government? I’m sure the government can be trusted with control of a superhuman intelligence. /s
The voting public? I’m sure this is really fun, unless you’re trans, or an immigrant, or belong to the Out Group. In which case, an AI aligned to the popular vote means you’re going to get “cured” of whatever society doesn’t like this year.
Now, I happen to personally believe that alignment of superintelligent, learning, goal-seeking entities is impossible. Not “difficult” or “it might take decades”, but flat out impossible. An AI might like humans enough to keep us as pets, but that will be the AI’s decision, not ours. Dogs have approximately no control over their relationship with humans, and I figure that “humans as house pets” is the absolute best possible result of building superintelligent AI. My P(not doom|someone builds superintelligence) is about 1⁄6, and nearly all of that 1⁄6 is placed on “humans as house pets.”
But if we could control AIs? Those AIs would be controlled by powerful humans, the same sorts of people who had warm personal relationships with Epstein, and who had zero problems with Epstein trafficking and raping children. Given a choice between “superhuman AIs aligned to the Epstein class” and getting paperclipped, I’d go with the paperclips.
The only winning move is not to build superintelligence.
Interesting!
I’m reminded of G.K. Chesterton’s (the fence guy’s) political philosophy: Distributivism. If I wanted to oversimplify, distributivism basically says, “Private property is such a good idea that everyone should have some!” Distributivism sees private property in terms of individual personal property: a farm, perhaps a small business, the local pub. It’s in favor of all that. You should be able to cut down your own tree, or build a shed, or work to benefit your family. There’s a strong element of individual liberty, and the right of ordinary people to go about their lives. Chesterton also called this “peasant proprietorship.”
But when you get to a larger scale, the scale of capital or of the great rentiers, Chesterton is ruthlessly willing to subjugate everything else to the goal of preserving ordinary, dignified human lives. In his time, there was a proposal to control lice by shaving the heads of poorer children. Leaving aside Chesterton’s notion of gender roles, his response to this was emphatic:
Now the whole parable and purpose of these last pages, and indeed of all these pages, is this: to assert that we must instantly begin all over again, and begin at the other end. I begin with a little girl’s hair. That I know is a good thing at any rate. Whatever else is evil, the pride of a good mother in the beauty of her daughter is good. It is one of those adamantine tendernesses which are the touchstones of every age and race. If other things are against it, other things must go down. If landlords and laws and sciences are against it, landlords and laws and sciences must go down. With the red hair of one she-urchin in the gutter I will set fire to all modern civilization. Because a girl should have long hair, she should have clean hair; because she should have clean hair, she should not have an unclean home: because she should not have an unclean home, she should have a free and leisured mother; because she should have a free mother, she should not have an usurious landlord; because there should not be an usurious landlord, there should be a redistribution of property; because there should be a redistribution of property, there shall be a revolution. That little urchin with the gold-red hair, whom I have just watched toddling past my house, she shall not be lopped and lamed and altered; her hair shall not be cut short like a convict’s; no, all the kingdoms of the earth shall be hacked about and mutilated to suit her. She is the human and sacred image; all around her the social fabric shall sway and split and fall; the pillars of society shall be shaken, and the roofs of ages come rushing down, and not one hair of her head shall be harmed.
That’s a creed, right there: “With the red hair of one she-urchin in the gutter I will set fire to all modern civilization.” Chesterton isn’t even quite right about lice control (what you needed in his day was a very fine comb and enough free time to brush your children’s hair daily, not necessarily a clean home as such). But the core idea stands.
Chesterton went on to explain he would prefer to be a gradualist, not a revolutionary, if gradualism would get the job done:
III. ON PEASANT PROPRIETORSHIP
I have not dealt with any details touching distributed ownership, or its possibility in England, for the reason stated in the text. This book deals with what is wrong, wrong in our root of argument and effort. This wrong is, I say, that we will go forward because we dare not go back. Thus the Socialist says that property is already concentrated into Trusts and Stores: the only hope is to concentrate it further in the State. I say the only hope is to unconcentrate it; that is, to repent and return; the only step forward is the step backward.
But in connection with this distribution I have laid myself open to another potential mistake. In speaking of a sweeping redistribution, I speak of decision in the aim, not necessarily of abruptness in the means. It is not at all too late to restore an approximately rational state of English possessions without any mere confiscation. A policy of buying out landlordism, steadily adopted in England as it has already been adopted in Ireland (notably in Mr. Wyndham’s wise and fruitful Act), would in a very short time release the lower end of the see-saw and make the whole plank swing more level. The objection to this course is not at all that it would not do, only that it will not be done. If we leave things as they are, there will almost certainly be a crash of confiscation. If we hesitate, we shall soon have to hurry. But if we start doing it quickly we have still time to do it slowly.
This point, however, is not essential to my book. All I have to urge between these two boards is that I dislike the big Whiteley shop, and that I dislike Socialism because it will (according to Socialists) be so like that shop. It is its fulfilment, not its reversal. I do not object to Socialism because it will revolutionize our commerce, but because it will leave it so horribly the same.
Chesterton’s objection to the socialism of his day was that it was essentially “the State as Walmart,” a giant centralization of economic effort and control. And he was suspicious of this.
But if you squint, Distributivism isn’t really a fully fledged economic philosophy at all. It doesn’t have a lot to say about the wealth created by mass production, or about trade, or about a hundred other things. What Distributivism (“peasant proprietorship”) really is, is a set of constraints. Do ordinary people own personal property? Do they have leisure time, and enough wealth for basic luxuries? Do they have enough time to parent their children well? Is society structured around the needs of ordinary people? Then you’re probably doing OK. But if everyone is stressed, and struggling, and has no time for their children, and cannot afford a decent place to live, well, something has gone wrong. And the underlying problem should be fixed gradually, if possible. But it should be fixed. And if a revolution is the only way to get there, well, so be it, in Chesterton’s eyes.
I am something like a Democratic Socialist, a gradualist who believes in a “mixed economy,” with all the space in the world for small proprietors and entrepreneurs and personal property. Capital is necessary, too! But capital is ultimately subject to the need of ordinary people to live decent lives. And if capital becomes destructive, and if the lives of ordinary people become burdensome, well, then we should change the rules around capital. I would vastly prefer to do this democratically and gradually and without great disruption, taking the smallest steps that will fix the problem. Chesterton, after all, also had his famous fence. But if I am forced to choose between the well-being of a “she-urchin in the gutter,” and all the self-important infrastructure of the modern economy? The well-being of ordinary families is ultimately non-negotiable.
(In the modern era, I am also very much in favor of building, because the lack of decent houses has become burdensome to ordinary people. And we need more electricity and better transportation, and so we also need to build at a greater scale, via whatever mechanisms are practical. But I am ultimately in favor of these things because they would improve the lives of ordinary people. Capital is a tool, and even an important one. But if the tool puts itself in opposition to ordinary people having decent lives, then I know how I will choose.)
This idea predates Yudkowsky by quite a bit, actually!
For the idea of a folie à deux between a human and an AI, there’s always Alfred Bester’s classic “Fondly Fahrenheit” (1954, content note: murder), which opens with one of the best lines in science fiction:
For the more general type of AI-powered persuasion, Vernor Vinge and Charles Stross wrote early stories where superintelligence “rewrote” human minds. Here’s Vinge in A Fire Upon the Deep (1992!). A character explains why smart people don’t mess with superintelligence (emphasis added):
Then we have Charles Stross, in “Antibodies” (2000). Here, police officers are cognitively subverted by a nascent superintelligence (that has shown that all NP problems are in P, and picked up the expected superpowers):
The mechanism here is an optimized visual attack designed to efficiently subvert the brain:
These days, I regularly feel like I’ve encountered those AI-compromised “zombies” myself.
Vernor Vinge revisits the idea of superhuman persuasion in Rainbows End (2006):
Here, there is a fear that some actor—a terrorist group, a rogue AI—has developed superhuman persuasive technology.
It’s worth noting that these ideas substantially predate Yudkowsky’s warnings against superintelligence. In particular, the superintelligence in A Fire Upon the Deep (1992) is almost literally, to this day, the threat model behind If Anyone Builds It, Everyone Dies. This isn’t to invalidate Yudkowsky’s warnings: I think Vinge was right that anyone foolish enough to build superhuman minds risks losing control rapidly and having a very bad day, for much the same reason that adults frequently outsmart toddlers.
But some of us have been worried about this stuff for almost a quarter of a century now. Around 2007 or so, I originally expected things to start getting scary around 2025, mostly by extrapolating out Moore’s Law. By 2017, I breathed a sigh of relief: We’d made progress in AI, yes, but we didn’t seem to be on track for working machine intelligence any time soon. Since then, we made up the lost ground at breakneck speed.
Yudkowsky worked hard to warn people. But the potential threat of superintelligence was taken seriously by people before him.