We had Golden Gate Claude, now we have White Genocide Grok…
Mitchell_Porter
This seems like a Chinese model for superintelligence! (All the authors are Chinese, though a few are working in the West.) Not in the AIXI sense of something which is optimal from the beginning, but rather something that could bootstrap its way to superintelligence. One could compare it to Schmidhuber’s Gödel machine concept, but more concrete, and native to the deep learning era.
(If anyone has an argument as to why this isn’t a model that can become arbitrarily intelligent, I’m interested.)
If I have understood correctly, you’re saying that OpenAI should be forecasting greater revenue than this, if they truly think they will have AIs capable of replacing entire industries. But maybe they’re just being cautious in their forecasts?
Suppose I have a 3D printing / nanotechnology company, and I think that a year from now I’ll have an unlimited supply of infinity boxes capable of making any material artefact. World manufacturing is worth over US$10 trillion. If I thought I could put it all out of business by charging just 10% of what current manufacturers charge, I could claim expected revenue of $1 trillion.
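Spelled out, with the round numbers above (and assuming I capture essentially the whole market):

$$\text{expected revenue} \approx 0.10 \times \$10~\text{trillion per year} = \$1~\text{trillion per year}.$$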
Such a prediction would certainly be attention-grabbing, but maybe it would be reckless to make it? Maybe my technology won’t be ready. Maybe my products will be blocked from most markets. Maybe someone will reverse-engineer and open-source the infinity boxes, and prices will crash to $0. Maybe I don’t want the competition or the government to grasp just how big my plans are. Maybe the investors I want wouldn’t believe such a scenario. There are a lot of reasons why a company that thinks it might be able to take over the economy, or even the world, would nonetheless not put that in its prospectus.
OpenAI gets a lot of critical attention here, because it’s been the leader in many ways. But what about Google AI? Many people think it’s the leader now, yet I don’t see anywhere near as much critical scrutiny of its decision-making process.
That gives “Eternal September” a new meaning…
The view that Heisenberg advocates—reductionism had reached a limit, and a new paradigm was needed—was a highly influential school of thought in the 1960s. In particle physics, there is a mathematical object called the S-matrix (scattering matrix), which tabulates scattering amplitudes (complex numbers whose squared magnitudes give the probability that, if these N particles enter a collision, these other M particles will be what comes out). Quantum electrodynamics (a theory with electrons and photons, let’s say) is a prototypical quantum field theory in which the S-matrix can be calculated from the stipulation that electrons and photons are fundamental. For the weak interactions (later unified with electromagnetism), this reductionist method also works.
But for the strong interactions, field theory looked intractable, and a new philosophy was advanced: that the S-matrix itself should be the central mathematical object in the theory. Remember that quarks were never seen by themselves, only protons, neutrons, and a hundred other types of “hadron”. The idea of nuclear democracy was that the S-matrix for these hundred seemingly equi-fundamental particle species would be derived from postulates about the properties of the S-matrix, rather than from an underlying field theory. This was called the bootstrap program; it is how the basic formulae of string theory were discovered (before they had even been identified as arising from strings), and it’s still used to study the S-matrix of computationally intractable theories.
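To make “postulates about the properties of the S-matrix” concrete: writing $|i\rangle$ for an incoming state of $N$ particles and $|f\rangle$ for an outgoing state of $M$ particles, the basic objects are

$$S_{fi} = \langle f,\text{out} \mid i,\text{in} \rangle, \qquad P(i \to f) = |S_{fi}|^2,$$

and the bootstrap postulates were general constraints on $S_{fi}$ itself, such as unitarity ($S^\dagger S = \mathbf{1}$, so that probabilities sum to one), analyticity in the complexified momenta, and crossing symmetry, with no reference to underlying constituents.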
These days, the philosophy that the S-matrix is primary still has some credibility in quantum gravity. Here the problem is not that we can’t identify ultimate constituents, but rather that the very idea of points of space-time seems problematic, because of quantum fluctuations in the metric. The counterpart of the 1960s skepticism about quarks would be that the holographic boundary of space-time is fundamental. For example, in the AdS/CFT correspondence, scattering events in Anti-de Sitter space (in which particles approach each other “from the boundary”, interact, and then head back to the boundary) can be calculated entirely within the boundary CFT, without any reference to AdS space at all, which is regarded as emergent from the boundary space. The research program of celestial holography is an attempt to develop the same perspective within the physically relevant case of flat space-time. The whole universe that we see would be a hologram built nonlocally from entanglement within a lower-dimensional space…
The eventual validation of quarks as particles might seem like a sign that this radical version of the holographic philosophy will also be wrong in the end, and perhaps it will be. But it really shows the extent to which the late thoughts of Heisenberg are still relevant. Holographic boundaries are the new S-matrix: they are a construct which has made quantum gravity uniquely tractable, and it’s reasonable to ask if they should be treated as fundamental, just as it was indeed entirely reasonable for Heisenberg and the other S-matrix theorists to ask whether the S-matrix itself was the final word.
Ugh, I was using LW’s custom reaction emoticons to annotate this comment, and through a fumble have ended up expressing a confidence of 75% in the scenario that AI will “otherwise enforce on us a pause”, and I don’t see how to remove that annotation.

I will say that, alignment aside, the idea that an advanced AI will try to halt humanity’s AI research so it doesn’t produce a rival makes a lot of sense to me.
Ted Cruz mentioned how his daughter was using ChatGPT when texting him. I wonder how many of these senators and CEOs, and their staffers and advisors, are already doing the same, when they try to decide AI policy. I guess that would be an example of weak-to-strong superalignment :-)
Heisenberg versus quarks: one of the best lifelong physics thinkers, encountering one of the most subtle topics in physics. When he says
The questions about the statistics of quarks, about the forces that hold them together, about the particles corresponding to these forces, about the reasons why quarks never appear as free particles, about the pair-creation of quarks in the interior of the elementary particle—all these questions are more or less left in obscurity.
… he is raising all the right questions, and their resolution required a field theory with completely unprecedented behaviors. The statistics of quarks required a new threefold quantum property, “color”; the forces that hold them together were described by a new kind of quantum field, a strongly coupled Yang-Mills field; the particles corresponding to those forces are the gluons, and like quarks, they never appear as free particles, because of a new behavior, “confinement”.
The deeper story here is a struggle of paradigms in the 1960s, between the old search for the most elementary particles, and the idea of a “nuclear democracy” of transformations among a large number of equally fundamental particle species. We now see those many particle types as due to different combinations of quarks, but at the time no-one understood how quarks could literally be particles (for the reasons touched on by Heisenberg), and they were instead treated as bookkeeping devices, akin to conserved quantities.
The quark theorists won, once the gauge theory of gluons was figured out; but nuclear democracy (in the guise of “S-matrix theory”, pioneered by Heisenberg) left its mark too, because it gave rise to string theory. They aren’t even completely distinct paradigms; there are very close relationships between gauge theory and string theory, though not close enough that we know exactly which string theory corresponds to the quarks and gluons of the standard model.
Incidentally, is there any meaningful sense in which we can say how many “person-years of thought” LLMs have already done?
We know they can do things in seconds that would take a human minutes. Does that mean those real-time seconds count as “human-minutes” of thought? Etc.
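One crude way to operationalize it would be to credit the model with the time a human would need to produce the same token stream, rather than with wall-clock time. A minimal sketch, in which every constant is an illustrative guess rather than a measured number:

```python
# Back-of-envelope conversion from tokens generated to "person-years of thought".
# All constants below are illustrative assumptions, not measured data.

HUMAN_WORDS_PER_MINUTE = 150             # assumed human composing/thinking throughput
TOKENS_PER_WORD = 1.3                    # rough tokens-per-word ratio for English text
MINUTES_PER_PERSON_YEAR = 60 * 8 * 250   # an 8-hour day, 250 working days per year

def person_years_of_thought(tokens_generated: float) -> float:
    """Convert a model's token count into 'human-equivalent' person-years."""
    words = tokens_generated / TOKENS_PER_WORD
    minutes = words / HUMAN_WORDS_PER_MINUTE
    return minutes / MINUTES_PER_PERSON_YEAR

# Example: a deployment that has served a trillion tokens.
print(f"{person_years_of_thought(1e12):,.0f} person-years")  # ~43,000
```

Whether that unit conversion is meaningful is, of course, exactly the question I’m asking.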
Liron: … Turns out the answer to the symbol grounding problem is like you have a couple high dimensional vectors and their cosine similarity or whatever is the nature of meaning.
Could someone state this more clearly?
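My best guess at a literal reading is that “meaning” is being identified with the geometry of embedding vectors. A toy sketch of the computation (the 4-dimensional vectors are made up; real embeddings have thousands of dimensions):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means 'same direction'."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up toy embeddings; a real model learns these vectors from training data.
king  = np.array([0.90, 0.10, 0.80, 0.30])
queen = np.array([0.85, 0.20, 0.75, 0.40])
table = np.array([0.10, 0.90, 0.20, 0.70])

print(cosine_similarity(king, queen))  # ~0.99: "close in meaning"
print(cosine_similarity(king, table))  # ~0.38: "far apart in meaning"
```

The claim, as I understand it, is that relations like these exhaust what “meaning” is; that’s the part I’d like to see stated more carefully.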
Jim: … a paper that looked at the values in one of the LLMs as inferred from prompts setting up things like trolley problems, and found first of all, that they did look like a utility function, second of all, that they got closer to following the VNM axioms as the network got bigger. And third of all, that the utility function that they seemed to represent was absolutely bonkers
What paper was this?
scientifically falsifiable
How is it falsifiable?
I think it’s very good to have people around who are saying “cut back on social media”, “get off social media”, as a counter to its addictive power.
And yet… If I take the title literally, I am being told that I should quit social media entirely, as soon as possible, because in the near future, it will be so addictive that I will be literally unable to quit.
When you first raised this idea, I asked what would happen to people who don’t get out in time. In this post, we now have a concrete scenario. The protagonist doesn’t die. They don’t go mad. They don’t become anyone’s minion. They just… spend a lot of time irritated, spend all day watching videos, and lose touch with some real people.
Well, that’s nobody’s ideal, but it’s not actually worse than the human condition has been for large numbers of people throughout history. “Lives of quiet desperation”, I think, have been pretty common in the agricultural and industrial eras. In terms of historical experience, it is actually normal for people to end up limping through life with some affliction that they never quite get over, whether it’s injury, trauma, or some familial or national destiny that they just can’t escape… To learn that, in the information age, some people become unwholesome computer or media addicts is just to write the next chapter of that story.
Let me be clear: I’m not quite urging apathy about social media addiction. It’s just that I was expecting something more apocalyptic as the payoff: that humanity would be utter captives of the content farms, perhaps later to be herded into Matrix pods or assembled into armies of meme-controlled zombies. Instead, what you’re describing is more like a chronic misery specific to the information age.
It’s like someone warning that if you abandon hunting and gathering, you’ll end up standing around all day watching farm animals, or if you leave the farm for the big industrial city, you’ll end up stooped and maimed by factory work. All that actually happened, but there were also huge upsides to the new order in each case.
After all, there’s actually a lot of good stuff that comes through social media. With a keyword search, I can find up-to-the-second information and perspectives on something that is happening, including censored perspectives. I can follow the news about topics that interest only a very niche audience. I can eavesdrop on, and even participate in, all kinds of discussions that would otherwise be out of my reach. I can track down lost friends, find work, or simply indulge my curiosity.
Of course there are formidable downsides too. You can overindulge (I have a weakness for reaction videos), you can burn out certain faculties, you can forget your own life amidst a million distractions, and just as in real life, there are far worse things lying in wait: scammers, grifters, toxic personalities and communities, political botnets; and perhaps there are even architects of addiction who deserve death as much as any fentanyl dealer does.
It’s just that you haven’t really made the case that the social Internet will become nothing but a prison of blighted lives. Life in that space is much more like living in a city. There are risks and evils specific to urban life, and city dwellers must learn to avoid them, and lots of people fall prey to them. But there are also good things that can only happen in cities, and there are new good things that only happen online.
I haven’t mentioned the AI factor so far, but it is central to your scenario. My response again is that there are positives and negatives, and in some cases it may not even be clear which is which. The combination of AI and social media may lead to new symbioses that look horrifying to outsiders, but which have a merit and integrity of their own. As AI becomes more and more capable, the question of AI on social media just blends into the broader question of humanity’s destiny in a world with AI, and ultimately, a world with artificial superintelligence.
How often will a civilization with the capability to perform such a simulation have anything to learn from it?
Yes, thanks. And someone should do the same analysis, regarding coverage of AI 2027 in American/Western media. (edit: A quick survey by o3)
spreading the idea of “heroic responsibility” seems, well, irresponsible
Is this analogous to saying “capabilities research is dangerous and should not be pursued”, but for the human psyche rather than for AI?
Your comment has made me think rather hard on the nature of China and America. The two countries definitely have different political philosophies. On the question of how to avoid dictatorship, you could say that the American system relies on representation of the individual via the vote, whereas the Chinese system relies on representation of the masses via the party. If an American leader becomes an unpopular dictator, American individuals will vote them out; if a Chinese leader becomes an unpopular dictator, the Chinese masses will force the party back on track.
Even before these modern political philosophies, the old world recognized that popular discontent could be justified. That’s the other side of the mandate of heaven: when a ruler is oppressive, the mandate is withdrawn, and revolt is justified. Power in the world of monarchs and emperors was not just about who’s the better killer; there was a moral dimension, just as democratic elections are not just a matter of who has the most donors and the best public relations.
Returning to the present and going into more detail: America is, let’s say, a constitutional democratic republic in which a party system emerged. There’s a tension between the democratic aspect (will of the people) and the republican aspect (rights of the individual), which crystallized into an opposition found in the very names of the two main parties; though in the Obama-Trump era, the ideologies of the two parties evolved into transnational progressivism and populist nationalism.
These two ideologies had different attitudes to the unipolar world-system that America acquired, first by inheriting the oceans from the British empire, and then by outlasting the Russian communist alternative to liberal democracy in the ideological Cold War. For about two decades, the world system was one of negotiated capitalist trade among sovereign nations, with America as the “world police” and also a promoter of universal democracy. In the 2010s, this broke down as progressivism took over American institutions, including its external relations, and world regions outside the West increasingly asserted their independence from American values. The appearance of populist nationalism inside America makes sense as a reaction to this situation, and in the 2020s we’re seeing how that ideology acts within the world system: America is conceived as the strongest great power, acting primarily in the national interest, with a nature and a heritage that it will not try to universalize.
So that’s our world now. Europe and its offshoots conquered the world, but imperialism was replaced by nationalism, and we got the United Nations world of several great powers and several hundred nations. America is the strongest, but the other great powers are now following their own values, and the strongest among the others is China. America is a young offspring of Europe on a separate continent; modern China is the latest iteration of civilization on its ancient territory. The American political philosophy is an evolution of some ancient European ideas; the Chinese political philosophy is an indigenous adaptation of an anti-systemic philosophy from modern Europe.
One thing about Chinese Marxism that differs from the old Russian Marxism is that it is more “voluntarist”. Mao regarded Russian Marxism as too mechanical in its understanding of history; according to Mao, the will of the people and the choices of their political leadership can make a difference to events. I see an echo of this in the way that every new general secretary of the Chinese Communist Party has to bring some new contribution to Marxist thought, most recently “Xi Jinping Thought”. The party leader also has to be the foremost “thought leader” in Chinese Marxism, or must at least lend their name to the ideological state of the art (Wang Huning is widely regarded as the main Chinese ideologist of the present). This helps me to understand the relationship between the party and the state institutions. The institutions manage society and have concrete responsibilities, while the party determines and enforces the politically correct philosophy (analogous to the role that some now assign to the Ivy League universities in America).
I’ve written all this to explain in greater detail the thinking which I believe actually governs China. To just call China an oppressive dictatorship is to miss the actual logic of its politics. There are certainly challenges to its ideology. For example, the communist ideology was originally meant to ensure that the country was governed in the interest of the worker and peasant classes. But with the tolerance of private enterprise, more and more people become the kind of calculating individual agent you have under capitalism, and arguably representative democracy is more suited to such a society.
One political scientist argues that ardor for revolutionary values died with Mao, leaving a void which is filled partly by traditional values and partly by liberal values. Perhaps it’s analogous to how America’s current parties and their ideologies are competing for control of a system that (at least since FDR) was built around liberal values; except that in China, instead of separate parties, you have factions within the CCP. In any case, China hasn’t tilted to Falun Gong traditionalism or State Department democratization, instead Xi Jinping Thought has reasserted the centrality of the party to Chinese stability and progress.
Again, I’m writing this so we can have a slightly more concrete discussion of China. There are also a bunch of minor details in your account that I believe are wrong. For example, “Nationalist China” (the political order on the Chinese mainland, between the last dynasty and the communist victory) did not have regular elections as far as I know. They got a parliament together at the very beginning, and then that parliament remained unchanged until they retreated to Taiwan (they were busy with famines, warlordism, and the Japanese invasion); and then Taiwan remained a military-run regime for forty years. The Uighurs are far from being the only significant ethnic group apart from the Han; there are several others of the same size. Zhang Yiming and Rubo Liang are executives from ByteDance, the parent company of TikTok (consider the relationship between Google/Alphabet and YouTube); I think Zhang is the richest man in China, incidentally.
I could also do more to explain Chinese actions that westerners find objectionable, or dig up the “necessary evils” that the West itself carries out. But then we’d be here all day. I think I do agree that American society is politically friendlier to the individual than Chinese society; and also that American culture, in its vast complexity, contains many, many valuable things (among which I would count not just notions like rule of law, human rights, and various forms of respect for human subjectivity, but also the very existence of futurist and transhumanist subcultures; they may not be part of the mainstream, but it’s significant that they get to exist at all).
But I wanted to emphasize that China is not just some arbitrary tyranny. It has its freedoms, it has its own checks and balances, it has its own geopolitical coalitions (e.g. BRICS) united by a desire to flourish without American dependence or intervention. It’s not a hermit kingdom that tunes out the world (witness, for example, the frequency with which very western-sounding attitudes emerge from their AIs, because of the training data that they have used). If superintelligence does first emerge within China’s political and cultural matrix, it has a chance of being human-friendly; it will just have arrived at that attractor from a different starting point, compared to the West.
Some of the recent growing pains of AI (flattery, selfish rule-breaking) seem to be reinventing aspects of human nature that we aren’t proud of, but which are ubiquitous. It’s actually very logical that if AIs are going to inhabit more and more of the social fabric, they will manifest the full spectrum of social behaviors.
OpenAI in particular seems to be trying to figure out personality, e.g. they have a model called “Monday” that’s like a cynical comedian that mocks the user. I wonder if the history of a company like character.ai, whose main product is AI personality, can help us predict where OpenAI will take this.
I can imagine an argument analogous to Eliezer’s old graphic illustrating that it’s a mistake to think of a superintelligence as Einstein in a box. I’m referring to the graphic where you have a line running from left to right: on the left you have chimp, ordinary person, and Einstein all clustered together, and then far away on the other side, “superintelligence”; the point being that superintelligence far transcends all three.
In the same way, the nature of the world when you have a power that great is so different that the differences among all human political systems diminish to almost nothing by comparison; they are just trivial reorderings of power relations among beings so puny as to be almost powerless. Neither the Chinese nor the American system is built to include intelligent agents with the power of a god; that’s “out of distribution” for both the Communist Manifesto and the Federalist Papers.
Because of that, I find it genuinely difficult to infer from the nature of the political system, what the likely character of a superintelligence interested in humanity could be. I feel like contingencies of culture and individual psychology could end up being more important. So long as you have elements of humaneness and philosophical reflection in a culture, maybe you have a chance of human-friendly superintelligence emerging.
Somehow this has escaped comment, so I’ll have a go. I write from the perspective of whether it’s suitable as the value system of a superintelligence. If PRISM became the ethical operating system of a posthuman civilization born on Earth, for as long as that civilization managed to survive in the cosmos—would that be a satisfactory outcome?
My immediate thoughts are: It has a robustness, due to its multi-perspective design, that gives it some plausibility. At the same time, it’s not clear to me where the seven basis worldviews come from. Why those seven, and no others? Is there some argument that these seven form a necessary and sufficient basis for ethical behavior by human-like beings and their descendants?
If I dig a little deeper into the paper, the justification is actually in part 2. Specifically, on page 12, six brain regions and their functions are singled out, as contributing to human decision-making at increasingly abstract levels (for the hierarchy, see page 15). The seven basis worldviews correspond to increasing levels of mastery of this hierarchy.
I have to say I’m impressed. I figured that the choice of worldviews would just be a product of the author’s intuition, but they are actually grounded in a theory of the brain. One of the old dreams associated with CEV was that the decision procedure for a human-friendly AI would be extrapolated in a principled way from biological facts about human cognition, rather than just from a philosophical system, hallowed tradition, or set of community principles. June Ku’s MetaEthical AI, for example, is an attempt to define an algorithm for doing this. Well, this is a paper written by a human being, but the principles in part 2 are sufficiently specific that one could actually imagine an automated process following them, and producing a form of PRISM as its candidate for CEV! I’d like @Steven Byrnes to have a look at this.