AI as a Civilizational Risk Part 2/​6: Behavioral Modification

Altering people for the worse using AI is easier than altering them for the better.

Redefining what “Capital” means

Narrow AIs and the optimization algorithms behind social media news feeds have already caused many adverse outcomes without being AGIs. When people talk about AI on Twitter or Less Wrong, they often treat manufacturing or job replacement as the core function of AI. This framing imagines AI as a powerful extension of industrial capital, defined by its “ownership of the means of production.” This 19th/20th-century conception is becoming increasingly removed from the reality of capital today.

The core business of the largest tech companies, whether social media giants, advertising behemoths like Google, or middle-layer companies like Amazon, is not about production.

It is about behavioral modification.

This behavioral modification takes many forms, from digital addiction to engagement-driven news feeds in the case of social media. Advertising funds much of the technology world and has become a hyper-optimized system for getting people to buy or engage with products. Usually, the original focus of behavioral modification is habit formation: getting people to use the particular company’s site. Initial habit formation is innocent enough if done by cleaning up the UI and fixing issues. However, many of these goals eventually drift into over-optimization territory.

In the 21st century, the term “Capital” has effectively changed meaning: it is now less about the means of production and more about the means of behavior modification. The Age of Surveillance Capitalism describes this change of meaning. Historically, behavioral modification has mostly been a function of governments, religions, or media. Large-scale involvement of capital in behavioral modification puts companies into roles traditionally occupied by those institutions, which creates predictable conflict and mutual attempts at control. Governments frequently involve themselves in companies, while corporate lobbying of government has also increased.

Viewing AI as an extension of capital is a useful perspective. However, the manufacturing side of technocapital is less dangerous and less impactful than its behavioral modification side.

The leading carrier of behavioral modification problems is social media. Social media has converged on a horrifying equilibrium of optimizing for engagement. This convergence on “optimizing for engagement” did not have to happen, but it did. As a result, social media algorithms do not consider whether the discourse is uplifting or true, encouraging, worthy of attention, or socially cohesive. Attempts to combat “misinformation” frequently backfire, and even if they worked, they would be patches on the underlying utility function failure.

Sometimes the ideas that get promoted are the most controversial ones, which do not contribute to the advancement of knowledge but tend to cause the most division within a nation. In politics, these are known as “wedge issues.” They are used to split an electorate: either to consolidate your own supporters into a bloc set apart from the rest of society, or to drive a wedge within an opponent’s base to decrease their social cohesion. They may temporarily serve a particular politician’s interest, but they lower the social cohesion of society as a whole. Designating something a “wedge issue” is itself contentious, but wedge issues are as old as empires themselves. Even meta-wedge issues, such as whether appeals to the public (“populism”) are good or bad for a republic, are as old as the Roman empire (see Optimates and Populares on Wikipedia). Twitter, in effect, runs an algorithm specifically for finding wedge issues, systematizing the destruction of trust. This tendency towards controversy has been noted by many people; see, for example, “Sort By Controversial” on Slate Star Codex.
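To make the wedge-surfacing dynamic concrete, here is a minimal Python sketch. The scoring formula is an assumption, loosely modeled on Reddit-style “controversial” sorting rather than any platform’s actual code: a ranker that rewards posts with many, evenly split reactions will naturally float wedge issues to the top.

```python
def controversy_score(upvotes: int, downvotes: int) -> float:
    """Illustrative controversy score: large when a post draws many
    reactions AND those reactions are evenly split.

    Loosely modeled on Reddit-style 'controversial' sorting; NOT the
    actual ranking formula of Twitter or any other platform."""
    if upvotes == 0 or downvotes == 0:
        return 0.0
    magnitude = upvotes + downvotes  # total reactions
    balance = min(upvotes, downvotes) / max(upvotes, downvotes)  # 1.0 = perfect split
    return magnitude ** balance

# A post that splits the audience outranks a consensus post,
# even when the consensus post has comparable total engagement.
print(controversy_score(upvotes=500, downvotes=450))  # ~479 (wedge issue)
print(controversy_score(upvotes=900, downvotes=10))   # ~1.08 (consensus)
```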

Social Media vs Search Engines

Social media feed ranking “algorithms” are not necessarily neural networks, and they might not be called AI in the traditional sense. They could be full of “if” statements, but I lump them together with narrow AI because they are carefully A/​B tested to improve some metric. In other words, social media algorithms have an “optimization nature,” which is the thing we need to worry about in narrow AI, whether it takes the form of deep learning or other software.
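Here is a minimal sketch of that “optimization nature,” with hypothetical variant names and a toy engagement metric rather than any company’s real code: whichever ranking variant moves the measured metric wins, regardless of whether it is a neural network or a pile of “if” statements.

```python
import random

# Hypothetical ranking variants: each is just hand-written logic,
# not a neural network.
def chronological(posts):
    return sorted(posts, key=lambda p: p["timestamp"], reverse=True)

def comment_bait(posts):
    return sorted(posts, key=lambda p: p["comments"], reverse=True)

def simulated_engagement(feed):
    # Toy stand-in for "time on site": users linger on argumentative posts.
    return sum(p["comments"] for p in feed[:3])

def run_ab_test(posts, variants, measure, n_users=10_000):
    """Assign users to variants at random and ship whichever variant
    maximizes the measured metric. The metric, not any stated intent,
    is what ends up being optimized."""
    totals = {name: 0.0 for name in variants}
    counts = {name: 0 for name in variants}
    for _ in range(n_users):
        name, rank = random.choice(list(variants.items()))
        totals[name] += measure(rank(posts))
        counts[name] += 1
    return max(variants, key=lambda name: totals[name] / max(counts[name], 1))

posts = [
    {"timestamp": 5, "comments": 1},
    {"timestamp": 4, "comments": 3},
    {"timestamp": 3, "comments": 150},  # a heated argument
    {"timestamp": 2, "comments": 40},
    {"timestamp": 1, "comments": 7},
]
variants = {"chronological": chronological, "comment_bait": comment_bait}
print(run_ab_test(posts, variants, simulated_engagement))
# "comment_bait" wins, without anyone explicitly deciding to reward conflict.
```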

I have worked for both social media companies and search engines. Social media is drastically different from search engines in the utility function it measures itself by and optimizes. Social media optimizes for you spending as much time on the site as possible. Search engines, which are far better for mental health and may even improve it, optimize for you spending as little time on the site as possible. If you click a link as quickly as you can on Google or Bing and do not come back to the site within the same session, they consider this a massive win for both the link and themselves.

These two types of websites have metrics with nearly opposite utility functions. As you might expect, they have near-opposite effects on their users.
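A minimal sketch of the two near-opposite objectives, using made-up metric names rather than either industry’s real reward functions: the social media objective grows with session time, while the search objective rewards a quick, successful exit (the “click and don’t come back” case described above).

```python
from dataclasses import dataclass

@dataclass
class Session:
    seconds_on_site: float
    clicked_result: bool          # user clicked through to an external link
    returned_same_session: bool   # user came back to keep scrolling/searching

# Hypothetical social media objective: more time on site is always better.
def social_media_reward(s: Session) -> float:
    return s.seconds_on_site

# Hypothetical search engine objective: reward fast, successful exits;
# a long session or a bounce-back counts as a failure.
def search_engine_reward(s: Session) -> float:
    if s.clicked_result and not s.returned_same_session:
        return 1.0 / (1.0 + s.seconds_on_site)  # faster exit => bigger win
    return 0.0

quick_exit = Session(seconds_on_site=20, clicked_result=True, returned_same_session=False)
long_scroll = Session(seconds_on_site=3600, clicked_result=False, returned_same_session=True)

print(social_media_reward(long_scroll) > social_media_reward(quick_exit))    # True
print(search_engine_reward(quick_exit) > search_engine_reward(long_scroll))  # True
```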

Search engine use is positively correlated with increased brain function. Social media use, especially heavy passive use, is negatively correlated with mental health. Zuckerberg even mentioned this relationship between passive use and poor mental health in congressional testimony. This unfortunate situation has arisen partly from a lack of concern for the safety of narrow AI and highly optimized A/​B testing. Even though Facebook has admitted social media’s adverse effects on some users, the algorithms tend to change through patchwork fixes rather than fixes to the underlying utility function or decentralization. First, Facebook fought “game invite spam,” then “clickbait,” then “fake news,” then “misinformation.” The earlier fixes were correct, but the later ones became questionable and over-zealous. However, the underlying engagement maximization is likely to persist underneath the patchwork. Facebook’s recent push for “time well spent” is probably better than before but is still a far cry from the likely more correct “least time spent” approach of search engine ranking.

The design of platforms such as TikTok and Twitter is even worse because it makes parasociality the default mode of interaction, extending the failed logic of the “Age of Celebrity.” Parasociality is not a natural state of relationships and is a problem in itself. This design makes proper ranking solutions even harder to implement.

Imagine that social media had taken a different path. Social media could have become a subscription-based service with no ads. That design decision would have drastically altered the course of Western civilization. Social media could have gone with ads but designed their corporate structures to be more similar to those of search engines. Social media appeared after search engines, so this was entirely possible. Treating social media more like search would have meant separate teams handling ranking and ads, a non-engagement-based metric, and valuing quality from the beginning. It would not have been perfect, but it would likely have been much better. The search engine corporate structure embodies a good set of ideas, many of which were already in use at the time. If this had happened, the intensity of the culture war and the politicization of everything would have been lower. Many of the adverse side effects of the culture war, including the current escalations of great-power conflict, could have been avoided.

These kinds of issues are only going to escalate. Social media behavior modification aims to get you addicted to the platform while not necessarily providing benefit proportional to the time you spend on it. Addiction, waste, and even individual mental health problems are bad enough, but lowering social cohesion is the core driver of civilizational risk. Historical priors and the evidence from part 1 suggest we are already a low-social-cohesion civilization, and any further decrease is a civilizational risk.

Government Involvement in Behavioral Modification

However, algorithmic wedge-issue amplification and engagement-driven addiction are not the only issues. In the next ten years, we will also have problems with governments using AIs as a behavioral control mechanism. The two most likely goals are:

1. de-radicalization and avoiding rebellion and violence in general

2. radicalization against perceived enemies of the regime and encouragement of violence

They may contradict each other, but since different parts of the government implement them, both could be in play.

The first goal of avoiding rebellion may seem reasonable. However, without appropriate safety guarantees, even reasonable goals may backfire. Governments have historically worked to prop up regime legitimacy and decrease the legitimacy of anyone trying to question the regime. In the future, this can take the form of AI-based detection of dissidents through metadata or sentiment, and suppression of their opinions by pressuring tech companies. Behavioral modification by the government can also take the form of GPT-like bots shouting slurs at people insufficiently supportive of regime policies or spamming social media with bad arguments. People might post dissenting opinions less out of fear of losing reach or feeling hated. If dissidents do not know that bots are attacking them, they are likely to grow resentful of the fellow citizens who seem to disagree with them, setting the stage for further conflict. Suppression of dissidents and gaslighting of the population about what it actually thinks have happened all the time in the past. However, the process will become more streamlined due to narrow AI.

People in different countries evaluate this use case differently. Most people have a negative view of rebellion in their own countries, while frequently viewing rebellion positively in countries they do not like.

However, even if you view rebellion in your own country unfavorably, the methods governments use to minimize discontent will have massive side effects. Governments could swarm online discourse with enough bots pretending to be human to alter the population’s perception of what the population thinks. Visible shifts in the distribution of angry content can lead to an epidemic of demoralization and depression (also known as “blackpilling”), or to people thinking they are truly alone in believing something perfectly reasonable. Some of this may already be happening. Governments’ desire to use these tools increases as narrow AI grows in power. The danger to social cohesion does not require that any individual AI be particularly good at converting people; merely attenuating and amplifying specific topics can give the government the effect it desires at the cost of everyone’s sanity.

You can imagine a government bot, propped up by other bot accounts, that immediately criticizes somebody’s good idea whenever it happens to challenge a feature of the government. Taken to an extreme, this would squash any reasonable ideas of what a good government is. While you may view the “absence of rebellion” as positive, governments accomplish it by reducing social cohesion even further and driving more people into despair.

An even more dangerous idea is that certain parts of the government may want to provoke people into committing violence. Increasing violence in your own population sounds extremely strange, but certain government institutions have metrics for how many terrorists or criminals they catch. Suppose the current number of criminals somehow fails to fulfill these metrics. In that case, the departments in question will likely try to radicalize people online in order to “catch” them. What is concerning is the future use of AIs in radicalization. Sufficiently A/​B-tested deep learning algorithms, even without being AGI, can pick a few people out of a thousand and push them toward committing some crime so they can be entrapped. Once again, this is dangerous because it may further erode social cohesion and destroy trust among people. Many of the above tactics are known as “psyops” in internet slang. While collecting reliable information on this is hard, these operations will likely become automated in the next few years. When people begin to suspect “feds” embedded in their groups and stop trusting strangers, it is a bad sign for society.

Another alarming situation is a government using extralegal means of killing or intimidating people it does not like. One example could be radicalizing the targets’ neighbors and pushing them to commit violence against the targets and their property. One can imagine using deepfakes and persuasion AIs to rally crowds around a perceived injustice and get them to destroy the property of people whom the government views as undesirable. If the government wishes to intimidate the population, it could cause a repeat of the property destruction of 2020.

The media has poor incentives that sit somewhere between those of government and those of social media companies. The media, being part of the regime, wishes to maintain the regime’s legitimacy while also wanting to make money from clicks and controversy. The media is not known for heavy AI use, but it does run engagement-optimizing A/​B tests. Finding controversy where none exists, such as by making up stories about tech personalities, also creates wedge issues and lowers social cohesion. One could add an example here, but there are too many to list.

A hypothetical company could deploy a bunch of AIs to convince people to verbally attack those who criticize its products online. It could thereby degrade previously helpful channels of information, such as reviews. Online verbal attacks also lower social cohesion, even if some individuals can learn to “shrug them off.”

The above is a taste of how prominent players can push the buttons of behavioral modification while ignoring the side effects. The temptation to use narrow AI to boost one’s legitimacy by polluting the epistemic commons will only get stronger. Each country’s regime and its opponents will accuse each other of “fake news.” Telling whose accusation is correct is a non-trivial task.

Great power competition also enters the picture, given that the internal politics of almost all countries might degrade.

We are likely to see some failing attempts by countries to unify themselves by creating an external enemy. Some elites perceive inciting great-power animosity as a way to trade external social cohesion for internal social cohesion. This thinking is another leftover of the 20th century. Social cohesion in the West is so low that fights with external enemies just introduce another wedge issue. Once a government decides to amplify an external nation as an enemy, the use of AI in behavior modification becomes more extreme. Each country will use AI to try to radicalize its own inhabitants and demoralize the inhabitants of other nations. Once a nation reaches a threshold of animosity, it will attack its adversaries’ social cohesion algorithmically, the main weak point being social media ranking.

An example delivery vector for a plausible “algorithmic cultural attack” is TikTok. TikTok has many issues. It is highly addictive. It seems to give people visible mental health problems, including strange bodily movements. It is effectively a probe by China of American psychological and cyber defenses, and this probe has left America’s defenses wanting. Shockingly, America has not banned TikTok. America has yet to figure out how to measure TikTok’s adverse mental health effects or cybersecurity implications. This example exists in a China/​USA great-power environment that is not especially hostile by historical standards. Even in such an environment, China probes America’s defenses using an AI-powered social media algorithm, and it is already having fairly significant negative effects on people in America. The effects will likely worsen if the environment becomes more hostile than it is today.

Behavioral modification differs from manufacturing or job replacement in its safety profile. Many critiques and worries people have raised about manufacturing, job replacement, and the associated economic inequality and concentration of power apply far more severely in behavioral modification scenarios.

Advertising, social media, and governmental AIs create an enormous inequality in people’s capacity for behavioral modification. In the past, parents shaped their children’s behavior. Today, children are more likely than ever to be shaped by online content and global technocapital. This is particularly dangerous for young kids, who can already be subjected to highly addictive online videos. These are infohazards for children, though not yet for adults, and the government should have banned them a long time ago. As the ideas of what is suitable for children diverge between parents and engagement-optimizing technocapital, we are likely to see further clashes around parental rights and an ever-increasing desire among parents to commit violence in defense of their children.

Propaganda is not new. Nations using propaganda to control their populations is not new, and propaganda containing dangerous lies that destroy the population’s capacity to think or be socially cohesive is also not new. However, the optimization power of propaganda will increase, given AI’s capacity to generate a false sense of reality online through swarms of bots and GPT-like fake news articles. Hostility toward the populations of other great powers is likely to increase compared to previous wars. The desire to use extremely dangerous weapons is also likely to increase. Humanity does not have proper neutral arbiters to resolve these kinds of situations.

It is possible, but improbable, for AI creators to put their foot down and block governments from using their AIs for these purposes. Governments have ways to pressure them into providing the tools, and the creators themselves often believe in the causes of their government. The creators are not immune to the propaganda of the previous generation of narrow AIs and fake news. Thus, even though the creation of AIs currently has humans in the loop, those humans are strongly influenced by what they perceive as high status, which is primarily decided by what narrow AIs promote on Twitter and in other discourse.

A situation of low social cohesion is already dangerous enough without involving AI. However, unaligned narrow AI and the overall decline of social cohesion form a vicious feedback cycle. Software developers in a socially incohesive civilization are likely to have trouble agreeing on what type of software they want to build. Thus, if they are to build something together, they can end up with bizarre coordination points. Instead of building software based on a philosophically grounded idea of what is good for their nation, which may be tricky to specify, they fail to agree and instead optimize metrics that are easy to measure. They do not seem to care that those metrics might end up hurting their users because they do not feel solidarity with their fellow citizens or, in TikTok’s case, with the citizens of other countries. The less socially cohesive the underlying people are, the more their AI coordination points will diverge from human utility. While social cohesion is likely necessary for safer AGI, it is not sufficient.

Civilizational risk, especially over the next ten years, comes primarily from the use of the ever-increasing powers of narrow AI in behavioral modification. This premise drastically influences the plausible historical trajectory of civilization on the path toward AGI. Understanding behavioral modification as a c-risk changes our assessments of civilizational capacity, of what kind of research we should prioritize, and of what economic conditions, if any, will exist in the near term.

All parts

P1: Historical Priors

P2: Behavioral Modification

P3: Anti-economy and Signal Pollution

P4: Bioweapons and Philosophy of Modification

P5: X-risk vs. C-risk

P6: What Can Be Done
