Social media use probably induces excessive mediocrity

Disclaimer: This is valuable for understanding the AI industry and the landscape that it is a part of, and for surviving the decade. It is NOT an EA cause area, and should not distract from the overriding priority of AGI.

TL;DR

Part 1 (4 minutes):
Modern social media platforms, and people with backdoor access (e.g. botnets), use AI not just to manipulate public opinion, but also to induce and measure a wide variety of measurable behavior, including altering emotions and values/drives. There are overriding incentives to wire the platform and the users to induce akrasia and power-seeking behavior, and to make users feel repulsed by complex thought, not just to keep people using the platform, but also to improve each user’s predictability and data quality, as this keeps them and their environment more similar to most of the millions of people that data has already been collected on.

Part 2 (4 minutes):
Social media users can also be deliberately used by powerful people to attack the AI safety community; the platforms offer a wide variety of creative attacks for attackers to select from based on their preferences. As it currently stands, we would be caught with our pants down and the attackers would get away with it, which incentivizes further attacks.

Part 3 (6 minutes):
Various commentary on the implications of these systems, and our failure to spot this until it was too late.

Part 1: Optimizing for mediocrity

  1. Social media probably maximizes the propensity to return to daily habits, akratic ones in particular, as well as akratic tendencies and mindsets in general. These correlate strongly with measurably increased social media use and reduced quit rates, since social media use is itself akratic.

    1. Altruistic, intelligent, and agentic tendencies probably induce causality cascades just on their own, as they introduce unusual behavior and tear down Schelling fences; large sample sizes of human behavior data are powerful, but individual human minds are still complex (especially in elite groups), and data quality and predictiveness increase as more variables are controlled, temporarily making the mind more similar to the millions of other minds that data has already been collected on. At least, this holds while the mind is in the social media news feed environment, which offers the best opportunity for data collection because it gives every user a highly similar experience in a controlled setting.

    2. This also yanks people in the opposite direction of what CFAR was trying to do.

  2. Social media systems attempt to maximize the user’s drive to prioritize social reality over objective reality, as social reality drives people to care more about what happens on social media, measurably increasing social media use and reducing quit rates, whereas more focus on objective reality makes AI safety people more likely to optimize their lives, measurably reducing social media use and increasing quit rates.

    1. Maximally feeding the drive to accumulate and defend perceived social status, vastly in excess of the natural drive that the user would have otherwise experienced.

    2. There is also a mental health death spiral from being strongly pulled between objective reality and social reality, as described in Valentine’s “Here’s the exit” post, or the original quokka thread on Twitter.

  3. There are also the long-term effects of instant gratification, a topic which I know is not new here, and is groan-inducing due to the NYT op-ed clowns that keep vomiting all over the topic, but it’s still worth noting that instant gratification makes people weak in the same way that dath ilan makes people strong.

The risk here is not prolonged exposure to mediocrity, it is prolonged exposure to intense optimization power, with mediocrity being what is optimized for, as controlling for variables increases data quality and predictive power, which in turn is necessary to maximize use and minimize quit rates.

I think some of these are probably false positives, or were fixed by a random dev, or had a small effect size, but I doubt that all of them are. If there is 1) hill climbing based on massive sample sizes of human behavior data, and 2) continuous experimentation and optimization to measurably increase use time and reduce quit rates/risk, then EA and AI safety people have probably been twisted out of shape in some of the ways described here.

Any one of these optimizations would be quite capable of twisting much of the AI safety community out of shape, as tailored environments (derived from continuous automated experimentation on large-n systems) create life experiences optimized to fit the human mind the way a glove fits a hand. At the scale at which this optimization is taking place, comparing millions or even billions of minds to each other to find webs of correlations and predict future behavior such as belief formation, effects should be assumed to be intense by default.
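To make the population-scale prediction claim concrete, here is a minimal, purely hypothetical sketch in Python of estimating a new user’s behavior (quit risk, in this toy case) from the most similar users among those already observed. Every feature, dataset, number, and name here is an assumption for illustration, not anything a platform has disclosed:

```python
# Toy sketch: predicting a user's behavior by comparing their activity
# vector against a large population of previously observed users.
# Purely illustrative; features, scale, and model choice are assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical dataset: each row is one user's behavior features
# (session length, scroll speed, reaction latency, ...), plus whether
# they quit the platform within 30 days.
n_users, n_features = 100_000, 32          # stand-in for "millions"
population = rng.normal(size=(n_users, n_features))
quit_within_30d = rng.random(n_users) < 0.1

# Index the population so any new user can be compared against it.
index = NearestNeighbors(n_neighbors=50).fit(population)

def predict_quit_risk(user_vector: np.ndarray) -> float:
    """Estimate quit risk as the quit rate among the most similar users.

    The more a user's behavior resembles the bulk of the population
    (fewer uncontrolled variables), the tighter this estimate gets --
    which is the sense in which induced 'mediocrity' improves data quality.
    """
    _, neighbor_ids = index.kneighbors(user_vector.reshape(1, -1))
    return float(quit_within_30d[neighbor_ids[0]].mean())

print(predict_quit_risk(rng.normal(size=n_features)))
```

The point of the sketch is only structural: prediction quality scales with how closely a mind resembles the already-indexed population, which is exactly the incentive to keep minds similar.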

Although algorithmic progress (from various ML applications, not LLMs) theoretically gives tech companies more leeway to detect and reduce these effects wherever they appear, they have little incentive to do so. Market competition means that they might successfully coordinate to steer systems and people away from highly visible harms like sleep deprivation, but less visible harms (like elite thinking being rendered mediocre) are harder to coordinate the industry around, due to the need to interfere with the algorithms that maximize use, the difficulty of measurement, and the small sample sizes for strange effects on elite groups.

Furthermore, the system is too large and automated to manage. A user could receive 100 instances of manipulation per hour or more, with most of the instances calculated by automated systems on the same day they are deployed, and with each instance taking the form of a unique video or post (or combination of videos or posts) that deploys a concept or combination of concepts in strategic ways. Unlike with LLMs, these concepts and combinations are evaluated entirely by measurable effect, such as reducing quit rates or causing frequent negative reactions to a targeted concept, rather than by any understanding of the content of the message; only the results are measured, and there is no limit to how complex or galaxy-brained the causal mechanism can be, affecting the deep structures of the brain from all sorts of angles, so long as it steers users’ behavior in the intended direction, e.g. reducing quit rates.
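As a toy illustration of the “only the results are measured” point, here is a minimal hypothetical sketch of an outcome-only selection loop: an epsilon-greedy bandit that chooses which post to show based solely on a measured behavioral signal, with no representation of what the post means. The class, variant names, and reward signal are all assumptions, not any platform’s actual mechanism:

```python
# Toy sketch of outcome-only optimization: an epsilon-greedy bandit that
# selects which post to show next based solely on a measured behavioral
# signal (here, whether the user kept scrolling), with no model of what
# the post means. Entirely illustrative.
import random

class OutcomeOnlySelector:
    def __init__(self, candidate_posts, epsilon=0.1):
        self.epsilon = epsilon
        self.shows = {post: 0 for post in candidate_posts}
        self.kept_scrolling = {post: 0 for post in candidate_posts}

    def pick(self):
        # Explore occasionally; otherwise exploit the post with the best
        # measured continuation rate. The causal mechanism behind that
        # rate is never represented anywhere in the system.
        if random.random() < self.epsilon:
            return random.choice(list(self.shows))
        return max(self.shows,
                   key=lambda p: self.kept_scrolling[p] / (self.shows[p] or 1))

    def record(self, post, user_kept_scrolling):
        self.shows[post] += 1
        self.kept_scrolling[post] += int(user_kept_scrolling)

# Hypothetical usage: the loop only ever sees behavior, never meaning.
selector = OutcomeOnlySelector(["post_A", "post_B", "post_C"])
for _ in range(1000):
    post = selector.pick()
    selector.record(post, user_kept_scrolling=random.random() < 0.5)
```

Real systems would be vastly larger and more sophisticated, but the structural point stands: nothing in the loop needs to understand the content, only the measured response.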

This is why social media use, in addition to vastly increasing the attack surface of the AI safety community in a changing world, is also likely to impede or even totally thwart self-improvement. Inducing mediocrity controls for variables in a complex system (the human mind), which increases data quality and behavior prediction success rates, but inducing mediocrity also substantially reduces altruism effectiveness.

Therefore, even in a world where deliberate adversarial use of social media was somehow guaranteed to not happen by the people running or influencing the platforms, which is not the world we live in, social media users would still not be able to evaluate whether social media use is appropriate for the AI safety community.

Part 2: Deliberate use by hostile actors

Of course, if anyone wanted to, they could crank up any of these dynamics (or all of them at once), alternate between ramping the entire community up or down around major events like the FTX collapse, or even target specific kinds of people. Intelligence is not distributed evenly among humans, but most people tend to think they’re clever and pat themselves on the back for thinking of a way to make something they did look like an accident or like natural causes (or, in particular, making it look like someone else did it, which is also one of the candidate explanations for why human intelligence evolved to sometimes succeed at observing objective reality instead of just social reality: these capabilities let people eliminate less-savvy rivals, increasing the probability that they themselves become the tribe’s chief and subsequently maximize their own offspring). This situation becomes worse when these “natural causes” are prevalent on their own.

Most or all of these dynamics, and many others not listed here, can be modulated as variables based on what maximizes visible disruption. This notably includes making individuals in the community increasingly fixated on low-trust behaviors, such as obsessions with power games or zero-sum social status competitions, which cause distrust death spirals.

Precise steering is unnecessary if you can adjust social graphs and/or turn people into sociopaths who see all of AI safety as their personal sacrificial lamb (as sociopaths do). Nihilism is a behavior that is fairly easy to measure, and therefore easy to induce.

This also includes clown attacks: unpersuasive, bumbling critics of social media cause people to perceive criticism of social media as low-status, which measurably reduces quit rates (regardless of whether the devs are aware of the exact causal dynamic at all). It’s probably not very hard to trigger social media arguments at strategic times and places, because there are billions of case studies of tension building up and releasing as a person scrolls through a news feed (with each case study accompanied by plenty of scrolling biodata).

Social media platforms face intense market incentives to set up automatic optimization to make people feel safe, as users only use a platform if they think they are safe on it. This results in rather intense and universal optimization to combine posts that generate a wide variety of feelings, increasing the probability that a diverse variety of users all end up feeling safe while on the platform (including, but not limited to, unintentionally persuading them to feel unsafe when they are off the platform, e.g. increasing the visibility of content that makes them worry that they will accidentally say something racist IRL, losing all of their social status, if they are not routinely using the platform and staying up-to-date about the latest things that recently became racist to say).

Artificial manufacturing of Ugh Fields around targeted concepts is trivial, although reliably high success rates are much harder, especially for extraordinary individuals and groups (although even for harder targets, there are so many angles of attack that eventually something will stick in a highly measurable way).

News feeds are capable of maximizing the feeling of relief and satisfaction while simultaneously stimulating cognition such that people are drained in ways less perceivable than stress as it is conventionally understood, similar to how Valentine managed to notice that he felt drained a few hours after drinking coffee. Social media platforms are allowed to mislead users into falsely believing that stress is being relieved, and this would happen by default if a false belief of stress relief maximizes use, while the best and most useful data happens to be produced by stimulating parts of the brain that are easily spent over the course of the day, or that are useful for other things such as creative thinking. Or maybe that combination happens to be what minimizes quit rates, e.g. “feeling engaged”.

Attacks (including inducing or exacerbating periods of depression, akrasia, or Malthusian nihilism, such as the post-FTX Malthusian environment) can even be ramped up and down to see what happens; or for no reason at all, just to trip our sensors or see if we notice. That is how great the power difference is in the current situation.

Exposing yourself to this degree of optimization pressure, in an environment as hostile and extractive as this, is just not a reasonable decision. These systems are overwhelmingly stacked towards capabilities to observe and cause changes in human behavior, including beliefs and values.

Data collected and stored on AI safety-focused individuals now can also be used against individuals, orgs, and communities in the near future, as AI safety becomes more prominent, as algorithmic progress improves (again, ML, not LLMs), and as power changes hands to people who are potentially less squeamish.

Part 3: Implications and Commentary

Decoupling that’s actually hard

This is a great opportunity to practice decoupling where it’s actually hard. The whole thing with high decoupling and low decoupling revolves around a dynamic where people have a hard time decoupling on issues that they actually care about. AGI is an issue that people correctly predict matters a ton (the enemy’s gate is down), whereas religious worship and afterlives are issues that people incorrectly predict matter a ton.

The variation within a genetically diverse species like humans implies that plenty of people will have a hard time taking AI safety seriously in the first place, whereas plenty of others will have a hard time decoupling AI safety from the contemporary use of AI manipulation. Succeeding at both is necessary to intuitively and correctly grasp that the contemporary use of AI for manipulation is instrumentally valuable to understand, for the sake of understanding AI race dynamics and the community attack surface, while simultaneously not being valuable enough to compete with AI safety as an EA cause area. People like Gary Marcus did not even pass the first hurdle, and people who can’t pass both hurdles (even after I’ve pointed them out) are not the intended audience here.

A big problem is that there are just a ton of people who don’t take AI safety seriously, or who fail to take it as seriously as much smaller issues, and who over time learned to be vague about this whenever they’re in public, because whenever they reveal that they don’t take AI safety seriously they get RLHF’d into not doing that by the people in the room who try to explain why that’s clearly wrong. These RLHF’d people are basically impostors and they’re unambiguously the ones at fault here, but it sure does make it hard to write, and get upvoted for writing, about topics that are directly decision-relevant to the AGI situation but aren’t themselves about AGI at all.

After understanding mass surveillance, extant human manipulation and research, the tech industry, and AI geopolitics as well as possible, you are ready to understand the domain that AI safety takes place in, and to apply solid world models to issues that will actually matter in the end, like AGI, or community building/health/attack surface.

Inscrutable and rapidly advancing capabilities within SOTA LLMs might make it difficult for people to mentally disentangle modern user data-based manipulation from AGI manipulation (e.g. the probably hopeless task of AI boxing), but it would be nice if people could at least aspire to rise to the challenge of coherently disentangling important concepts.

Inner vs Outer community threat model

The focus on internal threats within the AI safety community is associated with high social status, whereas the focus on external threats facing the AI safety community is associated with low status.

This is because the inner threat model implies the ability to detect and eliminate threats within the community, which in turn implies the ability to defeat other community members while avoiding being defeated by them, which implies the capability to ascend.

Meanwhile, an outer threat model is low status, as it signals a largely gearless model of the AI safety community, similar to Cixin Liu’s idealistic, conflict-free depiction of international affairs in his novel, The Three-Body Problem.

In reality, there is something like a Mexican standoff between various AI safety orgs and individuals, as it’s difficult to know whether to share or hoard game-changing strategic information, especially since you don’t know what other game-changing strategic information will be discoverable downstream of any piece of strategic information that is shared by you or your org.

After getting to know someone well, you can be confident that they look friendly and sane, but you can’t be confident that they’re not a strategic defector who is aware of the value of spending years looking friendly and sane, and even given that they are friendly and sane in the present, it is hard to be confident that they won’t turn unfriendly or insane years in the future.

Needless to say, there are in fact external threats, and the AI safety community could easily be destroyed or twisted by them, should they choose to use technology that already exists and has existed for a long time.

The dynamics described in Geeks, Mops, and Sociopaths, where sociopaths inescapably infiltrate communities, exploit Goodhart’s law to pose as geeks, and eliminate the geeks without the geeks even knowing what happened, are trivial to artificially manufacture (even in elite groups) for people who control or influence a major social media platform. This is because sociopaths are trivial to artificially manufacture: as argued earlier in this post, nihilism is trivial to induce if someone can modulate the variables and maximize for it (though only to the extent that it is measurable).

You don’t even need to model the social graph and turn people in strategic locations (although this is also possible with sufficient social graph research, possibly requiring sensor data), because sociopaths automatically orient themselves even if spawned in random locations, like scattering a drone swarm and letting the drones autonomously home in on targets at key choke points.

This is only one facet of the AI safety community’s attack surface; there are so many others. Even one org that becomes compromised or influenced becomes a threat to all the others. Giving attackers wiggle room to enter the space and sow divisions between people, launch attacks from perfect positions, act as a third party that turns two inconvenient orgs against each other, or just outright steer the thinking of entire orgs, all of this just rewards attackers and incentivizes further attacks.

I know that some people are going to see this and immediately salivate over the opportunity to use it to cast spells to try to steer rival orgs off a cliff, but that’s not how it works in this particular situation. You can either keep giving hackers more information and power over the entire space, or you can try to stop the bleeding. This is easily provable:

Drone armies

These systems can create and mass-produce the perfect corporate employee, maximizing the proportion of software engineers who are docile, and detecting and drawing a social graph of the employees who aren’t docile, based on unusual social media behavior and possibly sensor data as well. These capabilities have likely already been noticed and pursued.

This, along with lie detectors that actually work, also represents one of the most promising opportunities for intelligence agencies to minimize Snowden risk among their officers and keyboard warriors, totally unhinging intelligence agencies from any possibility of accountability, e.g. poisoning and torturing non-ineffective dissidents (ineffective dissidents, in contrast, will likely be embraced due to their prevalence and due to democratic tradition), or people with a known probability of becoming non-ineffective dissidents (e.g. ~20%), or, depending on scale, even the relatively small number of people who are too unpredictable for systems to be confident of a low probability of becoming non-ineffective dissidents (e.g. the few people who cannot be established to have a >99% chance of lifelong docility).

There is no historical precedent of intelligence agencies existing without these constraints, although there is plenty of historical precedent of widespread utilization of plausible deniability, as well as intelligence officials being optimistic, not pessimistic, about their ability to succeed at getting away with various large-scale crimes.

The universe is allowed to do this to you

I think this also demonstrates just how difficult it is to get things right in the real world, in a world that’s been changing as fast as it has been. Did you have an important conversation, which you shouldn’t have had with a smartphone nearby, while a smartphone was nearby (or possibly even directly touching your body and collecting the biodata it emits)? Your model of reality wasn’t good enough, and now you and everyone associated with you is NGMI. Oops! You also get no feedback, because everything feels normal even though it isn’t, and therefore you are likely to keep having conversations near smartphones that you shouldn’t. Survival of the fittest is one of the most foundational, fundamental, and universal laws of reality for life (and agents, a subcategory of life).

If you encounter someone who surprises you by putting a syringe into one of your veins, it’s important to note that they might not push the plunger and deploy the contents of the syringe!

Even a psychopath has tons of incentives to not do that, unlike with social media, where influence is carefully sculpted to be undetectable, and where the companies have potentially insurmountable incentives to addict/​push the plunger.

Needless to say, the correct decision is to immediately move to protect your brain/endocrine system, whose weak point is your bloodstream. The correct decision is not to turn around and shout down the people behind you saying that you shouldn’t let strangers put syringes in your veins.