(4 min read) An intuitive explanation of the AI influence situation

This is a 4-minute read, inspired by the optimized writing in Eukaryote’s Spaghetti Towers post.

My thinking is that generative AI has potential for severe manipulation, but that the 2010s AI used in social media news feeds and other automated systems is a bigger threat: it tells us much more about the future of international affairs and about governments’ incentives to race to accelerate AI, fewer people are aware of it, it carries a substantial risk of being used to attack the AI safety community, and the defenses are easy and mandatory to deploy. This post explains why this tech is powerful enough to be a centerpiece of people’s world models.

The people in Tristan Harris’s The Social Dilemma (2020) did a fantastic job describing the automated optimization mechanism in a quick and fun way (transcript).

[Tristan] A [stage] magician understands something, some part of your mind that we’re not aware of. That’s what makes the [magic trick] illusion work. Doctors, lawyers, people who know how to build 747s or nuclear missiles, they don’t know more about how their own mind is vulnerable. That’s a separate discipline. And it’s a discipline that applies to all human beings...

[Shoshana] How do we use subliminal cues on the Facebook pages to get more people to go vote in the midterm elections? And they discovered that they were able to do that.

One thing they concluded is that we now know we can affect real-world behavior and emotions without ever triggering the user’s awareness. They are completely clueless.

Important note: all optimization here is highly dependent on measurability, and triggering the user’s awareness is a highly measurable thing.


If anything creeps someone out, they use the platform less; that drop is incredibly easy to measure, and its cause is easy to isolate. To get enough data, these platforms must automatically reshape themselves to feel safe to use.

This obviously includes ads that make you feel manipulated; a well-matched ad tends to feel targeted, so it is not surprising that we ended up in a system where >95% of ad encounters are not well-matched.

This gives researchers plenty of degrees of freedom to try different angles and see what works, and even to automate that process (a toy sketch of such a loop is below).
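To make “automate that process” concrete, here is a minimal sketch of that kind of selection loop. Everything in it is a hypothetical stand-in (the function names, the candidate “angles”, the numbers); it is not any platform’s real pipeline, only an illustration of the selection pressure described above.

```python
import random

# Toy sketch (hypothetical): an automated loop that tries content "angles"
# and keeps only the ones that don't measurably reduce usage.

def measured_retention(angle: str) -> float:
    """Stand-in for an A/B measurement: fraction of exposed users still
    active a week later. A real system would read this from usage logs."""
    return random.uniform(0.80, 0.99)

BASELINE_RETENTION = 0.95
candidate_angles = ["fear-of-missing-out", "outrage", "flattery", "obvious-manipulation"]

surviving_angles = []
for angle in candidate_angles:
    if measured_retention(angle) >= BASELINE_RETENTION:
        surviving_angles.append(angle)  # invisible to users -> kept
    # angles that creep people out show up as a retention drop and are dropped

print(surviving_angles)
```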

[Tristan] We’re pointing these engines of AI back at ourselves to reverse-engineer what elicits responses from us. Almost like you’re stimulating nerve cells on a spider to see what causes its legs to respond.

Important note: my model says that a large share of data comes from scrolling. The movement of scrolling past something on a news feed, with either a touch screen/pad or a mouse wheel (NOT arrow keys), literally generates a curve: scroll position and speed over time.

It is the perfect biodata to plug into ML; the 2010s social media news feed paradigm generated trillions of these traces to feed through linear algebra, based on different people’s reactions to different concepts and ideas they scroll past.
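For illustration, here is a minimal sketch of how one such scroll trace could be reduced to features for a model. The feature names and the sample trace are my own assumptions, not any platform’s actual schema.

```python
import numpy as np

# Hypothetical sketch: turning one raw scroll trace (position sampled over time
# while an item is on screen) into a small feature vector a ranking model could consume.

def scroll_features(timestamps_s: np.ndarray, positions_px: np.ndarray) -> dict:
    velocity = np.gradient(positions_px, timestamps_s)   # px/s as the item passes by
    dwell_time = timestamps_s[-1] - timestamps_s[0]      # how long the item stayed on screen
    speed = np.abs(velocity)
    return {
        "dwell_time_s": float(dwell_time),
        "mean_speed_px_s": float(speed.mean()),
        "slowdowns": int(np.sum(np.diff(speed) < 0)),               # hesitating near the item
        "reversals": int(np.sum(np.diff(np.sign(velocity)) != 0)),  # scrolling back to re-read
    }

# One pass over one item yields one data point; billions of users scrolling past
# billions of items is where the scale comes from.
t = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])   # seconds
y = np.array([0.0, 120, 260, 300, 310, 480])   # pixels scrolled
print(scroll_features(t, y))
```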

So, it really is this kind of prison experiment where we’re just, you know, roping people into the matrix, and we’re just harvesting all this money and… and data from all their activity to profit from. And we’re not even aware that it’s happening.

[Chamath] So, we want to psychologically figure out how to manipulate you as fast as possible and then give you back that dopamine hit...

[Sean Parker] I mean, it’s exactly the kind of thing that a… that a hacker like myself would come up with because you’re exploiting a vulnerability in… in human psychology...

[Tristan] No one got upset when bicycles showed up. Right? Like, if everyone’s starting to go around on bicycles, no one said, “Oh, my God, we’ve just ruined society. Like, bicycles are affecting people. They’re pulling people away from their kids. They’re ruining the fabric of democracy. People can’t tell what’s true.”

Like, we never said any of that stuff about a bicycle. If something is a tool, it genuinely is just sitting there, waiting patiently. If something is not a tool, it’s demanding things from you… It wants things from you. And we’ve moved away from having a tools-based technology environment to an addiction- and manipulation-based technology environment.

That’s what’s changed. Social media isn’t a tool that’s just waiting to be used. It has its own goals, and it has its own means of pursuing them by using your psychology against you.

In the documentary, Harris describes a balance between three separate automated optimization directions from his time at tech companies: the Engagement goal, to drive up usage and keep people scrolling; the Growth goal, to keep people coming back and encouraging their friends to join; and the Advertising goal, which pays for server usage.

In fact, there is a fourth slot: optimizing people’s thinking in any measurable direction, such as making people feverishly supportive of the American side and opposed to the Russian side in proxy wars like Ukraine. The tradeoffs between these four optimization directions are substantial, and the balance/priority distribution is determined by the preferences of the company and the extent of the company’s ties to its government and military (afaik mainly the US and China are relevant here).
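One way to picture the four slots is as a single weighted ranking score, where the weights encode the balance/priority distribution. This is a hypothetical sketch; the field names and weights are made up for illustration and do not describe any company’s real ranking system.

```python
from dataclasses import dataclass

# Hypothetical sketch of the "four slots" as one weighted feed-ranking score.

@dataclass
class ItemScores:
    engagement: float   # predicted watch/scroll time
    growth: float       # predicted return visits / invites
    advertising: float  # predicted ad revenue
    influence: float    # predicted shift toward some measurable belief or attitude

def feed_rank(item: ItemScores, w_eng=1.0, w_growth=0.5, w_ads=0.8, w_influence=0.0) -> float:
    # A company with no government ties might leave w_influence at 0;
    # raising it trades away some engagement/revenue to optimize the fourth slot.
    return (w_eng * item.engagement
            + w_growth * item.growth
            + w_ads * item.advertising
            + w_influence * item.influence)

candidates = [
    ItemScores(engagement=0.9, growth=0.2, advertising=0.4, influence=0.0),
    ItemScores(engagement=0.6, growth=0.1, advertising=0.3, influence=0.8),
]
ranked = sorted(candidates, key=feed_rank, reverse=True)
```

Whoever sets the weights decides how much engagement and revenue the system trades away to push people’s thinking in the chosen direction.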

Clown Attacks are a great example of something to fit into the fourth slot: engineering someone into thinking a topic (e.g. the lab leak hypothesis) is low-status by showing them low-status clowns talking about it, and high-status people ignoring or criticizing it.

It’s important to note that this problem is nowhere near as important as Superintelligence, the finish line for humanity. But it’s critical for world modelling; the last 10,000 years of human civilization only happened the way it did because manipulation capabilities at this level did not yet exist.

Information warfare was less important in the 20th century, relative to military force, because it was weaker. As it becomes more powerful and returns on investment grow, more governments and militaries invest in it than historical precedent would imply, and we end up in the information-warfare timeline.

In the 2020s, computer vision makes eye tracking and large-scale facial microexpression recognition/research possibly the biggest threat. Unlike the hopelessness of securing your operating system or of having important conversations near smartphones, the solution here is easy and worthwhile for people in the AI safety community (which is why my posting this is net positive): it only takes a small piece of tape and aluminum foil over the camera (easy enough that I routinely peel most of it off and replace it afterwards).

I’ve written about other fixes and various examples of ways for attackers to hack the AI safety community and turn it against itself. Please don’t put yourself and others at risk.

Crossposted to the EA Forum.