Paranoia: A Beginner’s Guide
People sometimes make mistakes. (Citation Needed)
The obvious explanation for most of those mistakes is that people do not have access to sufficient information to avoid the mistake, or are not smart enough to think through the consequences of their actions.
This predicts that as decision-makers get access to more information, or are replaced with smarter people, their decisions will get better.
And this is substantially true! Markets seem more efficient today than they were before the onset of the internet, and in general decision-making across the board has improved on many dimensions.
But in many domains, I posit, decision-making has gotten worse, despite access to more information, and despite much larger labor markets, better education, the removal of lead from gasoline, and many other things that should generally cause decision-makers to be more competent and intelligent. There is a lot of variance in decision-making quality that is not well-accounted for by how much information actors have about the problem domain, and how smart they are.
I currently believe that the factor that explains most of this remaining variance is “paranoia”: in particular, the kind of paranoia that becomes more adaptive as your environment gets filled with more competent adversaries. While I am undoubtedly not going to succeed at fully conveying why I believe this, I hope to at least give an introduction to some of the concepts I use to think about it.
A market for lemons
The simplest economic model of paranoia is the classical “lemons market”:
In the classical lemon market story you (and a bunch of other people) are trying to sell some nice used cars, and some other people are trying to buy some nice used cars, and everyone is happy making positive-sum trades. Then a bunch of defective used cars (“lemons”) enter the market, which are hard to distinguish from the high-quality used cars since the kinds of issues that used cars have are hard to spot.
Buyers adjust their willingness to pay downwards as the average quality of car in your market goes down. This causes more of the high-quality sellers to leave the market as they no longer consider their car worth selling at that lower price. This further reduces the average willingness to pay of the buyers, which in turn drives more high-quality sellers out of the market. In the limit, only lemons are sold.
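To make the unraveling concrete, here is a minimal sketch in Python, under made-up illustrative assumptions (car quality uniform on [0, 1], sellers valuing a car at its quality, buyers at 1.5 times its quality); this is a toy, not anything from Akerlof’s paper:

```python
# A minimal sketch of the unraveling dynamic described above, with made-up
# illustrative numbers: car quality q is uniform on [0, 1], a seller values
# their own car at q, and buyers value any car at 1.5 * q but can only see
# the average quality of the cars still on offer.

def lemons_unraveling(buyer_premium=1.5, steps=10):
    threshold = 1.0  # sellers with quality <= threshold are still in the market
    for step in range(steps):
        avg_quality = threshold / 2            # mean of quality uniform on [0, threshold]
        price = buyer_premium * avg_quality    # buyers bid their expected value
        new_threshold = min(price, 1.0)        # sellers stay only if the price covers their value
        print(f"round {step}: price={price:.3f}, best remaining quality={new_threshold:.3f}")
        if abs(new_threshold - threshold) < 1e-9:
            break
        threshold = new_threshold

lemons_unraveling()  # price and remaining quality spiral toward zero: only lemons remain
```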
In this classical model, a happily functioning market where both buyers and sellers are happy to trade, generating lots of surplus for everyone involved, can be disrupted or even completely destroyed[1] by the introduction of a relatively small number of adversarial sellers sneaking low-quality goods past buyers. From the consumer side, this looks like having a fine and dandy time buying used cars one day, and the next day being presented with a large set of deals so suspiciously good that you know something is wrong (and you are right).
Buying a car in a lemons market is a constant exercise of trying to figure out how the other person is trying to fuck you over. If you see a low offer for a car, this is evidence both that you got a great deal, and evidence that the counterparty knows something that you don’t that they are using to fuck you over. If the latter outweighs the former, no deal happens.
For some reason, this simple dynamic is surprisingly hard for people to come to terms with. Indeed, the reception section of the Wikipedia article for Akerlof’s seminal paper on this is educational:
Both the American Economic Review and the Review of Economic Studies rejected the paper for “triviality”, while the reviewers for Journal of Political Economy rejected it as incorrect, arguing that, if this paper were correct, then no goods could be traded.[4] Only on the fourth attempt did the paper get published in Quarterly Journal of Economics.[5] Today, the paper is one of the most-cited papers in modern economic theory and most downloaded economic journal paper of all time in RePEC (more than 39,275 citations in academic papers as of February 2022).[6] It has profoundly influenced virtually every field of economics, from industrial organisation and public finance to macroeconomics and contract theory.
(You know that a paper is good if it gets rejected both for being “trivial” and “obviously incorrect”)
All that said, in reality, navigating a lemon market isn’t too hard. Simply inspect the car to distinguish bad cars from good cars, and then the market price of a car will at most end up at the pre-lemon-seller equilibrium, plus the cost of an inspection to confirm it’s not a lemon. Not too bad.
“But hold on!” the lemon car salesman says. “Don’t you know? I also run a car inspection business on the side”. You nod politely, smiling, then stop in your tracks as the realization dawns on you. “Oh, and we also just opened a certification business that certifies our inspectors as definitely legitimate”, he says as you look for the next flight to the nearest communist country.
It’s lemons all the way down
What do you do in a world in which there are not only sketchy used car salesmen, but also sketchy used car inspectors, and sketchy used car inspector rating agencies, or more generally, competent adversaries who will try to predict whatever method you will use to orient to the world, and aim to subvert it for their own aims?
As far as I can tell the answer is “we really don’t know, seems really fucking hard, sorry about that”. There are no clear solutions to what to do if you are in an environment with other smart actors[2] who are trying to predict what you are going to do and then try to feed you information to extract resources from you. Decision theory and game theory are largely unsolved problems, and most adversarial games have no clear solution.
But clearly, in practice, people deal with it somehow. The rest of this post is about trying to convey what it feels like to deal with it, and what it looks like from the outside. These “solutions”, while often appropriate, also often look insane, and that insanity explains a lot of how the world has failed to get better, even as we’ve gotten smarter and better informed. These strategies often involve making yourself dumber in order to make yourself less exploitable, and these strategies become more tempting the smarter your opponents are.
Fighter jets and OODA loops
John Boyd, a US Air Force Colonel, tried to figure out good predictors of who wins fighter jet dogfights. In pursuit of that, he spent 30 years publishing research reports and papers and training recruits, culminating in his model of the “OODA loop”.
In this model, a fighter jet pilot is engaging in a continuous loop of: Observe, Orient, Decide, Act. This loop usually plays out over a few seconds as the fighter observes new information, orients towards this new environment, makes a decision on how to respond, and ultimately acts. Then they observe again (both the consequences of their own actions and of their opponent), orient again, etc.
What determines (according to Boyd) who wins in a close dogfight is which fighter can “get into” the other fighter’s OODA loop.
If you can...
Take actions that are difficult to observe...
Harder to orient to...
And act before your opponent has decided on their next action
You will win the fight. Or as Boyd said, “he who can handle the quickest rate of change survives”. And to his credit, the formal models of fighter-jet maneuverability he built on the basis of this theory have (at least according to Wikipedia) been one of the guiding principles of modern fighter jet design, including the F-15 and F-16, and are widely credited with determining much of modern battlefield strategy.
Beyond the occasional fighter-jet dogfight I get into, I find this model helpful for understanding the subjective experience of paranoia in a wide variety of domains. You’re trying to run your OODA loop, but you are surrounded by adversaries who are simultaneously trying to disrupt your OODA loop while trying to speed up their own. When they get into your OODA loop, it feels like you are being puppeted by your adversary, who can predict what you are going to do faster than you can adapt.
The feeling of losing is a sense of disorientation and confusion and constant reorienting as reality changes more quickly than you can orient to, combined with desperate attempts to somehow slow down the speed at which your adversaries are messing with you.
There are lots of different ways people react to adversarial information environments like this, but at a high level, my sense is there are roughly three big strategies:
You blind yourself to information
You try to eliminate the sources of the deception
You try to become unpredictable
All three of those produce pretty insane-looking behavior from the outside, yet I think they are by and large an appropriate response to adversarial environments (if far from optimal).
The first thing you try is to blind yourself
When a used car market turns into a lemons market, you don’t buy a used car. When you are a general at war with a foreign country, and you suspect your spies are compromised and feeding you information designed to trick you, you just ignore your spies. When you are worried about your news being the result of powerful political egregores aiming to polarize you into political positions, you stop reading the news.
At the far end of paranoia lives the isolated hermit. The trees and the butterflies are (mostly) not trying to deceive you, and you can just reason from first principles about what is going on with the world.
While the extreme end of this is costly, we see a lot of this in more moderate form.
My experience of early-2020 COVID involved a good amount of blinding myself to various sources of information. In January, as the pandemic was starting to become an obvious problem in the near future, the discussion around COVID picked up. Information quality wasn’t perfect, but overall, if you were looking to learn about COVID, or respiratory diseases in general, you would have a decent-ish time. Indeed, much of the research I used to think about the likely effects of COVID early on in the pandemic was directly produced by the CDC.
Then the pandemic became obvious to the rest of the world, and a huge number of people started having an interest in shaping what other people believed about COVID. As political pressure on it mounted, the CDC started lying about the effectiveness of masks to convince people to stop using them, so that service workers would still have access to them. Large fractions of society started wiping down every surface and desperately trying to produce evidence that rationalized this activity. Most channels that people relied on for reliable health information became a market for lemons as forces of propaganda drowned out the people still aiming to straightforwardly inform.
I started ignoring basically anything the CDC said. I am sure many good scientists still worked there, but I did not have the ability to distinguish the good ones from the bad ones. As the adversarial pressure rose, I found it better to blind myself to that information.
The general benefits of blinding yourself to information in adversarial environments are so commonly felt, and so widely appreciated, that constraining information channels is part of almost every large social institution:
U.S. courts extensively restrict what evidence can be shown to juries
A lot of US legal precedent revolves around the concept of “admissible evidence”, and furthermore “admissible argument”. We are paranoid about juries getting tricked, so we blind juries to most evidence relevant to the case we are asking them to judge, hoping to shield them from getting tricked and controlled by the lawyers of either side while still leaving enough information available for them to usually make adequate judgements.
Nobody is allowed to give legal or medical advice
While much of this is the result of regulatory capture, we still highly restrict the kind of information that people are allowed to give others on many of the topics that matter most to people. Both medical advice and legal advice are categories where we only allow certified experts to speak freely, and even then, only in combination with intense censure if the advice later leads to bad consequences for the recipients.
Within governments, the “official numbers” are often the only things that matter
The story of CIA analyst Samuel Adams and his attempts at informing the Johnson administration about the number of opponents the US was facing in the Vietnam war is illustrative here. As Adams himself tells the story, after he found what appeared to him to be very strong evidence that Vietnamese forces were substantially more numerous than previously assumed (600,000 vs. 250,000 combatants):
Dumbfounded, I rushed into George Carver’s office and got permission to correct the numbers. Instead of my own total of 600,000, I used 500,000, which was more in line with what Colonel Hawkins had said in Honolulu. Even so, one of the chief deputies of the research directorate, Drexel Godfrey, called me up to say that the directorate couldn’t use 500,000 because “it wasn’t official.”
[...]
The Saigon conference was in its third day, when we received a cable from Helms that, for all its euphemisms, gave us no choice but to accept the military’s numbers. We did so, and the conference concluded that the size of the Vietcong force in South Vietnam was 299,000.
[...]
A few days after Nixon’s inauguration, in January 1969, I sent the paper to Helms’s office with a request for permission to send it to the White House. Permission was denied in a letter from the deputy director, Adm. Rufus Taylor, who informed me that the CIA was a team, and that if I didn’t want to accept the team’s decision, then I should resign.
When governments operate on information in environments where many actors have reasons to fudge the numbers in their direction, they highly restrict what information is a legitimate basis for arguments and calculations, as illustrated in the example above.
The second thing you try is to purge the untrustworthy
The next thing to try is to weed out the people trying to deceive you. This… sometimes goes pretty well. Most functional organizations do punish lying and deception quite aggressively. But catching sophisticated deception or disloyalty is very hard. McCarthyism and the Second Red Scare stand as an interesting illustration:
President Harry S. Truman’s Executive Order 9835 of March 21, 1947, required that all federal civil-service employees be screened for “loyalty”. The order said that one basis for determining disloyalty would be a finding of “membership in, affiliation with or sympathetic association” with any organization determined by the attorney general to be “totalitarian, fascist, communist or subversive” or advocating or approving the forceful denial of constitutional rights to other persons or seeking “to alter the form of Government of the United States by unconstitutional means”.[10]
What became known as the McCarthy era began before McCarthy’s rise to national fame. Following the breakdown of the wartime East-West alliance with the Soviet Union, and with many remembering the First Red Scare, President Harry S. Truman signed an executive order in 1947 to screen federal employees for possible association with organizations deemed “totalitarian, fascist, communist, or subversive”, or advocating “to alter the form of Government of the United States by unconstitutional means.”
At some point, when you are surrounded by people feeding you information adversarially and sabotaging your plans, you just start purging people until you feel like you know what is going on again.
This can again look totally insane from the outside, with lots of innocent people getting caught in the crossfire and a lot of distress and flailing.
But it’s really hard to catch all the spies if you are indeed surrounded by lots of spies! The story of the Rosenbergs during this time period illustrates this well:
Julius Rosenberg (May 12, 1918 – June 19, 1953) and Ethel Rosenberg (born Greenglass; September 28, 1915 – June 19, 1953) were an American married couple who were convicted of spying for the Soviet Union, including providing top-secret information about American radar, sonar, jet propulsion engines, and nuclear weapon designs. They were executed by the federal government of the United States in 1953 using New York’s state execution chamber in Sing Sing in Ossining,[1] New York, becoming the first American civilians to be executed for such charges and the first to be executed during peacetime.
The conviction of the Rosenbergs resulted in enormous national pushback against McCarthyism, and played a big role in forming its legacy as a period of political overreach and undue paranoia:
After the publication of an investigative series in the National Guardian and the formation of the National Committee to Secure Justice in the Rosenberg Case, some Americans came to believe both Rosenbergs were innocent or had received too harsh a sentence, particularly Ethel. A campaign was started to try to prevent the couple’s execution. Between the trial and the executions, there were widespread protests and claims of antisemitism. At a time when American fears about communism were high, the Rosenbergs did not receive support from mainstream Jewish organizations. The American Civil Liberties Union did not find any civil liberties violations in the case.[37]
Across the world, especially in Western European capitals, there were numerous protests with picketing and demonstrations in favor of the Rosenbergs, along with editorials in otherwise pro-American newspapers. Jean-Paul Sartre, an existentialist philosopher and writer who won the Nobel Prize for Literature, described the trial as “a legal lynching”.[38] Others, including non-communists such as Jean Cocteau and Harold Urey, a Nobel Prize-winning physical chemist,[39] as well as left-leaning figures—some being communist—such as Nelson Algren, Bertolt Brecht, Albert Einstein, Dashiell Hammett, Frida Kahlo, and Diego Rivera, protested the position of the American government in what the French termed the American Dreyfus affair.[40] Einstein and Urey pleaded with President Harry S. Truman to pardon the Rosenbergs. In May 1951, Pablo Picasso wrote for the communist French newspaper L’Humanité: “The hours count. The minutes count. Do not let this crime against humanity take place.”[41] The all-black labor union International Longshoremen’s Association Local 968 stopped working for a day in protest.[42] Cinema artists such as Fritz Lang registered their protest.[43]
Many decades later, in 1995, as part of the release of declassified information, the public received confirmation that the Rosenbergs were indeed spies:
The Venona project was a United States counterintelligence program to decrypt messages transmitted by the intelligence agencies of the Soviet Union. Initiated when the Soviet Union was an ally of the U.S., the program continued during the Cold War when it was considered an enemy.[67] The Venona messages did not feature in the Rosenbergs’ trial, which relied instead on testimony from their collaborators, but they heavily informed the U.S. government’s overall approach to investigating and prosecuting domestic communists.[68]
In 1995, the U.S. government made public many documents decoded by the Venona project, showing Julius Rosenberg’s role as part of a productive ring of spies.[69] For example, a 1944 cable (which gives the name of Ruth Greenglass in clear text) says that Ruth’s husband David is being recruited as a spy by his sister (that is, Ethel Rosenberg) and her husband. The cable also makes clear that the sister’s husband is involved enough in espionage to have his own codename (“Antenna” and later “Liberal”).[70] Ethel did not have a codename;[26] however, KGB messages which were contained in the Venona project’s Alexander Vassiliev files, and which were not made public until 2009,[71][72] revealed that both Ethel and Julius had regular contact with at least two KGB agents and were active in recruiting both David Greenglass and Russell McNutt.[73][71][72]
Turns out, it’s really hard to prove that someone is a spy. Trying to do so anyway often makes people more paranoid, which produces more intense immune reactions and causes people to become less responsive to evidence, which then breeds more adversarial intuitions and motivates more purges.
But to be clear, a lot of the time, this is a sane response to adversarial environments. If you are a CEO appointed to lead a dysfunctional organization, it is quite plausibly the right call to get rid of basically all staff who have absorbed an adversarial culture. Just be extremely careful to not purge so hard as to only be left with a pile of competent schemers.
The third thing to try is to become unpredictable and vindictive
And ultimately, if you are in a situation where an opponent keeps trying to control your behavior and get into your OODA loop, you can always just start behaving unpredictably. If you can’t predict what you are going to do tomorrow, your opponents (probably) can’t either.
Nixon’s madman strategy stands as one interesting testament to this:
I call it the Madman Theory, Bob. I want the North Vietnamese to believe I’ve reached the point where I might do anything to stop the war. We’ll just slip the word to them that, “for God’s sake, you know Nixon is obsessed about communism. We can’t restrain him when he’s angry—and he has his hand on the nuclear button” and Ho Chi Minh himself will be in Paris in two days begging for peace.
Controlling an unpredictable opponent is much harder than controlling one who, in their pursuit of optimal and sane-looking actions, ends up behaving predictably. Randomizing your strategies is a solution to many adversarial games, and in reality, making yourself unpredictable in what information you will integrate and which you will ignore, and in where your triggers are for starting to use force, often gives your opponent no choice but to be more conservative, ease the pressure, or aim to manipulate so much information that even randomization doesn’t save you.
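As a toy illustration of why randomizing helps against a predicting opponent, here is a minimal matching-pennies-style sketch; the predictor is hypothetical and deliberately simple, not a model of any real adversary:

```python
# A minimal sketch (hypothetical setup) of why randomizing beats a predicting
# opponent, in a matching-pennies-style game: the predictor wins a round if it
# guesses your move, you win if it doesn't.
import random
from collections import Counter

def exploiting_predictor(move_counts):
    """Guesses that you will play your historically most common move."""
    if not move_counts:
        return random.choice(["heads", "tails"])
    return move_counts.most_common(1)[0][0]

def play(chooser, rounds=10_000):
    move_counts, wins = Counter(), 0
    for _ in range(rounds):
        guess = exploiting_predictor(move_counts)
        move = chooser()
        wins += (move != guess)      # you win whenever the predictor is wrong
        move_counts[move] += 1
    return wins / rounds

always_heads = lambda: "heads"                     # fully predictable policy
mixed = lambda: random.choice(["heads", "tails"])  # 50/50 randomization

print("predictable win rate:", play(always_heads))  # ~0.0
print("randomized win rate: ", play(mixed))         # ~0.5
```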
Now, where does this leave us? Well, first of all, I think it helps explain a bunch of the world and allows us to make better predictions about how the future will develop.
But more concretely, I think it motivates a principle I hold very dear to my heart: “Do not be the kind of actor that forces other people to be paranoid”.
Paranoid people fuck up everything around them. Digging yourself out of paranoia is very hard and takes a long time. A non-trivial fraction of my life philosophy is oriented around avoiding environments that force me into paranoia and incentivizing as little paranoia as possible in the people around me.
I downvoted this post because it felt slippery. I kept running into parts that didn’t fit together or otherwise seemed off.
If this was a google doc I might leave a bunch of comments quickly pointing to examples. I guess I can do that in list format here.
The post highlights the market for lemons model, but then the examples keep not fitting the lemons setup. Covid misinformation wasn’t an adverse selection problem, nor was having spies in the government, nor was the Madman Theory situation.
“there are roughly three big strategies” is a kind of claim that I generally start out skeptical of, and this post failed to back that one up
The description of the CDC and what happened during covid seems basically inaccurate, e.g. that’s not how concerns about surfaces played out
The Madman Theory example doesn’t feel like an example of the Nixon administration being in an adversarial information situation or being paranoid. It’s trying to make threats in a game theory situation.
The way you talk about the 3 strategies I get the sense that you’re saying: when you’re in an adversarial information scenario here are 3 options to consider. But in the examples of each strategy, the other 2 strategies don’t really make sense. They are structurally different scenarios.
A thing that I think of as paranoia centrally involves selective skepticism: having some target that you distrust, but potentially being very credulous towards other sources of views on that topic, such as an ingroup that also identifies that target as untrustworthy, or engaging in confirmatory reasoning about your own speculative theories that don’t match the untrusted target. That’s missing from your theory and your examples.
The thing you’re calling “blinding” includes approaches that seem pretty different. A source I distrust is claiming X, and X seems like the sort of thing they’d want me to believe, so I’ll (a) doubt X, (b) find some other method of figuring out whether X is true that doesn’t rely on that source, or (c) go someplace else so that X doesn’t matter to me. I associate paranoia mainly with (a), or with doing an epistemically atrocious job of (b), or a self-deceiving variant of (c).
Despite all the examples, there’s a lack of examples of the core thing: here’s an epistemically adversarial situation, here’s a person being paranoid, here’s what that looks like, here’s how that’s relatively appropriate/understandable even if not optimal, and (perhaps also) here’s how that involves a bunch of costs/badness (from the concluding paragraph this is maybe part of the core, though that wasn’t apparent before that point).
Appreciate the comment even if you disliked the post! Here are some responses to various bullet points in a kind of random order that made sense to me:
The market for lemons model is really just an extremely simple model of a single-dimensional adversarial information environment. It’s so extremely simple that it’s hard to fit to any real world situation, since it’s really just a single dimension of price signals. As you add more dimensions of potential deception things get more complex, and that’s when the paranoia stuff becomes more useful.
I think COVID misinformation fits the lemons market situation pretty well, though of course not perfectly. Happy to argue about it. Information markets are a bit confusing because marginal costs and marginal willingness to pay are both very low, but I do think that at least for informed observers, most peaches (i.e. high-quality information sources) ended up being priced out by low-quality lemons and this created a race to the bottom (among various other things that were going on).
Also happy to argue about this. I feel pretty confident that the right model of why many people kept believing in surface transmission was roughly that early on it wasn’t ruled out, and it happened to be that there was a bunch of costly signaling you could do by wiping down lots of surfaces, and then those costly signals later on created strong pressure to rationalize the signals, even when the evidence was relatively clear.
Zvi’s COVID model has a bunch of writing on this: https://www.lesswrong.com/posts/P7crAscAzftdE7ffv/covid-19-my-current-model#Sacrifices_To_The_Gods_Are_Demanded_Everywhere
You can disagree with this narrative or explanation of course, but it’s definitely what I believe and something I thought a lot about!
I definitely agree that government spies do not fit the lemons market example particularly well, or like, only inasmuch as any adversarial information environment fits the model at all. I didn’t mean to imply otherwise! I don’t have a great formal model of the spies situation. Maybe there is one out there.
Strong disagree. I think combining all three is indeed the usual course of action! I think paranoid people in adversarial information environments basically always do all three. When I have worked at dysfunctional workplaces I have done all three, and I have seen other people do all three in other places that induced paranoia.
I actually mostly had trouble finding examples where someone was doing largely only one of these things instead of generally engaging in a whole big jumble of paranoia which I think can be decomposed into roughly these three strategies, but are hard to disentangle.
Like, IDK, think about your usual and normal depiction of a paranoid person in fiction (referring here to fiction because no immediate real-life reference that we would both be intimately familiar enough with to make the point comes to mind). I think you will find them doing all three of “becoming suspicious of large swaths of information and dismissing them for being biased”, “becoming vindictive, self-isolating and engaging in purges” and “becoming erratic and hard to predict”. IMO it’s clearly a package!
I agree I would have liked to motivate the madman example more. This specific example was largely the result of trying to find a case where you can see the strategy I am trying to illustrate more in isolation, with the causal pathway factored out (because in an international diplomacy context you have the part of the enemy trying to model you and predict what you will do and then use that against you, but you have less of the part where much of your information can be manipulated, as you generally have relatively little bandwidth with your adversaries, and you have strong pre-existing group identities that make purges irrelevant, unless you are dealing with a spy-heavy situation).
I am not amazingly happy with it as an example, but I do think the underlying strategy is real!
Huh, this doesn’t really match how I would use the term much at all. I would describe the above as “mindkilled” in rationalist terms, referring to “politics is the mindkiller”, but paranoia to me is centrally invoked by high-bandwidth environments that are hard to escape from, and in which the fear of manipulation is directed at people close and nearby and previously trusted. Indeed, one of the big things I would add to this post, or maybe will write in a separate post sometime this week, is that the most intense paranoia is almost always the result of the violation of what appeared to be long-established trust. Things like a long-term relationship falling apart, or a long-term ally selling out to commercial incentives, or a trusted institution starting to slip and lie and engage in short-term optimization.
If you are in an environment with a shared outgroup that you have relatively little bandwidth to, I think that is an environment with much potential for polarization and various forms of tribal thinking, but I don’t think it’s that strong of a driver of paranoia.
Yeah, I would like a post that has some more evocative core example. IMO the best option here would be a piece of fiction that shows the experience of paranoia from the inside. I do think this is a pretty hard thing to write, and for now largely outside of my range as a writer, though maybe I will try anyways!
I made some attempts at this when drafting this post, but it became pretty quickly clear to me that actually conveying the internal experience here would both take a lot of words and a bunch of skill, and trying to half-ass it would just leave the reader confused. I also considered trying to talk about my personal experiences with paranoia-adjacent strategies during e.g. my time working at CEA, but that seemed if anything more difficult than writing a fictional dialogue.
I would definitely love a better model of whether these really are the exhaustively correct strategies. I have some handwavy pointers to why I roughly think they are, but they are pretty handwavy at this point. Trying to elucidate them a tiny bit right now:
The fundamental issue that paranoia is trying to deal with is the act of an adversary predicting your outputs well enough that, to them, you can basically be treated as part of the environment (in MIRI-adjacent circles I’ve sometimes heard this referred to as “diagonalization”).
If I think about this in a computer-science-y way, I am imagining a bigger agent that is simulating a smaller agent, with a bunch of input channels that represent the observations the smaller agent makes of the world. Some fraction of those input channels can be controlled. The act of diagonalization is basically finding some set of controllable inputs that, no matter what the uncontrollable parts of the input say, result in the smaller agent doing what the bigger agent wants.
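As a concrete (and heavily simplified) sketch of that toy model, here is one possible rendering in Python; the specific decision rule and channel counts are made up for illustration:

```python
# A minimal sketch (my own made-up details) of "diagonalization": the bigger
# agent searches over the input channels it controls until the smaller agent's
# output is the same no matter what the uncontrolled channels say.
from itertools import product

def small_agent(inputs):
    """A toy decision rule reading four binary input channels."""
    a, b, c, d = inputs            # a, b: controlled; c, d: uncontrolled
    return "sell" if a or (b and c) else "hold"

N_CONTROLLED, N_UNCONTROLLED = 2, 2

def diagonalize(target_action):
    """Find controlled inputs that force target_action in every possible world."""
    for controlled in product([0, 1], repeat=N_CONTROLLED):
        if all(
            small_agent(controlled + world) == target_action
            for world in product([0, 1], repeat=N_UNCONTROLLED)
        ):
            return controlled
    return None  # this decision rule cannot be forced into target_action

print(diagonalize("sell"))  # (1, 0): setting channel a forces "sell" in every world
```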
Now, in this context, three strategies stand out to me that conceptually make sense:
You cut off your internal dependence on the controlled input channels
You reduce the amount of information that your adversary has about your internals so they can model your internals less well
You make yourself harder to predict, either by performing complicated computations to determine your actions, or by making which kind of computation you perform to arrive at the result highly dependent on input channels you know are definitely uncontrolled
And like… in this very highly simplified CS model, those are roughly the three strategies that make sense to me at all? I can’t think of anything else that makes sense to do, though maybe it’s just a lack of imagination. Like, I feel like you have varied all the variables that make sense to vary in this toy-model.
And of course, it’s really unclear how well this toy-model translates to reality! But it’s one of the big generators that made me think the “3 strategies” claim makes sense.
Curious about your response, but also, this comment is long and you shouldn’t feel pressured to respond. I do broadly like your thinking in the space so would be interested in further engagement!
That helped give me a better sense of where you’re coming from, and more of an impression of what the core thing is that you’re trying to talk about. Especially helpful were the diagonalization model at the end (which I see you have now made into a separate post) and the part about “paranoia to me is centrally invoked by high-bandwidth environments that are hard to escape from” (while gesturing at a few examples, including you at CEA). Also your exchange elsewhere in the comments with Richard.
I still disagree with a lot of what you have to say, and agree with most of my original bullet points (though I’d make some modifications to #2 on your division into three strategies and #6 on selective skepticism). Not sure what the most productive direction is to go from here. I have some temptation to get into a big disagreement about covid, where I think I have pretty different models than you do, but that feels like it’s mainly a tangent. Let me instead try to give my own take on the central thing:
The central topic is situations where an adversary may have compromised some of your internal processes. Especially when it’s not straightforward to identify what they’ve compromised, fix your processes, or remove their influence. There’s a more theoretical angle on this which focuses on what are good strategies to use in response to these sorts of situations, potentially even what’s the optimal response to a sufficiently well-specified version of this kind of scenario. And there’s a more empirical angle which focuses on what do people in fact do when they think they might be in this sort of situation, which could include major errors (including errors in identifying what situation you’re in, e.g. how much access/influence/capability the adversary has in relation to you or how adversarial the relationship is) though probably often involves responses that are at least somewhat appropriate.
This is narrower than what you initially described in this post (being in an environment with competent adversaries) but broader than diagonalization (which is the extreme case of having your internal processes compromised, where you are fully pwned). Though possibly this is still too broad, since it seems like you have something more specific in mind (but I don’t think that narrowing the topic to full diagonalization captures what you’re going for).
I’m curious whether you got a chance to read my short story. If you can suspend your disbelief (i.e., resist applying the heuristic which might lead one to write the story off as uncritically polemical), I think you might appreciate :-)
I also have a short story about (some aspects of) paranoia from the inside.
You’re a beautiful writer :) If you ever decide to host a writing workshop, I’d love to connect/attend.
Great post. I’m going to riff on it to talk about what it would look like to have an epistemology which formally explains/predicts the stuff in this essay.
Paranoia is a hard thing to model from a Bayesian perspective, because there’s no slot to insert an adversary who might fuck you over in ways you can’t model (and maybe this explains why people were so confused about the Market for Lemons paper? Not sure). However, I think it’s a very natural concept from a Knightian perspective. My current guess is that the correct theory of Knightian uncertainty will be able to formulate the concept of paranoia in a very “natural” way (and also subsume Bayesian uncertainty as a special case where you need zero paranoia because you’re working in a closed domain which you have a mechanistic understanding of).
The worst-case assumption in infra-Bayesianism (and the maximin algorithm more generally, e.g. as used in chess engines) is one way of baking in a high level of paranoia. However, two drawbacks of that approach:
There’s no good way to “dial down” the level of paranoia. I.e. we don’t have an elegant version of maximin to apply to settings where your adversary isn’t always choosing the worst possibility for you.
The closest I have is the Hurwicz criterion, which basically sets the ratio of focusing on the worst outcome to focusing on the best outcome (a standard statement is sketched after this list). But this is very hacky: the thing you actually care about is all the intermediate outcomes.
It’s not actually paranoid in a Knightian way, because what if your adversary does something that you didn’t even think of?
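For reference, here is a standard statement of the Hurwicz criterion, as I understand it (notation mine):

```latex
% Hurwicz criterion (standard formulation; notation mine): score each action a
% by a weighted blend of its worst-case and best-case payoff over states S,
% with pessimism index \alpha \in [0, 1].
\[
  H_\alpha(a) = \alpha \cdot \min_{s \in S} u(a, s) + (1 - \alpha) \cdot \max_{s \in S} u(a, s)
\]
% \alpha = 1 recovers maximin (maximal paranoia) and \alpha = 0 recovers maximax;
% as noted above, intermediate outcomes never enter, which is the hacky part.
```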
Another way of being paranoid is setting large bid-ask spreads. I assume that finance people have a lot to say about how to set bid-ask spreads, but I haven’t heard of any very elegant theory.
I think of Sahil’s live theory as being a theory of anti-paranoia. It’s the approach you take in a world which is fundamentally “friendly” to you. It’s still not very pinned-down, though.
I think your three approaches to dealing with an adversarial world all gesture to valuable directions for formal investigation. I think of “blinding yourself” in terms of maintaining a boundary. The more paranoid you are, the stronger a boundary you need to have between yourself and the outside world. Boundaries are Knightian in the sense that they allow you to get stuff done without actually knowing much about what’s in the external world. My favorite example here (maybe from Sahil?) is the difference between a bacterium and a cell inside a human body. A bacterium is in a hostile world and therefore needs to maintain strong boundaries. Conversely, a cell inside a body can mostly assume that the chemicals in its environment are there for its benefit, and so can be much more permeable to them. We want to be able to make similar adjustments on an information level (and also on a larger-scale physical level).
I think of “purging the untrustworthy” in terms of creating/maintaining group identity. I expect that this can be modeled in terms of creating commitments to behave certain ways. The “healthy” version is creating a reputation which you don’t want to undermine because it’s useful for coordination (as discussed e.g. here). The unhealthy version is to traumatize people into changing their identities, by inducing enough suffering to rearrange their internal coalitions (I have a long post coming up on how this explains the higher education system; the short version is here).
I think of “become unpredictable” in terms of asymmetric strategies which still work against entities much more intelligent than you. Ivan has a recent essay about encryption as an asymmetric weapon which is robust to extremely powerful adversaries. I’m reminded of an old Eliezer essay about how, if you’re using noise in an algorithm, you’re doing it wrong. That’s true from a Bayesian perspective, but it’s very untrue from a (paranoid) Knightian perspective. Another example of an asymmetric weapon: no matter how “clever” your drone is, it probably can’t figure out a way to fly directly towards a sufficiently powerful fan (because the turbulence is too chaotic to exploit).
I think that the good version of “become vindictive” is something to do with virtue ethics. I think of virtue ethics as a strategy for producing good outcomes even when dealing with entities (particularly collectives) that are much more capable than you. This is also true of deontology (see passage in HPMOR where Hermione keeps getting obliviated). I think consequentialism works pretty well in low-adversarialness environments, virtue ethics works in medium-adversarialness environments, and then deontology is most important in the most adversarial environments, because as you go from the former to the latter you are making decisions in ways which have fewer and fewer degrees of freedom to exploit.
Hopefully much more on all of this soon, but thank you for inspiring me to get out at least a rough set of pointers.
I don’t see why you’d think that. The market for lemons model works perfectly in a bayesian framework. It’s very easy to model the ways that the adversary fucks you over. (They do it by lying to you about what state of the world you’re in, in a setting where you can’t observe any evidence to distinguish those states of the world.)
Corollary: You’d have an inconsistent definition of “paranoia” if you simultaneously want to say that “market for lemons” involves paranoia, and also want to treat “Bayesian uncertainty as a special case where you need zero paranoia”.
I agree with Richard here, in that the market for lemons model is really hard to extend to the full “open source game theory case”.
Proving this isn’t very hard. Bayesianism has a realizability assumption, but of course if you have an adversary that is smarter than you, you can’t model them, and so you can’t form a bayesian hypothesis about what they will do.
This then extends into logical uncertainty, and solutions to logical uncertainty produce things that are kind of bayesian but not quite (like logical induction), and how to actually translate that into real-life thinking feels still like a largely open question.
Yeah, I don’t disagree with anything in this comment. I was just reacting to the market for lemons comparison.
Ah, cool, makes sense!
Fair point. Let me be more precise here.
Both the market for lemons in econ and adverse selection in trading are simple examples of models of adversarial dynamics. I would call these non-central examples of paranoia insofar as you know the variable about which your adversary is hiding information (the quality of the car/the price the stock should be). This makes them too simple to get at the heart of the phenomenon.
I think Habryka is gesturing at something similar in his paragraph starting “All that said, in reality, navigating a lemon market isn’t too hard.” And I take him to be gesturing at a more central description of paranoia in his subsequent description: “What do you do in a world in which there are not only sketchy used car salesmen, but also sketchy used car inspectors, and sketchy used car inspector rating agencies, or more generally, competent adversaries who will try to predict whatever method you will use to orient to the world, and aim to subvert it for their own aims?”
This is similar to my criticism of maximin as a model of paranoia: “It’s not actually paranoid in a Knightian way, because what if your adversary does something that you didn’t even think of?”
Here’s a gesture at making this more precise: what makes something a central example of paranoia in my mind is when even your knowledge of how your adversary is being adversarial is also something that has been adversarially optimized. Thus chess is not a central example of paranoia (except insofar as your opponent has been spying on your preparations, say) and even markets for lemons aren’t a central example (except insofar as buyers weren’t even tracking that dishonesty was a strategy sellers might use—which is notably a dynamic not captured by the economic model).
I agree with almost all of this, though I think you don’t need to invoke Knightian uncertainty. I think it’s simply enough to model there being a very large attack surface combined with a more intelligent adversary.
See this section of my reply to Unnamed about some semi-formal models in the space:
Or maybe not full independence but very strong correlation. The details of this actually matter a lot, and this is where the current magic lives, but we can look at the “guaranteed output” case for now.
One of the problems I’m pointing to is that you don’t know what the attack surface is. This puts you in a pretty different situation than if you have a known large attack surface to defend, even against a smarter adversary (e.g. the whole length of a border; or every possible sequence of Go moves).
Separately, I may be being a bit sloppy by using “Knightian uncertainty” as a broad handle for cases where you have important “unknown unknowns”, aka you don’t even know what ontology to use. But it feels close enough that I’m by default planning to continue describing the research project outlined above as trying to develop a theory of Knightian uncertainty in which Bayesian uncertainty is a special case.
I’ve been thinking about this a lot recently. It seems we could generalize this beyond adversarialness to uncertainty more broadly: In a low-uncertainty environment, consequentialism seems more compelling; in a high-uncertainty environment, deontology makes sense (because as you go from the former to the latter you are making decisions in ways which rest on fewer and fewer error-prone assumptions).
However, this still feels unsatisfying to me for a couple of reasons: (1) In a low-uncertainty environment, there is still some uncertainty. It doesn’t seem to make sense for an actor to behave in violation of their felt sense of morality to achieve a “good” outcome unless they are omniscient and can perfectly predict all indirect effects of their actions.[1] And, if they were truly omniscient, then deontic and consequentialist approaches might converge on similar actions—at least Derek Parfit argues this. I don’t know if I buy this, because (2) why do we value outcomes over the experiences by which we arrive at them? This presupposes consequentialism, which seems increasingly clearly misaligned with human psychology—e.g., the finding that maximizers are unhappier than satisficers, despite achieving “objectively” better outcomes, or the finding that happiness-seeking is associated with reduced happiness.
Relating this back to the question of reasoning in high-adversarial environments, it seems to me that the most prudent (and psychologically protective) approach is a deontological one, not only because it is more robust to outcome-thwarting by adversaries but more importantly because it is (a) positively associated with wellbeing and empathy and (b) inversely associated with power-seeking. See also here.
Moreover, one would need to be omniscient to accurately judge the uncertainty/adversarialness of their environment, so it probably makes sense to assume a high-uncertainty/high-adversarialness environment regardless (at least, if one cares about this sort of thing).
Paranoia can itself constitute being fucked over by an agent that can induce it while not being weakened by it, a la cults encouraging their members to broadly distrust the world, leaving the cult as the only trusted source of information, or political players inciting purges that’ll fall on their enemies.
Yeah, this is a big thing that I hope to write more about. Like, a huge dynamic in these kinds of conflicts is someone reinforcing the paranoia, which then produces more flailing, causing someone’s environment to become more disorienting, which causes them to become worse at updating on evidence and become more disoriented, which then makes it easier to throw them off balance more.
Like, as is kind of part of this whole dynamic, the process you use to restabilize yourself in adversarial environments can itself be turned against you.
Nitpick:
Not necessarily. If you’re using a shoddy randomization procedure, a smarter opponent can identify flaws in it, and then they’d be better able to predict you than you can predict yourself. E. g., “vibes-based” randomization, where you just do whatever feels random to you at the moment, is potentially vulnerable in this way, due to your (deterministic) ideas regarding what’s appealing or what’s random.
Fair point! And not fully without basis.
I do think in reality, outside of a few quite narrow cybersecurity scenarios, you practically always have enough bits you can genuinely randomize against. I’ll still edit in a “probably”.
Maybe this is just me nitpicking, but as long as you limit your randomization process to a pool of allowed actions, you’re still largely predictable insofar as you predictably choose from within that pool of actions. For example, if you’re randomizing your actions on a digital market, you’re maybe deciding whether to buy or sell or hold or whatever, but you’re probably not considering a DDoS attack on the market etc. Or, even a human who tries to randomize their actions can’t randomly decide not to breathe so as to thwart their predictive adversary. And so on.
Yet another approach: A dictator surrounds himself with the people he used to know before he became the dictator, i.e. people, trust in whom was built in an environment where there was not yet much point in lying to him.
That’s the common explanation given, but from my understanding it’s at least partially incorrect: the original recommendations were due in part to not understanding that Covid was airborne. And that was because of a major communication error ~60 years ago from “Alexander Langmuir, the influential chief epidemiologist of the newly established CDC”, who had pushed back on airborne viruses being common as he thought the idea was too similar to miasma theory.
Langmuir seems to have finally accepted it after a period of time, but the focus was on <5 micron particles, and that got snatched out of context, so that the idea that only particles smaller than that can be airborne became the medical gospel.
At least, that’s if this Wired article about the scientists who tried to convince the WHO that Covid was airborne is accurate.
Here’s a LW discussion thread of this article from back then, including a review comment I made with some extra info which I think still holds up well.
Another reason to doubt the “noble lie” theory is that it was easy to make masks out of commonly available materials like cloth.
The real reason that requests to save PPE for health care workers were made was that these workers needed N95 respirators to protect themselves from procedures (like intubation) that were already known to aerosolize (and thus make airborne) any respiratory virus or bacteria, and surgical masks were needed for protection during medical procedures unrelated to Covid.
It occurs to me this is connecting to the concepts in From Personal to Prison Gangs: Enforcing Prosocial Behavior. That post notes how dramatically increasing the number of prisoners in a prison means the prisoners have a much harder time establishing trust and reputation, because there’s too many people to keep track of. The result is prison gangs: gang leaders are few enough they can manage trust between each other, and then they are responsible for ensuring their gang members follow “the rules.”
I just want to note that there doesn’t need to be anything adversarial about that. I see this sentence as a very on-point articulation of what is going on in people’s minds when they are angry about technological or societal changes.
This comment motivated me to crosspost Orient Speed in the 21st Century from my other blog.
Really? I thought only medical/legal/financial professionals have to write “not a medical/legal/financial advice” disclaimers. (I’m not from US)
My understanding is that the way it works is that you accept a non-trivial amount of liability even if you say that it’s not medical or legal advice, but it usually isn’t worth it. Beyond that, many of the world’s institutions have limited themselves to only accepting “official” legal/medical/financial advice, and so this kind of information has a lot of trouble propagating.
I agree it’s not the case you can’t talk about these things at all! I might clean up the wording to make that more clear.
This treatment suffers from being framed entirely in terms of loss-aversion rather than in positive terms that balance costs against benefits. An important remedy to spies and subversion is above-board investigative processes like open courts. But then you have to be thinking in terms of the info and outcomes you want, not just in terms of who might get one over on you.
Hmm, I feel like most of my motivating examples are symmetric in costs and benefits?
In the lemons market you are trying to either buy or sell a car at a price that makes it worth it.
In the OODA loop dogfighting example you are trying to get in a position to take down your opponent
I agree that open courts are among the best tools for dismantling paranoia-inducing environments. My ideal post would have a whole section on those! I just didn’t have the time to cover it all, and am hoping to write more about courts and above-board investigative processes in future posts.
The lemon market example seems poorly motivated. Presumably many buyers want to get around conveniently, and they have to decide whether any given offer is the right solution to that problem. If the quality of available cars declines all-around because lemons drive the good sellers out, that’s bad news, but doesn’t mean buyers should be spending more effort sussing out lemons; they should only be doing this if they think there are likely non-lemons available!
Sure! I don’t think I said anywhere that buyers should be spending more effort sussing out lemons? The lemon market example is trying to introduce a simple toy environment in which a transition from a non-adversarial to an adversarial information environment can quickly cause lots of trades to no longer happen and leave approximately everyone worse off.
At least my model of the reader benefits from an existence proof of this, as I have found even this relatively simple insight to frequently be challenged.
“Buying a car in a lemons market is a constant exercise of trying to figure out how the other person is trying to fuck you over” would seem to be saying that the more lemons there are, the more effort buyers should be spending sussing out lemons.
Yeah, I am not enthused about that paragraph, honestly. I do still think it tracks, but it’s not great.
Like, in that section of the post, I am trying to make a conceptual jump from the simple economic model to the lived experience of what it’s like to be an actor within that economic model (in the above example I should have said “buying a car in a market that is transitioning from a peach market to a lemons market”). Economic models generally do not cover what the algorithms that produce the outcome feel like from the inside, but in this case, it felt like an important jump to make.
I didn’t intend to communicate that thereby the classical lemons market model predicts that you should spend more resources trying to figure out which cars are peaches and lemons. Indeed, in the classical lemons market model one of the core assumptions is that there is no way for you to tell. I do cover the slightly adjusted case of there being a costly inspection you can perform to tell whether a car is a lemon or a peach a few paragraphs later.
IDK, maybe that paragraph is bad. I am not super attached to it. It was the result of specific user feedback of someone being like “man, you talk about all of this abstractly, but clearly a lot of the point of this post is to talk about the internal experience of being in one of those situations, and I feel like I need more pointers towards that”.
I immediately thought about warranties. It’s not a perfect solution, but maybe if you buy a used car with a warranty that will cover possible repairs, you could feel a bit safer, assuming the dealer doesn’t disappear overnight? Or at least, it can reduce your problem from inspecting a car to inspecting a textual contract: for example, running it through an LLM to find potential get-out-free clauses. And the same kind of solutions can apply to lemon markets more generally.
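Something like this minimal sketch, assuming the OpenAI Python client; the model name, the warranty.txt file, and the prompt are just placeholders for illustration, not a recommendation:

```python
# Sketch: ask an LLM to flag clauses in a used-car warranty that could let the
# seller off the hook. Assumes the OpenAI Python client; the model name, file
# name, and prompt wording are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("warranty.txt") as f:
    contract_text = f.read()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "You are reviewing a used-car warranty on behalf of the buyer."},
        {"role": "user",
         "content": "List any clauses that could let the seller avoid paying for "
                    "repairs (exclusions, claim deadlines, 'as-is' language), "
                    "quoting each clause verbatim:\n\n" + contract_text},
    ],
)

print(response.choices[0].message.content)
```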
Curated. I think it’s long been a problem that LessWrong doesn’t have great models of how to handle adversarial situations. I’ve been wanting Habryka to write up their thoughts on this for a while.
I think this post is more about re-establishing some existing concepts that have already been in the water supply (as opposed to adding something new), but it does a good job of introducing them in a way that sets up the problem, with a kind of practical mindset. It does a good job of motivating why you’d want a more fleshed-out model for thinking about this, and, I think, a decent job of conveying some default options people have in today’s world, and why those options aren’t sufficient.
I was glad to see some comments discussing “how would we build a formal epistemology, that explicitly incorporates adversarial action? What’s the current state of the art, and what are the obstacles to moving forward with that?”
I think the penultimate paragraph “Do not be the kind of actor that forces other people to be paranoid” is very important. I maybe wish it got a bit more signposting. I’m guessing/hoping there will be future posts digging more into it. I think for this essay, it’d be nice if that part had a section header that at least made that final takeaway stand out a bit more.
I don’t know if this was intended, but up until the end I was reading this post as saying, in this paragraph, that the variance is explained by people not being paranoid enough, or not being paranoid in the right way, and that this is why the post explains how to be paranoid properly.
I do also think that people suck at being paranoid in the right way, but it’s a tricky problem.
I am hoping to write more about how to be paranoid in the right way, or avoid paranoia-inducing environments (my post yesterday was pointing at one such thing), but it’s something I am somewhat less confident in than the basic dynamics.
I don’t comment a lot, but I felt this one was definitely worth the read and my time.
While I don’t necessarily agree with every aspect, much of this resonated with how I see social media: it has warped (or been warped) from a regular market for social connection into a lemon market, where the connection on offer is crappy, and many sane people I know are blinding themselves to it, leaving behind in some corners a cesspool of the dopamine-hit addicted.
Ultimately, this also seems to be true about how people have responded to the latest wave of human-rights initiatives (DEI) carried into the workplace by HR departments, where a small number of bad actors have capitalized on the overall naive assumption that “supporting the underdog is a good thing to do.”
The predictability of human behavior creates an attack surface for actors who can find ways to extract value from this fact, and this will certainly apply to how humans interact with AI. I found it interesting that on the same day this article hit my email inbox, Bruce Schneier’s Crypto-Gram (monthly newsletter) also contained a reference to the OODA loop and the adversarial attempt to “get into” one’s enemy’s loop in order to exert control and win.
Our naive trust, as a consumer base, in the moral neutrality of LLMs remains unchanged; it is only a matter of time until some actors find a near-perfect attack surface to get far deeper into our decision-making than social media ever could…
Good post.
A thing that’s not mentioned here but is super salient to my personal experience of navigating an untrustworthy world is reputation-based strategies—asking your friends, transitive trust chains, and to some extent ratings/reviews/forum comments are this-ish too. This is perhaps a subset of “blind yourself” (I often do kind of blind myself to anything not recommended by friends, when I can afford to). And I guess this kind of illustrates how, even though we’re in an age where anyone can in principle access any information and any thing, it’s common to restrict oneself to what one can access through one’s personal network, as though we were village-dwellers of centuries past.
I’d enjoy a follow-up post going into more detail on this.
Yep, reputation-based strategies are a huge part of how to navigate this. I have lots of thoughts about it, and hope to write more about it!
...also I think it is in part my paranoia-based epistemic strategies that cause me to violently bounce off of a lot of LLM output, which is a pretty disabling amount of blinding myself for the current environment, I guess
Eliezer’s two posts on security mindset seem especially relevant here. I’d say you need what I might call generalized security mindset to deal with more domains becoming filled with intelligent adversaries.
My vague model of Habryka didn’t find those very helpful for thinking about this problem.
Explain?
I think he said to me that he disagrees with them on their practical utility.
Another great example from Tarski, Frechet, and Lebesgue here!
Great post, with lots of application for startups, especially in how you choose which ideas to work on. I’ve developed a “claim” philosophy in contrast to the idea of “positioning”: the latter is a passive activity, something that gets assigned to you by the market instead of something you push onto the market, e.g., how Slack positions itself (workplace productivity) versus how people actually use it and talk about it (a chat app for work). This is also why it’s notoriously difficult to find the right positioning: you’re essentially trying to guesstimate what the market thinks of you, and in the case of new startups there is no market yet, so it gets even harder to pick the right positioning.
That led me to the idea of making a claim instead of positioning. An entrepreneur is basically someone who looks at the status quo in some domain and claims, “I bet I can do that better (in some way)”. For example, Toyota’s claim can essentially be boiled down to “We make the most durable cars” (in contrast, their positioning would likely be “Car manufacturer for suburban dads” or similar).
For a long time I’ve struggled to communicate exactly what makes for a good claim, since you can claim anything, but your breakdown of the OODA loop plus the three strategies helps a lot. If you tune out the misleading sources of information (i.e., hype gurus on YouTube or LinkedIn), become unpredictable by acting on signals that no one else in the market has received yet (e.g., reading books from the 1800s instead of TechCrunch articles), and take actions that are hard for your (would-be) competitors to take (e.g., building foundational tech instead of wrappers), you may be onto something.
My framework is not very rigorous yet, but you’ve given me some homework to do :)
Great post—enjoyable read and connected some concepts I hadn’t considered together before.
The first thing that immediately comes to mind when I think about how to act in such an environment is reputation: trying to determine which actors are adversarial based on my (or others’) previous interactions with them. I think I would try this before resorting to the other three tactics.
For example, before online ratings became a thing, chain restaurants had one significant advantage over local restaurants: if you were driving through and needed a place to stop and eat, the chain might have had an expected quality worse than the local restaurant’s, but at least you knew, because of its reputation, that you weren’t going to get sick.
And now that we have Yelp, I can just look there and see which restaurants consistently get 4-5 stars and no complaints of food poisoning, chain or not.
Of course, if a restaurant were really trying to deceive you, it could bury negative reviews in thousands of bot ratings (I actually don’t know whether this is truly possible, given the moderation on many rating platforms). And when the environment gets more adversarial and the benefits of deceiving you become higher (or the costs of taking the honest route become higher), I imagine a reputation-based filter like this could be easily thwarted. So this feels like an intuitive and lower-cost first line of defense, but not a reason to fully retire the big guns that you talked about in this post.
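As a toy illustration of that worry (entirely made-up numbers): a naive average-star filter gets swamped by a flood of fake five-star reviews, while a filter that also scans review text for specific red-flag complaints at least degrades more gracefully.

```python
# Toy illustration with made-up data: a naive average-star filter vs. one that
# also checks review text for specific red-flag complaints.
honest_reviews = [(2, "got food poisoning here"), (3, "mediocre"), (2, "never again")]
bot_reviews = [(5, "amazing!!!")] * 500  # an adversary floods the listing with fake ratings

def naive_filter(reviews, threshold=4.0):
    """Pass any restaurant whose average star rating clears the threshold."""
    avg = sum(stars for stars, _ in reviews) / len(reviews)
    return avg >= threshold

def red_flag_filter(reviews, threshold=4.0, red_flags=("food poisoning", "sick")):
    """Same as above, but also reject if any review mentions a red-flag phrase."""
    avg = sum(stars for stars, _ in reviews) / len(reviews)
    flagged = any(flag in text.lower() for _, text in reviews for flag in red_flags)
    return avg >= threshold and not flagged

all_reviews = honest_reviews + bot_reviews
print(naive_filter(all_reviews))     # True: the bot flood buries the low ratings
print(red_flag_filter(all_reviews))  # False: one credible complaint still blocks it
```

Of course, a determined adversary can also drown out or suppress the complaint text itself, which is exactly the escalation I mean by “easily thwarted”.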
Yeah, totally agree. In general, group dynamics around this kind of adversarial stuff are something I was only able to get a bit into.
That said, I was hoping to include this kind of strategy under the broad umbrella of “You try to purge the untrustworthy”. Like, the less intense version of that is to just try to surround yourself with more trustworthy people, and generally increase the degree to which you have good measurements of trustworthiness.
Trouble starts brewing when your adversary can predict what you are going to do before you, yourself, even know what you are going to do. Which, as luck would have it, is likely to be an increasingly common feeling in the AI-dominated world.
In Reddit’s legal advice forum, commenters just proclaim that their advice is not legal advice, whether they are lawyers or not. Sometimes they recommend getting a lawyer instead of giving advice.
Get outside sources! Like public reputation.
Looking forward to more on paranoia.
One of my friends—who was the target of a vicious online witch hunt over their political beliefs—eventually adopted this strategy while vetting new members for their Discord server. Entryism (real or imagined) creates a multipolar trap where both sides are maximally insulated against outside beliefs.
In practice, the way this problem is often solved nowadays is to find third-party internet forums where people can leave honest reviews that can’t be censored easily—such as Google Maps reviews, reviews on Reddit, Glassdoor job reviews, and so on.
Google and Reddit can’t be trusted to be censorship-free either, but the instances of censorship there are often various governments (China, the US, Russia, etc.) demanding censorship, as opposed to your ice cream seller demanding it.
Mass violence, especially collusion between various parties (governments, religious institutions, families) to apply violence, is what makes information really censored, to the point where entire populations can be repressed and then made to love their repressors.
I think censorship-resistant social media platforms are one important piece of solving this. I think leaking the secrets of powerful actors who use violence to censor others is another important piece.
Makes sense! My personal preference is to openly declare who my enemies are, and openly take actions that will cause them to suffer. I’m much less keen on the cloak-and-dagger strategy that is required to make someone paranoid and then exploit said paranoia. Because I tend to openly declare who my enemies are, people who are not openly declared as my enemies can find it relatively easier to trust, or at least tolerate, me in their circles.
I think fundamentally the world is held together by threats of mass violence, be it threats of nuclear war at a geopolitical level, or threats of mass revolt by armed citizens at a domestic level. Hence I think trying to avoid all conflict is bad—often conflict theory is the right approach and mistake theory is the wrong approach.
I support more people on lesswrong writing about how best to fight conflicts and win, rather than on how to avoid conflicts entirely.
P.S. If you liked this comment, you should check out my website; a lot of my writing focuses explicitly on topics like this one.
The entirety of LessWrong is functionally an if-by-whiskey, designed to avoid ever making a concrete claim or commitment while maximally developing valid logic with general application. Let’s just leave it lying around. Whoever wins the hard power games can tell us what it meant later! What are we actually talking about ahead of time? Who knows! For some reason this is maximally trustworthy to very rational people, and something like the logic of Teller, or Schelling, is nonsense. If you were to infer the actual consequences of LessWrong posts on conflict from what they would say about intent if made deliberately, it would be something like “we assume an asymmetrical position of unassailable strength, and expect to prosecute all further conflict as a mix of war-on-terror-style tactics and COINTELPRO tactics, and our bluster about existential risk is about concealing our unilateral development and deployment of technologies with this capacity”. This logic is wrong, and that is why, when there is a conflict, billions will die.
Could you please give an example of which posts you are talking about?
Give me a week to review both the material my impression is about and whether being explicit is more likely to be helpful or harmful. This is obviously itself a paranoid-style outburst, and the act of becoming legible would be both a statement (probably, paradoxically, a rejection) of its utility and a possible iteration in an adversarial game.
Here’s a single thought as I mull this over: a perfected, but maximally abstracted, algorithm for science is inherently dual-use. If you are committed to actually doing science, then it’s taken for granted that you are applying it to common structures in reality, which are initially intuited by broadly shared phenomenological patterns in the minds of observers. But to the extent the algorithm produces an equation, it can be worked from either end to create a solution for the other end. So knowing exactly how scientific epistemics work, in principle, offers a path to engineering phenomenological patterns to suggest incorrect ontology. This is what all stage magic already is for vulgar epistemology (and con artistry and “dark epistemics” generally); this is already something people know how to do and profit from knowing how to do. It is, in a sense, something that has already always been happening.

There is a Red Queen race between scientific epistemics and dark epistemics, and in this context, LessWrong seems to be trying to build some sort of zero-trust version of scientific epistemics, but without following any of the principles necessary for that. Many of the forum’s practices are extremely rooted in cultural norms around trust, about professionalism and politeness, and about obfuscating things when they reduce coordination rather than when they reflect danger. This is a path to coordination and not truth. Coordination overwhelmingly incentivizes specific forms of lossy compression that are maximally emotionally agreeable to whoever you want to coordinate with, and lossy compression is anti-scientific. Someone who had both a perfected scientific algorithm and a map derived from it of which lossy compressions maximize coordination would basically become a singleton, in direct proportion to their ability to output information over the relevant surface area.
This is the central thing. There are also a bunch of minor annoying things that at times engender perhaps disproportionate suspicion. And there’s the fact that I keep going to you guys’ parties and seeing Hanania, Yarvin, and other entirely unambiguous assets of billionaires, intelligence agencies, legacy patronage networks, and so forth, many of whom have written tens of thousands of words publicly saying things like “empathy is cancer”, quoting Italian futurists, Carl Schmitt, and Heidegger (and not even the academically sanitized version of Heidegger), and talking about how sovereignty is the only real human principle and rights and liberty are parasitic luxury beliefs, while simultaneously, continuously, endlessly lying, at scale, pouring millions of dollars into social mechanisms for propagating lies, and building trillions of dollars worth of machines for outputting information. And these guys are all around you, happily trafficking in all the same happy speech about science and truth and so forth, reading your writings, and, again, building trillions of dollars worth of machines to output information. And there are plausible developments wherein these machines become intelligent and autonomous and literally tile the entire accessible universe with whatever hodgepodge of directives they happen to have at one moment, which is fast approaching, in this specific aforementioned context. But it’s rude to be specific and concrete about your application of formalism, because what we need is more coordination, especially with the people with committed anti-scientific principles, because they have figured out the equivalent of the epistemic ritual of blood sacrifice (which we are opposed to, in theory), so let’s just keep doing formalism.
What I mean by zero trust is that you’re counting on the epistemic system itself saving you from abuse of epistemics. This is impossible, because by itself it’s just syntax. The correlation between symbol and object is what matters and that remains a weak spot regardless of how many formal structures you can validate to map to how many abstract patterns. There is a reason that history degrades to the verisimilitude of story.
If anyone cares, I will review the Sequences and whatever LessWrong posts are in my email inbox history starting tomorrow, but this will again just be a “list of things that annoyed me into making inferences about things almost never captured by the literal text”, and I’m not sure how valuable that is to anyone.
OK, so you’re talking about the conjunction of two things. One is the social and political milieu of Bay Area rationalism. That milieu contains anti-democratic ideologies and it is adjacent to the actual power elite of American tech, who are implicated in all kinds of nefarious practices. The other thing is something to do with the epistemology, methodology, and community practices of that rationalism per se, which you say render it capable of being coopted by the power philosophy of that amoral elite.
These questions interest me, but I live in Australia and have zero experience of the 21st century Bay Area (and of power elites in general), so I’m at a disadvantage in thinking about the social milieu. If I think about how it’s evolved:
Peter Thiel was one of the early sponsors of MIRI (when it was SIAI). At that time, politically, he and Eliezer were known simply as libertarians. This was the world before social media, so politics was more palpably about ideas…
Less Wrong itself was launched during the Obama years, and was designed to be apolitical, but surveys always indicated a progressive majority among the users, with other political identities also represented. At the same time, this was the era in which e.g. Curtis Yarvin’s neoreaction began to attract interest and win adherents in the blogosphere, and there were a few early adopters in the rationalist world, e.g. SIAI spokesperson Michael Anissimov left to follow the proverbial pipeline from libertarianism to white nationalism, and there was the group that founded “More Right”, specifically to discuss political topics banned from Less Wrong in a way combining rationalist methods with reactionary views.
Here we’re approaching the start of the Trump years. Thiel has become Trump’s first champion in Silicon Valley, and David Gerard and the reddit enemies of Less Wrong (/r/sneerclub) have made alleged adjacency to Trump, Yarvin, and “human biodiversity” (e.g. belief in racial IQ differences) central to their critique. At the same time, I would think that the mainstream politics actually suffusing the rationalist milieu at this time, is that of Effective Altruism, e.g. the views of Democrat-affiliated Internet billionaires like Dustin Moskovitz and Sam Bankman-Fried.
Then we have the Covid interlude, rationalists claim epistemological vindication for having been ahead of the curve, and then before you know it, it’s the Biden years and the true era of AI begins with ChatGPT. The complex cultural tapestry of reactions to AI that we now inhabit, starts to take shape. Out of these views, those of the “AI safety” world (heavily identified with effective altruism, and definitely adjacent to rationalism) have some influence on the Biden policy response, while the more radical side of progressive opinion will often show affinity with the “anti-TESCREAL” framing coming from Emile Torres et al.
Meanwhile, as Eliezer turned doomer, Thiel has long since distanced himself, to the point that in 2025, Thiel calls him a legionnaire of Antichrist alongside Greta Thunberg. Newly influential EA gets its nemesis in the form of e/acc, Musk and Andreessen back Trump 2.0, and the new accelerationist “tech right” gets to be a pillar of the new regime, alongside right-wing populism.
In this new landscape, rationalism and Less Wrong still matter, but they are very much not in charge. At this point, the philosophies which matter are those of the companies racing to build AI, and the governments that could shape this process. As far as the companies are concerned, I identify two historic crossroads, Google DeepMind and the old OpenAI. There was a time when DeepMind was the only visible contender to create AI. They had some kind of interaction with MIRI, but I guess you’d have to look to Demis Hassabis and Larry Page to know what the “in-house” philosophy at Google AI was. Then you had the OpenAI project, which continues, but which also involved Musk and spawned Anthropic.
Of all these, Anthropic is evidently the one which (even if they deny it now) is closest to embodying the archetypal views of Effective Altruism and AI safety. You can see this in the way that David Sacks singles them out for particularly vituperative attention, and emphasizes that all Biden’s AI people went to work there. OpenAI these days seems to contain a plurality of views that would range from EA to e/acc, while xAI I guess is governed autocratically by Musk’s own views, which are an idiosyncratic mix of anti-woke accelerationism and “safety via truth-seeking”.
Returning to the rationalist scene events where you see reactionary ideologues, billionaire minions, deep-state specters, and so on, on the guest list… I would guess that what you’re seeing is a cross-section of the views among those working on frontier AI. Now that we are in a timeline where superintelligence is being aggressively and competitively pursued, I think it’s probably for the best that all factions are represented at these events, it means there’s a chance they might listen. At the same time, perhaps something would be gained by also having a purist clique who reject all such associations, and also by the development of defenses against philosophical cooptation, which seems to be part of what you’re talking about.
Thank you, I can’t find anything to complain about in this response. I am even less sympathetic to the anti-TESCREAL crowd, for the record; I just also don’t consider them dangerous. LessWrong seems dangerous, even if sympathetic, and even if there’s very limited evidence of maliciousness. Effective Altruism seems directionally correct in most respects, except maybe at the conjunction of dogmatic utilitarianism and extreme longtermism, which I understand to be only a factional perspective within EA. If they keep moving in their overall direction, that is straightforwardly good. If it coalesces at a movement level into a doctrinal set of practices, that is bad, even if it gains them scale and coordination. I think Scott Alexander (not a huge fan, but whatever) once said that the difference between a rational expert and a political expert is that one could be replaced by a rock with a directive on it saying to do whatever actions reflect the highest unconditioned probability of success. I’m somewhere between this anxiety, the anxiety that hostile epistemic processes exist which actively exploit dead players, and the anxiety that LessWrong in particular is on track to, at best, multiply the magnitude of the existing distribution of happinesses and woes by a very large number and then fix them in place forever, or, at worst, arm the enemies of every general concept of moral principle with the means to permanently usurp it (leading to permanent misery or the end of consciousness).
I know you have a lot of political critics who do not really engage directly with ideas. I have tried, to an extent that I am not even sure is defensible, to always engage directly with ideas. And my perspectives probably can be found as minority perspectives among respected LessWrong members, but each individual one is already an extreme minority perspective so even the conjunction of three of them probably already doesn’t exist in anyone else. But if I could de-accelerate anything it would be LessWrong right now. It’s the only group of people who would consensually actually do this, and I have presented a rough case for the esoteric arguments for doing so. It’s the only place where the desired behavior actually has real positive expectation. With everything else you just have to hope it’s like the Nazi atomic bomb project at this point and that their bad philosophical commitments and opposition to “Jewish Science” also destroy their practical capacity. You cannot talk Heisenberg in 1943 into not being dangerous. If you really want him around academically and in friendly institutions after the war that’s fine, honestly, the scale of issues is sufficient that it just sort of can’t be risked caring about, but at the immediate moment that can’t be understood as a sane relationship.
What would it mean to decelerate Less Wrong?
Trying to stay focused on things I’ve already said, I guess it would just mean adopting any sort of security posture towards dual-use concepts, particularly with regard to the attack of entirely surreptitiously replacing the semantics of an existing set of formalisms to produce a bad outcome, and also de-emphasizing cultural norms favoring coordination in order to focus more on safety. It’s really just like the lemon market for cars → captured mechanics → captured mechanic certification thing: there just needs to be thinking about how things can escalate. Obviously increased openness could still plausibly be a solution to this in some respects, and increased closedness a detriment. My thinking is just that, at some point, AI will have the capacity to drown out all other sources of information, and if your defense against this is “I’ve read the Sequences”, that’s not sufficient, because the AI has read the Sequences too. So you need to think ahead to “what could AI, either autonomously or in conjunction with human bad actors, do to directly capture my own epistemic formulas, overload them with alternate meanings, then deprecate the existing meanings?” And you can actually keep going in paranoia from here, because obviously there are also examples of doing this literal thing that are good, like all of science for example, and therefore there are not just people who will subvert credulity here but people who will subvert paranoia.
I guess the ultra concise warning would be “please perpetually make sure you understand how scientific epistemics fundamentally versus conditionally differ from dark epistemics so that your heuristics don’t end up having you doing Aztec blood magic in the name of induction”
In another post made since this comment, someone did make specific claims and intermingle them with analysis, and it was pointed out that this can also reduce clarity due to the heterogeneous background assumptions of different readers. I think the project of rendering language itself unexploitable is probably going to be more complicated than I can usefully contribute to. It might not even be solvable at the level I’m focused on; I might literally be making the same mistake.
Just before the January 6th riots, Ray Epps was inciting people to go into the Capitol, and lots of people accused him of being a Fed (i.e., an undercover agent).
Now, you can’t prove that, but the game theory is something like:
A) A federal agent is trying to incite you into committing a crime, and you’re going to be arrested if you go along with it, or
B) He’s an idiot,
…and you don’t know which. Regardless of which it actually was, the prudent thing to do here is to assume it might be A.
FWIW, this comment isn’t up to the standards of comments I want on my post (and my guess is also LW in general, though this isn’t a general site-wide mod comment). Please at least get the formatting right, but even beyond that, it seems pretty naively political in ways that I think tend to be unproductive. (I considered just deleting it, but it seemed better to leave a record of it.)
Go ahead, delete it if you don’t think it was a good comment.
It was an honest attempt to think of instances of paranoid uncertainty (protestors don’t know if other protestors are acting in good faith), but sure, delete it if you think it wasn’t up to standard.
All good! I believe you that you were commenting in good faith.
Regarding formatting: I do really recommend you switch out your double paragraph breaks for single paragraph breaks. It looks somewhat broken.
Yes, this does capture something of the experience of the Covid pandemic.
Early on, there was genuine uncertainty: questions to which no one really knew the answer.
e.g. How infectious? How deadly? Is it transmissible via surfaces? The answers to some of these questions became apparent fairly quickly.
I can remember listening to virology experts talking about the idea that viruses tend to evolve to be more infectious but less deadly. Expert opinion at the time: would be nice, but not necessarily true for all viruses. As it happened, we got lucky and it did evolve to be more infectious and less deadly.
Then we get to the point where you start to doubt what the government is saying.
Do masks work? Initially, the story was that they weren’t effective. Then masks were compulsory. Was that change motivated by new knowledge? Unclear. My best guess is that the original version was right, and masks are ineffective. But I have low confidence in this.
Was it a lab leak? Was the government being honest here? Etc.
So I can see why it undermined public trust.