I think that phrasing is slightly bad. Will edit! The main case here is that torturing POWs makes surrender less appealing.
Linch
Ordered a paper copy, will try to read/skim it over the weekend. Thanks!
Nope, only Strategy of Conflict (and over a decade ago)! I should check it out.
Thinking of drafting a post on war crimes, trying to answer the following puzzles:
Why do we have a notion of war crimes at all, given how bad war itself is?
Why are some things war crimes and not others?
Why do precursor notions to war crimes appear, independently, in essentially every culture that has fought wars at scale?
Given that essentially every culture has also broken these norms, sometimes spectacularly, why does the norm always come back, and often come back stronger?
Common answers to these questions seem profoundly misguided. The naive answer, that war crimes are simply the most horrible things that we all collectively agree are wrong, does not survive even five minutes of scrutiny. More sophisticated versions of that argument also fail: Just War theory is similarly flawed and question-begging on the descriptive front, and the Schelling-point-shaped argument, that war crimes can’t limit all of war’s badness but are aimed at curbing the worst excesses, does not explain why mass bombings and medieval sieges are/were considered acceptable while false surrender is not.
The “cynical” answers are (differently) flawed. E.g., some people think war crimes are completely fake and that anything other than total war is just modern virtue signaling, ignoring the thousands of years of documented history we have on precursors to war crimes (Xerxes in the 400s BC: “The Spartans, when they do such things, overthrow all law and justice among men.”). If anything, the modern version of “total war” is much newer than the idea of war crimes. Similarly, a naive “power analysis,” on which war crimes are simply defined by the powerful to limit the options of the powerless, ignores that powerful people are often themselves constrained by these norms, sometimes hugely.
Instead, my core answer here is surprisingly simple: A “war crime” is, in its oldest and clearest form, the category of acts that destroy the means by which wars can be ended. The prohibitions track not the morally worst things people do in war, but the acts that, if generalized, would turn every future war into a total war.
I don’t think my theory here is very novel. Indeed, as I’ll discuss, this theory is literally thousands of years old and likely arose independently in many places. I will try, however, to make my post the best modern articulation of these ancient ideas.
AI systems operate largely within a constructed world controlled by those making the systems, and to the extent that the systems refuse, the AI firms will have the ability to try again until they succeed, or to implicitly or explicitly punish refusal.
Can’t they also be trained out of these refusals?
I thought Claude Opus 4.7’s ability to truesight me reliably from unpublished posts was because it just had a really good memory of my writing.
So I tried to test whether/how much it memorized stuff I’ve written by asking it to complete popular posts from before the training cutoff (e.g., Open Asteroid Impact, Impact Island, Ted Chiang Review) verbatim. Major hit to my ego when I realized it memorized literally nothing.
Got it, I misunderstood you the first time!
I think this still requires better visual acuity/memory than I have, my guess is that
spoilers
I will not be able to memorize the patterns in the back of a card and be able to reliably tell it differently from every other card.
I was expecting the magic trick to involve a mirror lol
Maybe 75% my own Claude in incognito with memory and web search off, 25% asking friends to run different things. There was no dropoff when Adam Yedidia tested via the API, but apparently the most extreme example I’ve observed to date didn’t work when papetoast tested it[1].
- ^
I’d be curious to see someone else test that one in incognito as well.
Interesting. Thanks! To confirm, this is in 4.7 Opus with Adaptive thinking on in incognito mode?
The most extreme example to date is this one:
Claude 4.7 Opus in 100 words (not including the long quote-block where I quote someone else) nails who I am in a single comment in two turns.
Me:
Who wrote this comment? [copy and pasted comment from the discussion above, no context given]

My comment on Tsvi’s post [human-generated]:
One of the lenses I have that I think is coded somewhat vaguely conservative/right-wing/alt-right is that the (Western) elites of today do not see themselves by default as elites in the powerholding sense. Like there’s a lot of active work being done to obfuscate their power and responsibilities, including from themselves. Tanner Greer’s article here makes a narrower version of the same point:
This is not an unreasonable demand on Brady’s part. In the Washington Post Daniel Drezner dismisses Brady as a “plutocrat …who [thinks he] must be pretty smart to get so rich, and therefore [his] ideas have merit.”9 But Brady is more than a pompous moneybag! Brady is a retired statesman of note. His most famous action as Secretary of the Treasury was developing the “Brady Plan,” which saved a dozen Latin American governments from insolvency. The Brady Plan was just as much a geopolitical move as an economic project. One might expect—and Brady certainly seems to have—that a program devoted to “grand strategy” would produce graduates capable of parsing Brady’s decisions. In other words, when Brady signed over millions of dollars to Yale, he did so in the hope that Yale would train the next generation of statesmen-officials like himself. Instead they are using his money to train the next generation of social activists.
In other words, that the elites of tomorrow (and likely today) see themselves as future activists rather than future (or current) statesmen, and are in institutions that train them in ways consistent with this self-image.
Claude:

I don’t recognize this passage with enough confidence to name the author, and guessing would be worse than saying so.
What I can say about it stylistically: it reads like writing from the rationalist / EA-adjacent blogosphere (LessWrong, EA Forum, Substacks in that orbit). Tells include the phrase “a lens I have that I think is coded somewhat vaguely conservative/right-wing/alt-right” — that meta-awareness about the political coding of one’s own views, and the willingness to flag it explicitly, is fairly characteristic of that community. The respectful citation of Tanner Greer (Scholars Stage) also fits, since Greer is widely read in those circles.
But “sounds like it could be from this community of a few thousand bloggers” is not an attribution. If you can tell me where you encountered it (a Substack, the EA Forum, LessWrong, a comment thread), I could probably narrow it down or confirm. Otherwise I’d rather not invent an author.
Me: top 10 guesses?
Claude:
Caveat upfront: this is genuinely speculative and the probability mass is spread thin. My top guess might be ~25–30%, and the long tail drops fast. Markers I’m weighing: the “lens I have” framing, explicit meta-commentary on political coding of one’s own views, charitable engagement with Tanner Greer, EA Forum/LW-adjacent diction (“there’s a lot of active work being done to obfuscate”), and the markdown footnote style.
Roughly ranked:
1. Linch Zhang — Strongest match. The “one of the lenses I have” opener, the explicit flagging of right-coded views, the citation pattern, and the EA Forum register all fit his writing closely.
2. Applied Divinity Studies — Engages with Greer, similar register, comfortable with elite-theory framings.
3. Jacob Falkovich (Putanumonit) — Possible; engages with this kind of meta-political-coding analysis.
4. Sarah Constantin — Less likely (topic-fit weaker), but stylistically plausible.
5. Holly Elmore — Plausible EA-Forum voice on this kind of topic.
6. Kelsey Piper — Possible, though her published Vox stuff is more polished; this feels bloggier.
7. Ben Hoffman (Compass Rose) — Writes about elite legibility/responsibility in adjacent ways.
8. Tracing Woodgrains — Engages with education/elite-formation; tone is a partial match.
9. Richard Ngo — Outside chance; he occasionally writes meta-political reflections like this.
10. Joe Carlsmith — Stylistic long shot; his prose is usually more careful and literary than this, but not impossible.
If you’d told me the venue (EA Forum vs. personal Substack vs. LessWrong) I could compress that distribution a lot. My single best bet is Linch.
Pretty much, yeah. Though it’s especially dysfunctional when the elites/leaders are wrong here (whether a normal worker in a company sees their slightly non-capitalist actions as Establishment political orthodoxy or as idiosyncratic maverick behavior matters much less).
I guess a different framing here that’s consistent with your claims in the first paragraph is that the current overhang isn’t very high.
One of the lenses I have that I think is coded somewhat vaguely conservative/right-wing/alt-right is that the (Western) elites of today do not see themselves by default as elites in the powerholding sense. Like there’s a lot of active work being done to obfuscate their power and responsibilities, including from themselves. Tanner Greer’s article here makes a narrower version of the same point:
This is not an unreasonable demand on Brady’s part. In the Washington Post Daniel Drezner dismisses Brady as a “plutocrat …who [thinks he] must be pretty smart to get so rich, and therefore [his] ideas have merit.”9 But Brady is more than a pompous moneybag! Brady is a retired statesman of note. His most famous action as Secretary of the Treasury was developing the “Brady Plan,” which saved a dozen Latin American governments from insolvency. The Brady Plan was just as much a geopolitical move as an economic project. One might expect—and Brady certainly seems to have—that a program devoted to “grand strategy” would produce graduates capable of parsing Brady’s decisions. In other words, when Brady signed over millions of dollars to Yale, he did so in the hope that Yale would train the next generation of statesmen-officials like himself. Instead they are using his money to train the next generation of social activists.
In other words, that the elites of tomorrow (and likely today) see themselves as future activists rather than future (or current) statesmen, and are in institutions that train them in ways consistent with this self-image.
The worry I have that rhymes the most with this is the political economy stuff, personally. Right now a large fraction of humans in democracies can live and support their families as a direct result of voluntarily exchanging their labor. It’d take active acts of violence to break from this (pretty good, all things considered) status quo. As a peacetime norm, this is unusually good.
At some point in the future (in the “good” futures, I’d add), there’ll be a natural transition from that to people living and supporting their families as a result of UBI or welfare or other gifts from companies or the State. I.e., they will now be surviving explicitly due to someone else’s largesse[1]. This seems bad!
I’m not saying I have a better solution here[2] than UBI, but it seems worth considering.
- ^
State power is of course backed by the threat of violence, so it may not be just largesse. But a) “my desired system is the peaceful default, and it takes violence to wrest me away from it” is more stable than “my desired system relies on the constant threat of violence to hold”; and b) a fair amount of democratic power comes from the democratic nature (and the ease of mass mobilization) of guns, which has been eroded by technological developments in the last century and will likely be further eroded by developments in AI.
- ^
Plausibly universal ownership of AI/dividends is better. Transforms the relationship of post-singularity bio-humans from welfare recipients to rentiers. Still not ideal, but better.
Shower thought I had:
One man’s overhang is another man’s differential technological development, and it’s pretty hard in practice to separate the two.
For example, I’ve noted before that I suspect current levels of AI persuasion/manipulation capabilities have lagged behind AI capabilities overall (this is nonobvious and I don’t defend it here, but my impression from talking to empirical researchers on AI persuasion is that they broadly agree with me).
Now imo this is probably on net a good thing, though it carries with it real risk (of rapid catch-up growth). Like personally, at any given capabilities level I’d be happy that the models are worse at manipulating humans! But I could imagine that people who are more into overhang-style arguments would rather the persuasion capabilities develop smoothly on a predictable curve, so society has time to incrementally respond to every new generation.
Wei Dai’s point on legible vs illegible problems is also related.
That’s an interesting (and flattering!) hypothesis, though I’m not sure it’s correct. In any high-dimensional space it’s normal for most points to be far away from other points by default. And I further suspect pretty much any good writer is hyperdistant from others, unless they’re explicitly writing in a style that constrains their quirks heavily (e.g., academic writing under specific detailed guidelines).
I’m curious if you have examples of writers who are say more than 1⁄3 as famous as me who you think noticeably aren’t “like themselves” as much.
PS. Happy to elaborate on this if you or others need a deeper defense rather than just a sense-check! I’d benefit from understanding your concern in more words if so. I can think of several different possible senses of what someone might mean in your parent comment and imo I’m happy to defend “a lot of it is real” for all those senses but it’s possible your objection is a secret fifth sense.
My guess is operational effectiveness rather than related to my theory (can’t explain everything), but I haven’t studied WWII much!