I suspect describing AI as having “values” feels more alien than “goals,” but I don’t have an easy way to figure this out.
The Matchless Match
[Image: “why not both?” meme]
Here’s my current four-point argument for AI risk/danger from misaligned AIs.
1. We are on the path to creating intelligences capable of being better than humans at almost all economically and militarily relevant tasks.
2. There are strong selection pressures and trends to make these intelligences into goal-seeking minds acting in the real world, rather than disembodied high-IQ pattern-matchers.
3. Unlike traditional software, we have little ability to know or control what these goal-seeking minds will do; we can only give directional input.
4. Minds much better than humans at seeking their goals, with goals different enough from our own, may end us all, either as a preventative measure or as a side effect.
Request for feedback: I’m curious whether there are points that people think I’m critically missing, and/or ways that these arguments would not be convincing to “normal people.” I’m trying to write the argument to lay out the simplest possible case.
Yeah I believe this too. Possibly one of the relatively few examples of the midwit meme being true in real life.
What are people’s favorite arguments/articles/essays trying to lay out the simplest possible case for AI risk/danger?
Every single argument for AI danger/risk/safety I’ve seen seems to overcomplicate things. Either they have too many extraneous details, or they appeal to overly complex analogies, or they seem to spend much of their time responding to insider debates.
I might want to try my hand at writing the simplest possible argument that is still rigorous and clear, without being trapped by common pitfalls. To do that, I want to quickly survey the field so I can learn from the best existing work as well as avoid the mistakes they make.
Ashli Babbitt, an unarmed <protestor or rioter, depending on your party affiliation>
I appreciate your attempt to be charitable, but I don’t think the left-wing/liberal concerns with Jan 6 are appropriately summarized as “riot.”
Alas, no, my model is rather limited.
I would not treat Dean Ball as a trustworthy actor on the object level, and certainly would not take any of his statements at face value! I think it’s much better to model him as a combination of a political actor saying whatever words cause his political aims to be achieved, plus someone willing to pursue random vendettas.
I’d recommend trying to talk to people 1:1, especially about topics that are more in their wheelhouses than in yours. At least I’ve found my average conversation with Uber drivers to be more interesting and insightful than reading my phone.
My guess is that I do this more than you do, but one thing I find unpleasant about interacting with large groups of people I don’t know well is that I wind up doing a bunch of semi-conscious theory-of-mind modeling, emotional-regulation-type management of different levels of a conversation, etc.,[1] so it’s harder for me to focus on the object level.[2] I think this is much less of a problem in 1:1 conversations, where maintaining the multilevel tracking feels quite natural.
[1] It’s unclear to me if I do this more or less than “normies.” The case for “less” is that I don’t think I’ve spent a lot of my skill points on people-modeling compared to other things. The case for “more” is that the people I interact with often have almost laughably simplistic or non-existent models of other people.
[2] I would not be surprised if I specifically happen to be in a midwit part of the curve, alas.
I personally find the “virtue is good because bounded optimization is too hard” framing less valuable/persuasive than the “virtue is good because your own brain and those of other agents are trying to trick you” framing. Basically, the adversarial dynamics seem key in these situations, otherwise a better heuristic might be to focus on the highest order bit first and then go down the importance ladder.
Though of course both are relevant parts of the story here.
Thanks, this is a helpful point! The second one has been on my mind re: assassinations, and is implicitly part of my model for uncertainty about assassination effectiveness. (I still think my original belief is largely correct, but I can’t rule out psyops.)
I often see people advocate others sacrifice their souls. People often justify lying, political violence, coverups of “your side’s” crimes and misdeeds, or professional misconduct of government officials and journalists, because their cause is sufficiently True and Just. I’m overall skeptical of this entire class of arguments.
This is not because I intrinsically value “clean hands” or seeming good over actual good outcomes. Nor is it because I have a sort of magical thinking common in movies, where things miraculously work out well if you just ignore tradeoffs.
Rather, it’s because I think the empirical consequences of deception, violence, criminal activity, and other norm violations are often (not always) quite bad, and people aren’t smart or wise enough to tell the exceptions apart from the general case, especially when they’re ideologically and emotionally compromised, as is often the case.
Instead, I think it often helps to be interpersonally nice, conduct yourself with honor, and overall be true to your internal and/or society-wide notions of ethics and integrity.
I’m especially skeptical of galaxy-brained positions where to be a hard-nosed consequentialist or whatever, you are supposed to do a specific and concrete Hard Thing (usually involving harming innocents) to achieve some large, underspecified, and far-off positive outcome.
I think it’s like those thought experiments about torturing a terrorist (or a terrorist’s child) to find the location of a ticking nuclear bomb under Manhattan, where somehow you know the torture would work.
I mean, sure, if presented that way I’d think it’s a good idea, but has anybody here checked the literature on the reliability of evidence extracted under torture? Is that really the most effective interrogation technique?
So many people seem eager to rush to sell their souls, without first checking to see if the Devil’s willing to fulfill his end of the bargain.
The 7 Types Of Advice (And 3 Common Failure Modes)
One dispositional difference between me and other people is that, if Bob says statement X that’s false and dumb, I’m much more likely to believe that Bob did not meaningfully understand something about X.
I think other people are much more likely to jump to “Bob actually has a deeper reason Y for saying X” if they like Bob or “Bob is just trolling” if they dislike Bob.
Either reason might well be true, but a) I think they are often not true, and b) even if they are, I still think Bob most likely didn’t understand X, even if X is not Bob’s true rejection, or if Bob would’ve found a different thing to troll about had he understood X better.
This is one of the reasons I find it valuable to proffer relatively simple explanations for complex concepts, at least sometimes.
Thanks, this is helpful.
Schelling’s specific point actually feels relevant to me, and like a blind spot among (at least some) rationalists or EAs when they talk about “conflict” vs “mistake” theory. I’ve recently thought about the “conflict vs mistake theory” framing some more, and I think it misses a lot of the lessons that are standard in, e.g., negotiation classes or bargaining theory, or international relations/game theory writ large.
I think a lot of the time a better position is something roughly like: “I have my interests and intend to pursue my own interests to the best of my ability. I respect you as an agent with your own interests, willing to pursue yours. Sometimes our interests come into conflict, and we take actions detrimental to each other. However, it is implausible that our interests are directly opposed, and there are often plausible gains from trade.”
A plausible example of mistake theory inhibiting gains from trade is when (supposedly) Obama often tried to lecture Republican lawmakers about their mistakes, instead of taking their interests as a given and trying to negotiate more.
Of course, conflict theory can also inhibit gains from trade, if it prevents people from coming to the negotiation table, or keeps them from noticing that bargaining is almost always a better option than war.
Thanks for the Wiki article; it was helpful to read!
Yeah, it’s hard to balance the examples well. The most common examples of being wrong about X are often not the most central/clean examples of being wrong about X. This was also an issue for me in the Theory of Mind examples (neurotypical adults have at least some ToM in the developmental psychology sense; some of the most common failures are more sophisticated ones, like the typical mind fallacy[1]; but in a sense, neither are the most interesting examples to bring up).
For me, an interesting example of Grice’s maxims not being fully integrated is this post, which argues that you need to understand postmodern philosophy to get why “Stating true facts sometimes makes people angrier rather than updating their beliefs,” whereas in practice I think in many (not all) cases, the people “stating true facts” and “just asking questions” that predictably make people angry are failing to integrate Grice’s maxims on a normative level, and/or have poor theory of mind on a descriptive level.
Obviously there’s more than one way to surmount a mountain, and continental philosophy has other teachings and benefits as well, so I don’t want to begrudge people too much for becoming better at strategic empathy and conversational pragmatics through continental philosophy rather than the tools I’m more familiar with. But it does feel like overkill to me, and unfortunately continental philosophy seems to shackle people with other commitments and attitudes.
[1] The problem with “typical mind fallacy” as a prototypical example of a cognitive error is that in many cases it can also be correctly described as a typical mind heuristic.
Yeah, after getting enough people tripped up by/upset at me for invoking IVT-like intuitions for discontinuous functions,[1] I suspect something like the above is the subtler point I should’ve led with. Elsewhere I wrote that I think part of the argument is that if you have a complicated distribution over a bunch of unknown discontinuous functions “in reality,” then from your epistemic state it would often essentially look continuous to you when you combine the probabilities together, and you should treat it as such.
I think your formalism is helpful/might aid in thinking more clearly, but I’m also worried people would jump on it if their uncertainty is slightly non-uniform (without noticing that changing the math a tiny bit only changes the bottom-line result a tiny bit).
In a lot of situations you can still treat things as locally linear despite non-global uniformity (see my third point, on differentiable functions being locally linear), but that argument is more about “negotiating price”; my first (IVT-inspired) point was about establishing that it’s possible to have an effect at all.
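To make the “a mixture of unknown discontinuous functions often looks continuous from your epistemic state” point concrete, here’s a minimal numpy sketch (my own toy illustration with a made-up threshold model, not anything from the original thread): each candidate world is a sharp step function that jumps at an unknown threshold, but averaging over your uncertainty about the threshold gives an expected effect that hugs the smooth line y = x, and tilting the prior slightly away from uniform only shifts the answer slightly.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_effect(x, thresholds, weights):
    """Expected effect at x under a mixture of step functions with uncertain jump points."""
    steps = (x >= thresholds[:, None]).astype(float)  # row i: the step function that jumps at thresholds[i]
    return weights @ steps                            # probability-weighted average over the mixture

xs = np.linspace(0.0, 1.0, 201)
thresholds = rng.uniform(0.0, 1.0, size=1000)         # epistemic uncertainty over where the jump is

# Uniform weights over the sampled thresholds: the mixture is close to the smooth line y = x.
uniform_w = np.full(1000, 1.0 / 1000)
mix_uniform = expected_effect(xs, thresholds, uniform_w)

# Slightly non-uniform weights: the bottom-line curve barely moves.
tilt = 1.0 + 0.1 * (thresholds - 0.5)                 # mild tilt away from uniform
tilted_w = tilt / tilt.sum()
mix_tilted = expected_effect(xs, thresholds, tilted_w)

print("max |mixture - x| under uniform uncertainty:", round(float(np.max(np.abs(mix_uniform - xs))), 3))
print("max shift from a slightly tilted prior:", round(float(np.max(np.abs(mix_tilted - mix_uniform))), 3))
```

Both printed numbers come out small: the first is the “looks continuous despite underlying discontinuities” point, and the second is the “changing the math a tiny bit only changes the bottom-line result a tiny bit” point in miniature.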
Totally agree that the train example is a clear elucidation. I’ve used it before in other contexts when trying to explain EV-style reasoning more directly.
[1] Obviously IVT doesn’t hold for all discontinuous functions. But IVT-style intuitions still hold up for reasons like the ones you illustrate, most of the time.
I think this sort of assumes that terminal-ish goals are developed earlier and are thus more stable, while instrumental-ish goals are developed later and are more subject to change.
I think this may or may not be true on the individual level, but it’s probably false on the ecological level.
Competitive pressures shape many instrumental-ish goals to be convergent, whereas terminal-ish goals have more free parameters.