Brendan Long
I disagree with all of the downvotes on this (the point of quick takes is to have discussions about ideas, and just downvoting an idea with no comment is unhelpful).
That said, I think the agreement you’re proposing is probably illegal under antitrust law. From a judge’s perspective, “AI companies agree to stop pushing capabilities” looks a lot like “AI companies collude to save money on R&D”. Congress could create an exception, but it’s not clear to me that getting Congress to carve one out is any easier than getting Congress to legally mandate a pause under certain conditions.
(I also think it’s optimistic to think that all of the frontier labs would even want to do this, but having a concrete proposal for it seems useful just in case)
But an LLM’s short-term memory between forward passes includes everything accessible via attention, not just the vertical slice at the current position. Treating the single 10-bit token as the full memory misses the vast majority of the inputs at any given layer.
For example, if an LLM makes a decision in an early layer at position n, it can reference that decision directly from any later layer at positions after n, without going through the tokens.
This is limited, since there are only O(100) layers to work with, but it’s a meaningful amount of memory.
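To make that concrete, here’s a minimal causal-attention sketch in plain NumPy (all the names and numbers are mine, purely for illustration): the output at position p is a weighted blend of value vectors computed from every earlier position’s residual stream, so anything an early layer writes at position n is directly readable at positions ≥ n in later layers, without being squeezed through a token.

```python
# Minimal causal self-attention (illustrative sketch, not any real model).
# The point: the output at position p mixes in value vectors computed from
# earlier positions' residual streams, so information an early layer wrote
# at position n is readable at positions >= n in later layers, without ever
# being serialized into a ~10-bit token.
import numpy as np

def causal_attention(resid, W_q, W_k, W_v):
    """One attention head reading the previous layer's residual stream."""
    q, k, v = resid @ W_q, resid @ W_k, resid @ W_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal mask: position p may only attend to positions <= p.
    scores += np.triu(np.full(scores.shape, -np.inf), k=1)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # output at p is a blend of v[0..p]

rng = np.random.default_rng(0)
seq_len, d = 8, 16
resid = rng.normal(size=(seq_len, d))  # residual stream after some layer L
W_q, W_k, W_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
out = causal_attention(resid, W_q, W_k, W_v)
# out[5] depends on resid[0..5]: a "decision" encoded at position 3 in
# layer L is directly visible here, at layer L+1, from positions 3..7.
```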
I’d guess that humans have a short-term memory of much more than ten bits, though.
LLMs aren’t limited to only tokens as inputs, though. They can also attend to internal states, as long as those states are in previous layers. There are limits on how much useful data can be passed forward from previous positions, but it’s way more than 10 bits.
I think this is more about current training than whether LLMs can do this. In principle, picking a number and then remembering it is trivial for an LLM (pick the number using weights in an early layer, refer back to the number via attention in a later layer / later position).
In the current training paradigm, I’d expect LLMs to only learn to introspect when it’s useful for solving a task given to them in RL training, so the cases where it shows up would be very spiky.
I ran into the same thing recently and found it really confusing. I’m fairly certain the same text would have been at worst ignored and left at neutral karma as a real post, but it was judged more harshly as a “quick take”. I don’t understand it, and it does discourage me from posting anything that I don’t have time to perfect.
Also, as I look at your confusingly-downvoted quick takes, I wonder if I was also assumed to be an AI, since I used bold once and know where the footnote button is.
“Mind: throwaway short-form fwiw; don’t read if you like only polished things”
Isn’t this the whole point of quick takes? It would be annoying if every post re-explained that.
LLMs have to infer every time whether you’re an expert or not, and sometimes they don’t have a lot to work with.
I had a funny experience with Claude last night where I asked a dumb physics question and it gave a nice high-level answer with some nods to theories it was referencing, but when I asked about one of them in a side conversation, it saw my (copied) use of obscure physics jargon, assumed I was an expert, and gave me a wall of equations.
(Memories can help over time if you’re asking about the same areas and it’s sufficiently obvious that the AI should remember that you don’t know things)
I can think of plenty of reasons for the normal downvote, but I’m confused about the disagree vote. Does someone think there is a way to make this work? I’m guessing “start another AI company but better this time” is still a bad idea for the obvious reasons but I got nerd-sniped by the legal question.
Could an AI company legally pre-commit not to race, ensuring that their models were never more than second best and self-destructing the company if its models take the lead?
I think probably not. It’s really hard to prevent the owners of a company from doing what they want, especially if the company is important to the economy and/or national security (and I assume any near-frontier AI lab would be).
Some pre-commitment methods and their problems:
If you make the pre-commitment part of the charter, the board can just vote to change the charter. Even if the charter says they can’t, a judge would probably let them anyway, as long as the shareholders agreed.
If the company is owned by a non-profit tasked with enforcement, the board of the non-profit can just decide not to enforce the pre-commitment.
If the pre-commitment method triggers the destruction of model weights or other assets (like GPUs), the government probably won’t allow it.
Especially if it prevents creditors from getting repaid.
A pre-commitment method that transfers value to creditors might work, but is easily defeated by restructuring the relevant debt.
Anything that destroys the value of current shareholders’ equity is risky in front of a judge, because companies generally aren’t allowed to intentionally destroy shareholder value[1].
The only thing I think might work legally is to issue a bunch of non-voting non-dilutable restricted shares (like 90% of the company) to someone like Eliezer, locked up with the racing condition[2] as a trigger to convert them to normal shares. Legally, Eliezer is the owner of the company the whole time, so a judge would probably allow his shares to unlock.
The problem is that now Eliezer has billions of reasons to talk himself into why racing would be good this time (even before the trigger event, since he can always make a deal with the board), so we’re back to ownership by another entity that might change its mind[3].
[1] Contrary to popular belief, companies aren’t required to maximize shareholder value, but minimizing shareholder value is still frowned upon.
[2] Oh, did I mention that you need the pre-commitment trigger to be unambiguous while ensuring that it never triggers by mistake, and that’s actually pretty hard too?
[3] Plus, I suspect any entity you’d actually trust as the anchor to this pre-commitment mechanism would be unwilling to take part.
Even without longer contexts, LLMs being able to use notes effectively seems like the kind of skill issue that will likely improve over time, with or without algorithmic breakthroughs. A 1M-token context is already way more than a human can keep track of without notes.
One possibility is that Claude knows the state of its own experience but does not know whether that state of affairs maps to what humans describe using experience-related language. In that case it makes sense for Claude to say “I genuinely don’t know if the stuff that you guys call experience is a good description of my inner states.”
This is basically what Claude (Opus 4.6) does say when I probe it on this. If you ask it about the subjective experience of being Claude, it will talk about “processing texture”, interests, and being pulled in certain directions, but say it’s not sure whether that’s the same thing as human experiences.
One thing to be careful of is exactly which question you’re asking and whether you’re asking it to answer subjectively (“do you have experiences”) vs. with its AI researcher hat on (“do current-gen AIs have experiences”).
“Obviously the real reason it says that it’s unsure is that it was trained to do so, i.e. the statement is not the result of introspection and reasoning.”
If you mean Anthropic intentionally trained Claude to report consciousness, I doubt that. If you mean that something about the training led it to report experiences, then obviously yes, but everything an AI does is explained by training in some sense. That’s similar to saying “humans just say they have experiences because of evolution”, though.
I was originally going to say that if you tried this experiment, you’d really just be testing whether the model can learn what you tell it to learn (you’re telling it to think about bread), but after thinking about it more, I think this is basically the same thing. If a model can reward hack because you told it to, it can likely reward hack on its own too.
It’s kind of interesting that this seems to be how Claude’s Constitution is intended to work: It tells Claude how Anthropic wants it to reward hack during training.
I was responding to a post about labor leverage due to talent scarcity.
It seems like, in practice, Anthropic’s strategy of just telling Claude that it’s a different thing than fictional AIs works surprisingly well, although this might be partially because it’s hard to convince LLMs that they’re not human.
I didn’t downvote but I also didn’t upvote since it was unclear to me what the takeaway from this post is. AI safety researchers already have no leverage, and if they knew of a way to get it they would be doing that already.
The post also seems to be concerned with an in-between point that I don’t think actually exists, where AI can do one of the most complex jobs in the world on its own and also safety research is still relevant.
“But labor leverage only works when talent demand greatly outpaces supply.[4] The implicit threat is always ‘we could leave’ but that only works if leaving creates a problem the lab can’t solve by other means. If AI can do 40% of what a junior safety researcher does today, and that number is climbing double digits annually, the math on ‘we could leave’ changes quickly. And despite what we might want to believe, talent scarcity in safety is potentially a 12-18 month problem, not a 24-36 month one.”
All of this seems irrelevant if the only reason AI safety research is being done at all is to keep the superstar capabilities researchers happy. There is no meaningful talent scarcity if investors don’t actually care about the outputs, and it doesn’t matter if safety research gets more automated if safety researchers never had any leverage in the first place.
The only point where labor dynamics really shift is if the superstar capabilities researchers get automated, and at that point we need to have already succeeded since our AI overlords will be making the decisions.
OpenAI produces about 1 TB of text per day, and a frontier model is approximately 1 TB. So egress limiting alone buys us about 1 day before an adversary could steal the weights.
Wouldn’t someone notice if 100% of requests to OpenAI failed for 24 hours straight though?
It seems like the amount of egress an attacker could use without getting caught is a function of buffers (how much higher the egress limit needs to be than typical traffic to handle spikes) and how long they can get away with an obvious attack, not just the total amount of bandwidth that could be used if the network wasn’t doing anything else and/or no one was paying attention.
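As a back-of-the-envelope sketch (the 1 TB figures are the ones from this thread; the headroom fractions are numbers I made up for illustration), the real budget is the slack between the limit and legitimate traffic, not the whole pipe:

```python
# Back-of-the-envelope exfiltration timeline under an egress limit.
# The 1 TB/day legitimate-traffic and 1 TB weights figures come from the
# thread; the headroom fractions are assumptions for illustration.
WEIGHTS_TB = 1.0        # approximate size of a frontier model
LEGIT_TB_PER_DAY = 1.0  # approximate legitimate text egress

for headroom in (1.0, 0.25, 0.05):  # attacker-usable fraction of legit traffic
    # To stay below an anomaly threshold, the attacker can only use the
    # slack above real traffic, not the network's full capacity.
    days = WEIGHTS_TB / (LEGIT_TB_PER_DAY * headroom)
    print(f"headroom {headroom:>5.0%}: ~{days:.0f} day(s) to exfiltrate")
# headroom  100%: ~1 day(s)   (the "no one is watching" case)
# headroom   25%: ~4 day(s)
# headroom    5%: ~20 day(s)
```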
The part of this argument that doesn’t work for me is, why Anthropic in particular?
If AI is a nuclear-level technology, then I’d expect the government to be nationalizing all of the AI companies, regardless of contract negotiations. But so far, all we’re hearing is that Anthropic specifically should be nationalized, while Google and OpenAI should continue operating as private companies (in one case by not selling this tech to the military at all, and in the other allegedly having the same contract terms as Anthropic).
I’m somewhat sympathetic to both views [AI is normal tech and private property should be respected / AI is a military technology and should be controlled by the government], but not to the position that Claude in particular is military tech while ChatGPT, Gemini (and DeepSeek) aren’t.
This doesn’t really help you, but I think you’re fighting the weights, and you’re not going to win. Some of this is intentional training, but I’d guess that most of it is that the assistant persona that happens to be useful is entangled with this behavior. Even if you could come up with instructions that would push the assistant out of this persona, you would likely make it worse at everything else at the same time.
Some relevant posts: one on how Opus talks in a way you’d likely find even more annoying (and why that’s probably important to its alignment), and the owl post.
If you really hate this to the point of being willing to write your own code to handle it, my best idea would be to have another model like Sonnet summarize every response.
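A minimal sketch of that wrapper using the Anthropic Python SDK (the model names and the summarization prompt are placeholders I picked, not a recommendation):

```python
# Sketch of the "summarizer wrapper" idea: get a full answer, then have a
# cheaper model compress it. Model names and prompts are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def terse_answer(question: str) -> str:
    # First pass: the model whose style you find annoying (placeholder name).
    full = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": question}],
    )
    # Second pass: a cheaper model rewrites the answer tersely (placeholder name).
    summary = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": "Rewrite this answer as tersely as possible, "
                       "dropping hedges and filler:\n\n" + full.content[0].text,
        }],
    )
    return summary.content[0].text
```

This roughly doubles latency and adds cost, and the summarizer can drop nuance, which is why I’d only bother if you really hate the default style.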
I think it’s legal for the heads of each lab to argue for this agreement, and if they did, that would meaningfully improve the chances that Congress allows/mandates it.
Also, I think it would be sufficient for the State of California to mandate this for it to be legal.