Adam Morris

Karma: 179

I’m a computational cognitive scientist studying decision-making and introspection in humans and LLMs. I earned my PhD at Harvard in 2022, and have been a postdoc at Princeton since then.

Adam Morris 31 Mar 2026 23:55 UTC
8 points
7
on: Experiments With Opus 4.6’s Fiction
I’ll say that I genuinely found this funny (although I’ve been told that I’m easy to please humor-wise 🙂)

EDIT: Okay now that I’ve finished it, I also found it moving! Again, I’m probably easy to please. It’s clearly not as good as the rest of your fiction, but it’s genuinely a better story than I could write.

Adam Morris 29 Nov 2025 2:51 UTC
4 points
1
in reply to: derek shiller’s comment on: Tests of LLM introspection need to rule out causal bypassing
Fascinating point, I think you’re right. Just to repeat your point in my own words: The problem is that, if the activation steering makes the model want to talk about the injected concept, and if it knows that saying “yes, I received an injection” will give it a chance to talk about the concept later in the response, then it will say “yes” in order to talk about the concept later (even if it actually had no metacognitive awareness of the injection). Is that what you’re saying?

Tests of LLM introspection need to rule out causal bypassing

Adam Morris and Dillon Plunkett

28 Nov 2025 17:42 UTC

51 points

6 comments · 4 min read · LW link

Adam Morris 15 Nov 2025 5:29 UTC
1 point
0
in reply to: habryka’s comment on: GradientDissenter’s Shortform
Is there reason to think that Bores or Wiener are not trustworthy or lack integrity? Genuine question, asking because it could affect my donation choices. (I couldn’t tell from your post if there were, e.g., rumors floating around about them, or if you were just using this as an example of a key question that you thought was missed in Neyman’s analysis.)

Self-interpretability: LLMs can describe complex internal processes that drive their decisions

Adam Morris and Dillon Plunkett

14 Nov 2025 0:18 UTC

12 points

0 comments · 4 min read · LW link

Adam Morris 6 Nov 2025 18:41 UTC
1 point
0
in reply to: Zach Stein-Perlman’s comment on: Eric Neyman’s Shortform
Got it. Okay thanks!

Adam Morris 6 Nov 2025 17:19 UTC
3 points
0
in reply to: Eric Neyman’s comment on: Eric Neyman’s Shortform
Earnest question: For both this & donating to Alex Bores, does it matter whether someone donates sooner rather than a couple months from now? For practical reasons, it will be easier for me to donate in 2026--but if it will have a substantially bigger impact now, then I want to do it sooner.

Adam Morris 29 Jul 2025 13:36 UTC
8 points
8
in reply to: NunoSempere’s comment on: NunoSempere’s Shortform
One small suggestion: When I read this, I genuinely couldn’t tell whether “Gray swans: None detected this week” was a joke (like you were pretending to look for literal gray/black swans), or if it meant something serious. After reading your website, my guess is that it’s meant to be serious—but I’m still not sure, and if it is serious then I don’t know what it means. (My understanding is that “black swan” means an unexpected, highly improbable / out of distribution event, so it wasn’t clear to me what it would mean in this context to be generally looking for global gray/black swans.) Might be worth clarifying or finding other terminology, if you want readers like me to quickly grok what you mean.

Adam Morris 1 Jun 2025 16:06 UTC
4 points
0
in reply to: Mark Henry’s comment on: johnswentworth’s Shortform
We haven’t had one yet! But we only did it ~3 times. Obviously people are more careful than they’d normally be while dancing on the slippery floor.

Adam Morris 30 May 2025 13:30 UTC
3 points
0
in reply to: johnswentworth’s comment on: johnswentworth’s Shortform
I’ll add to this list: If you have a kitchen with a tile floor, have everyone take their shoes off, pour soap and water on the floor, and turn it into a slippery sliding dance party. It’s so fun. (My friends and I used to call it “soap kitchen” and it was the highlight of our house parties.)

Printable book of some rationalist creative writing (from Scott A. & Eliezer)

Adam Morris 23 Dec 2024 15:44 UTC

10 points

0 comments · 1 min read · LW link

Adam Morris 3 Oct 2024 14:01 UTC
3 points
1
in reply to: aphyer’s comment on: Three Subtle Examples of Data Leakage
I see, that makes sense. Thank you!

Adam Morris 2 Oct 2024 14:41 UTC
25 points
3
in reply to: aphyer’s comment on: Three Subtle Examples of Data Leakage
Can you help me see this point? Why not correct it in the dataset? (Assuming that the dataset hasn’t yet been used to train any models)

Adam Morris 13 Oct 2023 18:12 UTC
1 point
0
on: Print Books of Scott Alexander’s Writing
I’m long overdue here, but thank you so much for doing this!! I’ve been wanting this for a long time and just discovered this post :)

Adam Morris 11 May 2023 1:55 UTC
1 point
0
in reply to: Elizabeth’s comment on: Long Covid Risks: 2023 Update
see my comment above—I (ironically) meant aphasia

Adam Morris 11 May 2023 1:55 UTC
1 point
0
in reply to: lillybaeum’s comment on: Long Covid Risks: 2023 Update
hahaha I actually also meant aphasia :P

Adam Morris 8 May 2023 13:51 UTC
3 points
0
in reply to: lillybaeum’s comment on: Long Covid Risks: 2023 Update
This is ~even more~ anecdotal, but me and several of my friends have noticed increased anosmia since the pandemic, but critically starting before any of us got covid (and including friends who never got it). We conjectured that it could be from some combination of very high stress levels for a long time + social isolation? Just to add some data points to the mix.

[Question] How are you currently modeling COVID contagiousness?

Adam Morris 26 Jan 2023 4:46 UTC

2 points

2 comments · 1 min read · LW link

Adam Morris 6 Jan 2022 16:09 UTC
29 points
0
in reply to: Rob Bensinger’s comment on: Animal welfare EA and personal dietary options
Pretty much all the writing I’ve read by Holocaust survivors says that this was not true, that the experience was unambiguously worse than being dead, and that the only thing that kept them going was the hope of being freed. (E.g. according to Viktor Frankl in “Man’s Search for Meaning”, all the prisoners in his camp agreed that, not only was it worse than being dead, it was so bad that any good experiences after being freed could not make up for how bad it was. Why they didn’t kill themselves is an interesting question that he explores a bit in the book.) Are there any Holocaust survivors who claim otherwise?

Adam Morris 15 Dec 2021 5:04 UTC
10 points
0
AF
in reply to: abramdemski’s comment on: There is essentially one best-validated theory of cognition.
Thanks for the thoughtful response, that perspective makes sense. I take your point that ACT-R is unique in the ways you’re describing, and that most cognitive scientists are not working on overarching models of the mind like that. I think maybe our disagreement is about how good/useful of an overarching model ACT-R is? It’s definitely not like in physics, where some overarching theories are widely accepted (e.g. the standard model) even by people working on much more narrow topics—and many of the ones that aren’t (e.g. string theory) are still widely known about and commonly taught. The situation in cog sci (in my view, and I think in many people’s views?) is much more that we don’t have an overarching model of the mind in anywhere close to the level of detail/mechanistic specificity that ACT-R posits, and that any such attempt would be premature/foolish/not useful right now. Like, I think if you polled cognitive scientists, the vast majority would disagree with the title of your post—not because they think there’s a salient alternative, but because they think that there is no theory that even comes close to meriting the title of “best-validated theory of cognition” (even if technically one theory is ahead of the others). Do you know what I mean? Of course, even if most cognitive scientists don’t believe in ACT-R in that way, that alone doesn’t mean that ACT-R is wrong. I’m curious about the evidence that Terry is talking about above. I just think the field would look really, really different if we actually had a halfway-decent paradigm/overarching model of the mind. And it’s not like ACT-R is some unknown idea that is poised to take over the field once people learn about it. Everyone knew about it in the 90s, and then it fell out of widespread use—and my prior on why that happened is that people weren’t finding it super useful. (Although like I said, I’m really curious to learn more about what Terry/other contemporary people are doing with it!)
