Dillon Plunkett

Karma: 55

Tests of LLM introspection need to rule out causal bypassing

28 Nov 2025 17:42 UTC
49 points
6 comments, 4 min read

Self-interpretability: LLMs can describe complex internal processes that drive their decisions

14 Nov 2025 0:18 UTC
12 points
0 comments, 4 min read