keith_wynroe

Karma: 323

Do Models Lie More to Other Models?

keith_wynroe, 28 May 2026 19:28 UTC
7 points
0 comments, 6 min read

Asymmetry Between Defensive and Acquisitive Instrumental Deception

keith_wynroe, 10 May 2026 12:33 UTC
17 points
1 comment, 5 min read

Finding an Error-Detection Feature in DeepSeek-R1

keith_wynroe, 24 Apr 2025 16:03 UTC
23 points
0 comments, 7 min read

Decomposing the QK circuit with Bilinear Sparse Dictionary Learning

2 Jul 2024 13:17 UTC
87 points
7 comments, 12 min read