robert mccarthy

Karma: 179

Reasoning Models Struggle to Control Their Chains of Thought

5 Mar 2026 22:37 UTC
74 points
8 comments · 3 min read · LW link

Open-weight training practices and implications for CoT monitorability

4 Nov 2025 10:49 UTC
15 points
0 comments · 9 min read · LW link

Early Signs of Steganographic Capabilities in Frontier LLMs

4 Jul 2025 16:36 UTC
33 points
5 comments · 2 min read · LW link

Can LLMs learn Steganographic Reasoning via RL?

11 Apr 2025 16:33 UTC
30 points
3 comments · 6 min read · LW link

[Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs

25 Sep 2024 14:52 UTC
37 points
2 comments · 4 min read · LW link
(arxiv.org)