robert mccarthy

Karma: 109

Open-weight training practices and implications for CoT monitorability

4 Nov 2025 10:49 UTC
15 points
0 comments · 9 min read · LW link

Early Signs of Steganographic Capabilities in Frontier LLMs

4 Jul 2025 16:36 UTC
31 points
5 comments · 2 min read · LW link

Can LLMs learn Steganographic Reasoning via RL?

11 Apr 2025 16:33 UTC
29 points
3 comments · 6 min read · LW link

[Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs

25 Sep 2024 14:52 UTC
37 points
2 comments · 4 min read · LW link
(arxiv.org)