Bartosz Cywiński

Karma: 102

MATS 8.0 scholar with Arthur Conmy and Sam Marks

Current LLMs seem to rarely detect CoT tampering

19 Nov 2025 15:27 UTC
50 points
0 comments · 20 min read · LW link

Eliciting secret knowledge from language models

2 Oct 2025 20:57 UTC
68 points
3 comments · 2 min read · LW link
(arxiv.org)