lukemarks

Karma: 716

[Paper] Output Supervision Can Obfuscate the CoT

20 Nov 2025 22:41 UTC
75 points
2 comments · 5 min read · LW link
(arxiv.org)

[Research Note] Optimizing The Final Output Can Obfuscate CoT

30 Jul 2025 21:26 UTC
199 points
23 comments · 6 min read · LW link

Steering Language Models in Multiple Directions Simultaneously

2 May 2025 15:27 UTC
18 points
0 comments · 7 min read · LW link

lukemarks’s Shortform

lukemarks · 2 Jul 2024 6:56 UTC
4 points
19 comments · 1 min read · LW link