Wuschel Schulz

Karma: 368

A Concrete Roadmap towards Safety Cases based on Chain-of-Thought Monitoring

Wuschel Schulz · 23 Oct 2025 11:34 UTC
37 points
5 comments · 4 min read · LW link
(arxiv.org)

[Paper] Automated Feature Labeling with Token-Space Gradient Descent

Wuschel Schulz · 30 Apr 2025 10:22 UTC
4 points
0 comments · 4 min read · LW link

A short ‘derivation’ of Watanabe’s Free Energy Formula

Wuschel Schulz · 29 Jan 2024 23:41 UTC
13 points
6 comments · 7 min read · LW link

Steering Llama-2 with contrastive activation additions

2 Jan 2024 0:47 UTC
125 points
29 comments · 8 min read · LW link
(arxiv.org)