Wuschel Schulz

Karma: 290

A short ‘derivation’ of Watanabe’s Free Energy Formula

Wuschel Schulz · 29 Jan 2024 23:41 UTC
13 points
6 comments · 7 min read · LW link

Steering Llama-2 with contrastive activation additions

2 Jan 2024 0:47 UTC
123 points
29 comments · 8 min read · LW link
(arxiv.org)

Simulators Increase the Likelihood of Alignment by Default

Wuschel Schulz · 30 Apr 2023 16:32 UTC
13 points
1 comment · 5 min read · LW link

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz · 8 Dec 2022 15:19 UTC
29 points
5 comments · 4 min read · LW link

A caveat to the Orthogonality Thesis

Wuschel Schulz · 9 Nov 2022 15:06 UTC
38 points
10 comments · 2 min read · LW link