Nevan Wichers

Karma: 178

Model Spec Midtraining: Improving How Alignment Training Generalizes

5 May 2026 21:55 UTC
71 points
7 comments · 7 min read
(alignment.anthropic.com)

Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior

8 Oct 2025 22:02 UTC
176 points
37 comments · 2 min read

Visualizing neural network planning

9 May 2024 6:40 UTC
4 points
0 comments · 5 min read

A Variance Indifferent Maximizer Alternative

13 Feb 2020 9:06 UTC
7 points
1 comment · 4 min read