Ben Millwood

Karma: 122

White Box Control at UK AISI—Update on Sandbagging Investigations

10 Jul 2025 13:37 UTC
77 points
10 comments · 18 min read

[Question] Should we exclude alignment research from LLM training datasets?

Ben Millwood · 18 Jul 2024 10:27 UTC
3 points
5 comments · 1 min read

Keeping content out of LLM training datasets

Ben Millwood · 18 Jul 2024 10:27 UTC
4 points
0 comments · 5 min read