SciHamster comments on
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
SciHamster
28 Feb 2025 22:39 UTC
3
points
2
fwiw, the fact that somebody can just finetune the model is already indicative of a serious problem.