Cole Wyeth comments on
Finding “misaligned persona” features in open-weight models
Cole Wyeth
9 Sep 2025 14:40 UTC
I’ve seen one too many cartoons of a nice angelic “aligned” AGI.
We don’t know how to build that.