Lorec comments on: AI models inherently alter “human values.” So, alignment-based AI safety approaches must better account for value drift
Lorec
14 Jan 2025 16:34 UTC
Are you familiar with Yudkowsky’s/Miles’s/Christiano’s AI Corrigibility concept?