This is a sequence investigating the feasibility of one approach to AI alignment: value learning.
Preface to the sequence on value learning
1. Ambitious Value Learning
What is ambitious value learning?
The easy goal inference problem is still hard
Humans can be assigned any values whatsoever…
Latent Variables and Model Mis-Specification
Model Mis-specification and Inverse Reinforcement Learning
Future directions for ambitious value learning
2. Goals vs Utility Functions
Ambitious value learning aims to give the AI the correct utility function to avoid catastrophe. Given how difficult this is, we revisit the arguments for using utility functions in the first place.