Viliam comments on wingspan’s Shortform

Viliam 9 Oct 2025 8:46 UTC
2 points
we might come up with a formal definition to what an “aligned agent” means
I believe this is the difficult part. More precisely, the difficulty is describing what the agent is aligned with. We can start by treating human CEV as a black box and specifying mathematically what it means to be aligned with some abstract f(x). But then the problem becomes specifying f(x).