Really, what makes me go to the meta-level like this is pessimism about the more direct approach. Directly trying to instill human values, rather than first training in a meta-level understanding of that task, doesn’t seem like a very correctible approach.
True, but I’m also uncertain about the relative difficulty of relatively novel and exotic value-spreads like “I value doing the right thing by humans, where I’m uncertain about the referent of humans”, compared to “People should have lots of resources and be able to spend them freely and wisely in pursuit of their own purposes” (the latter being values that at least I do in fact have).
True, but I’m also uncertain about the relative difficulty of relatively novel and exotic value-spreads like “I value doing the right thing by humans, where I’m uncertain about the referent of humans”, compared to “People should have lots of resources and be able to spend them freely and wisely in pursuit of their own purposes” (the latter being values that at least I do in fact have).