In some sense that’s a direction I might be moving in with my thinking, but there is still some thing that humans identify as values that they care about, so I expect there to be some real phenomenon going on that needs to be considered to get good outcomes, since I expect the default remains a bad outcome if we don’t pay attention to whatever it is that makes humans care about stuff. I expect most work today on value learning is not going to get us where we want to go because it’s working with the wrong abstractions, and my goal in this work is to dissolve those abstractions to find better ones for our long-term purposes.
In some sense that’s a direction I might be moving in with my thinking, but there is still some thing that humans identify as values that they care about, so I expect there to be some real phenomenon going on that needs to be considered to get good outcomes, since I expect the default remains a bad outcome if we don’t pay attention to whatever it is that makes humans care about stuff. I expect most work today on value learning is not going to get us where we want to go because it’s working with the wrong abstractions, and my goal in this work is to dissolve those abstractions to find better ones for our long-term purposes.