I’m with Steve on the idea that there’s a difference between broad human preferences (something like common sense?) and particular and exact human preferences (what would be needed for ambitious value learning).
Still, you (Stuart) made me realize that I hadn’t thought explicitly about this need for broad human preferences when splitting the problem (first be able to align at all, then point at what we want). It’s implicit, though, because I don’t care about being able to align to “anything”, just to the sorts of things humans might want.