but if human intelligence and reasoning can be picked up from training, why would one expect values to be any different? the orthogonality thesis doesn’t make much sense to me either. my guess is that certain values are richer/more meaningful, and that more intelligent minds tend to be drawn to them.
I think your first sentence here is correct, but not the last. Like, you can have smart people with bad motivations; super-smart octopuses might have different feelings about, idk, letting mothers die to care for their young, because that's what they evolved from.
So I don’t think there’s any intrinsic reason to expect AIs to have good motivations apart from the data they’re trained on; the question is whether that data gives you good reason to expect particular motivations or not.