but if human intelligence and reasoning can be picked up from training, why would one expect values to be any different? the orthogonality thesis doesn’t make much sense to me either. my guess is that certain values are richer/more meaningful, and that more intelligent minds tend to be drawn to them.
and you can sort of see this with ASPD and NPD: both are correlated with lower non-verbal intelligence, and the correlation is significantly stronger for ASPD.
and gifted children tend to have a much harder time with the problem of evil than less gifted children do! and if you look at domestication in animals, dogs and cats simultaneously evolved to be less aggressive and more intelligent.
> but if human intelligence and reasoning can be picked up from training, why would one expect values to be any different? the orthogonality thesis doesn’t make much sense to me either. my guess is that certain values are richer/more meaningful, and that more intelligent minds tend to be drawn to them.
I think your first sentence here is correct, but not the last. Like, you can have smart people with bad motivations; super-smart octopuses might have different feelings about, idk, letting mothers die to care for their young, because that’s what they evolved from.
So I don’t think there’s any intrinsic reason to expect AIs to have good motivations apart from the data they’re trained on; the question is whether such data gives you good reason to think they have various motivations or not.
> my guess is that certain values are richer/more meaningful, and that more intelligent minds tend to be drawn to them.
I’m sympathetic to your position on value alignment vs intent alignment, but this feels very handwavy. In what sense are they richer (and what does “more meaningful” actually mean, concretely), and why would that cause intelligent minds to be drawn to them?
(Loose analogies to correlations you’ve observed in biological intelligences, which have their own specific origin stories, don’t seem like good evidence to me. And we have plenty of existence proofs for ‘smart + evil’, so there’s a limit to how far this line of argument could take us even in the best case.)
I think if one could formulate concepts like peace and wellbeing mathematically, and show that the physical laws of the universe imply that total wellbeing eventually grows monotonically, then that could show that certain values are richer/“better” than others.
If you care about coherence, then it seems like a universe full of aligned minds maximizes wellbeing while still being coherent. (Coherence matters here because if you didn’t care about it, you could just make every mind infinitely joyful independently of the universe around it, which isn’t coherent.)
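To make it concrete, here is a minimal sketch of what such a formalization might look like, purely as an illustration; the symbols $w_i$, $W$, $f_i$, and $s$ are my own hypothetical notation, not anything established:

```latex
% Hypothetical setup: w_i(t) is the wellbeing of mind i at time t,
% and W(t) is the total wellbeing in the universe.
W(t) = \sum_i w_i(t)

% The claim in the comment above would then be a monotonicity
% condition implied by physical law:
\frac{dW}{dt} \ge 0 \quad \text{for all sufficiently large } t

% The coherence constraint: each mind's wellbeing must be a function
% of the actual state of the universe s(t), not freely assignable,
% which rules out "every mind infinitely joyful regardless of the world":
w_i(t) = f_i\big(s(t)\big)
```

The coherence constraint is what does the work: without the requirement $w_i(t) = f_i(s(t))$, one could trivially maximize $W$ by decoupling each mind’s joy from reality, which is exactly the incoherent case the parenthetical rules out.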