Do you have particular other hypotheses for what’s going on here that’s different from “evolution has selected for us having a pretty strong drive to do X in situation Y”? Do you dispute that most humans tend towards X in Y situations?
I do have other hypotheses, but this margin is too small to detail them. However, even if I didn’t have other hypotheses, it’s sometimes important and healthy to maintain uncertainty even when you can’t come up with concrete alternative hypotheses for your observations.
The fact that “Evolution hardcoded it” is the first explanation we can think of, and that no other explanations are obvious, only slightly increases my credence in “Evolution hardcoded it”, because people not having found other theories yet is pretty mild evidence, all things considered.
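To make the “mild evidence” point concrete, here’s a hypothetical Bayesian back-of-the-envelope sketch in Python. The prior and likelihood ratio are made-up numbers for illustration, not claims about the actual strength of the evidence:

```python
def posterior(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Suppose we start at 30% credence in "evolution hardcoded it".
# "Nobody has proposed another obvious explanation yet" is weak evidence;
# say it has a likelihood ratio of 1.5 in favor of hardcoding.
print(posterior(0.30, 1.5))  # ~0.39: a modest shift, nowhere near certainty
```

Even generously scoring the absence of rival theories as 1.5:1 evidence, the credence moves from 30% to about 39%, which is the sense in which "we can't think of anything else" only slightly favors the hardcoding story.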
“Evolution has selected for us having a pretty strong drive to do X in situation Y” is not actually a mechanistic explanation at all.
When weighing “is it hardcoded, or not?”, we are considering a modern genome, which specifies the human brain. Given the genome, evolution’s influence on human values is screened off. The question is: What solution did evolution find?
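The screening-off claim is a statement of conditional independence in a causal chain: once you condition on the genome, the ancestral selection history carries no further information about the resulting values. A minimal toy simulation (with made-up probabilities, purely to illustrate the structure) might look like:

```python
import random

random.seed(0)

# Toy causal chain: selection pressure -> genome -> values.
# Each variable is binary; "values" depends only on "genome".
def sample():
    pressure = random.random() < 0.5                      # ancestral selection pressure
    genome = random.random() < (0.9 if pressure else 0.2)  # genome depends on pressure
    values = random.random() < (0.8 if genome else 0.1)    # values depend only on genome
    return pressure, genome, values

samples = [sample() for _ in range(100_000)]

def p_values_given(genome, pressure=None):
    """Empirical P(values | genome [, pressure])."""
    rows = [v for p, g, v in samples
            if g == genome and (pressure is None or p == pressure)]
    return sum(rows) / len(rows)

# Once we condition on the genome, knowing the pressure adds no information:
print(p_values_given(True, pressure=True))   # ~0.8
print(p_values_given(True, pressure=False))  # ~0.8
print(p_values_given(True))                  # ~0.8
```

The three estimates agree (up to sampling noise), which is what “evolution’s influence is screened off by the genome” means formally; the open question is then the middle link, i.e. what the genome actually specifies.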
Also, you might have considered this, but I want to make the point here: Evolution is not directly selecting over high-level cognitive properties, like the motivation to settle down (X) at age twenty-three (situation Y). Evolution selects over genes, and mutations to those genes only eventually, indirectly affect the adult brain’s high-level properties.
I agree it’s useful to distinguish inference from observation but also think there are plenty of times the inference is pretty reasonable and accepted, and this feels kinda isolated demand for rigor-y.
I see why you think that. The inference is unfortunately accepted, but it’s not reasonable; I think it is in fact quite wrong and contradictory. See Human values & biases are inaccessible to the genome for the in-depth argument.
Also, I think people are too used to saying whatever they want about hardcoding, since it’s hard to get caught out by reality because it’s hard to make the brain transparent and thereby get direct evidence on the question. Human value formation is an incredibly important topic where we really should exercise care, and, to be frank, I think most people (myself in 2021 included) do not exercise much care. (This is not meant as social condemnation of you in particular, to be clear.)
I have some sense of you having a general contrarian take on a bunch of stuff closely related to this topic
Yup. Here’s another hot take:
We observe that most people want to punish people who fuck them over. Therefore, evolution directly hardcoded punishment impulses.
We observe that GPT-3 sucks at parity checking. Therefore, the designers directly hardcoded machinery to ensure it sucks at parity checking.
We observe an AI seeking power. Therefore, the designers directly hardcoded the AI to value power.
I think these arguments are comparably locally valid. That is to say, not very.
“can you argue more clearly about what you think is going on with human tendency towards X in Y situation?”
I will eventually, but I maintain that it’s valid to say “I don’t know what’s happening, but we can’t be confident in hardcoding given the available evidence.”
I think I did mean something less specific by “hardcoded” than you interpreted me to mean (your post notes that hardwired responses to sense-stimuli are more plausible than hardwired responses to abstract concepts, and I didn’t have a strong take that any of the stuff I talk about in this post required abstract concepts).
But I also indeed hadn’t reflected on the plausibility of various mechanisms here at all, and you’ve given me a lot of food for thought. I’ll probably weigh in with more thoughts on your other post.
First of all, sorry for picking out a few remarks which weren’t meant super strongly.
That said,
See Human values & biases are inaccessible to the genome.
Thanks. This all does seem reasonable.