I think a more correct picture is that it’s useful to have programmable behavior, and then the programmable system suddenly becomes a Turing-complete weird machine. Some of the resulting programs are terminal-goal-oriented, and those are favored by selection pressures: terminal goals are self-preserving.
Humans in their native environment have programmable behavior in the form of social regulation, information exchange, and communicated instructions; if you add a sufficient amount of computational power to this system, you can get a very wide spectrum of behaviors.
I think this is the general picture of inner misalignment.
That seems like part of the picture, but far from all of it. Manufactured stone tools have been around for well over 2 million years. That’s the sort of thing you do when you already have a significant amount of “hold a weeks-long goal in mind long and strong enough that you put in a couple of days’ effort towards it” (or something like that). Another example is Richard Alexander’s hypothesis: warfare --> strong pressure toward cognitive mechanisms for group-goal-construction. Neither of these are mainly about programmability (though the latter is maybe somewhat). I don’t think we see “random self-preserving terminal goals installed exogenously”; I think we see goals being self-constructed and then flung into long-termness.