Shard theory values are terminal for the being that has them — but if that being is evolved, they’re almost always values that would be instrumental if you had the terminal value of maximizing evolutionary fitness in the creature’s native environment. So from “evolution’s point of view”, they’re “instrumental values of the terminal value: maximize the creature’s evolutionary fitness”.
Some concrete examples: maintain blood levels of water, salt, glucose, and a few other basics within certain bounds. Keep body temperature within a narrow band, without excessive metabolic cost. Get enough sleep. Have flowers around. Have some trees around (preferably climbable ones) but not too many. Get to see healthy young adult members of the (normally opposite) gender looking happy and not very clad. Have other members of the tribe seem to like you and be happy to help you when you need it. There’s a whole long list.
Biologically, almost all of this seems to be implemented in the older parts of the brain: the brainstem and all the little fiddly bits, i.e. the parts that look like they’re probably a lot of smallish custom circuits whose specifics are under a lot of genetic control. Human values are moderately complex, but one description of them fits in the human genome: roughly 3 billion base pairs, which is under 1GB of DNA at 2 bits per base pair.
(I believe the etymology of “shard theory” is from evolution’s godshatter — a term originally taken from Vernor Vinge and repurposed, I think by MIRI.)
Thanks, and yes evolution is the source of many values for sure...I think the terminal vs instrumental question leads in interesting directions. Please let me know how this sits with you!
Though I am an evolved being, none of your examples seem to be terminal values for me, the whole organism. Certainly there are many systems within me, and perhaps we could describe them as having their own terminal values, which in part come from evolution as you describe. My metabolic system’s terminal value surely has a lot to do with regulating glucose. My reproductive system’s terminal value likely involves sex/procreation. (But maybe even these can drift: when a cell becomes cancerous, it seems its terminal value changes.)
But to me as a whole, these values (to the extent that I hold them at all) are instrumental. Sure, I want homeostasis, but I want it because I want to live (another instrumental value), and I want to live because I want to be able to pursue my terminal value of happiness/flourishing. Other values that my parts exhibit (like reproduction) I, the whole, might reject even as instrumental values. Heck, I might even subvert the mechanisms afforded by my reproductive system for my own happiness/flourishing.
Also, as for my terminal value of happiness/flourishing: did that come from evolution? Did it start out as survival/reproduction and drift a bit? Or is there something special about systems like me (which are conscious of pleasure/pain/etc.) such that, just by their nature, they desire happiness/flourishing, the way 2+2=4 or the way a triangle has 3 sides? Or...other?
And lastly, does any of this port to non-evolved beings like AIs?
That’s what the people working on Shard Theory are trying to find out.
I’ve thought about this some more, and I think what you mean (leaving aside physical and homeostatic values and focusing on organism-wide values) is this: even if we define our “terminal value” as I have above, whence the basket of goods that means “happiness/flourishing” to me?
Again I think the answer is evolution plus something...some value drift (that as you say, the Shard Theory people are trying to figure out). Is there a place/post you’d recommend to get up to speed on that? The wikitag is a little light on details (although I added a sequence that was a good starting place). https://www.lesswrong.com/w/shard-theory
I’d suggest TurnTrout’s writing (Alex Turner at DeepMind), since he’s the person who first came up with the idea. Most of his posts are on LessWrong/the Alignment Forum, but they’re best organized on his own website. I’d suggest starting at https://turntrout.com/research, reading the section on Shard Theory, and following links.
He himself admits that some of his key posts often seem to get misunderstood: I think they repay careful reading and some thought.