CEV is (imo) a good concept for what we ultimately want and/or what humanity should try to achieve, but I’ve always found it hard to pithily talk about the intermediate world states to aim for now if we want CEV eventually.
I’ve heard the practical goal discussed as “a world from which we can be as sure as possible that we will achieve CEV”. Doesn’t really roll off the tongue. It would be nice to have a cleaner shorthand.
The term “viatopia” seems meant to capture the same idea: https://newsletter.forethought.org/p/viatopia
This also seems like the sort of thing Bostrom might have coined a term for fifteen years ago in some obscure paper.
I’d be interested in hearing any other terms or phrases that you think make it easier to talk about an intermediate goal state from which CEV is very likely (or as likely as possible).
The two important conversations I’d like to be able to have are “what are the features of a realistic <state>?” and “how can we achieve <state>?”, with participants sharing an understanding of what <state> refers to.
Bostrom uses “existential security” to refer to this intermediate goal state, IIRC—a state where civilization no longer faces significant risk of extinction or of lock-ins like stable totalitarianism. The phrase connotes a sort of chill, minimum-viable utopia (just stop people from engineering super-smallpox and everything else stays the same, m’kay?), but I wonder whether actual “existential security” might be essentially equivalent to locking in a very specific and as-yet-undiscovered form of governance: one conducive to suppressing certain dangerous technologies without falling into broader anti-tech stagnation, while avoiding the various dangers of totalitarianism and fanaticism, etc. https://forum.effectivealtruism.org/posts/NpYjajbCeLmjMRGvZ/human-empowerment-versus-the-longtermist-imperium
Yudkowsky might have had a term (perhaps in his Fun Theory sequence?) for a kind of intermediate utopia where humanity has covered “the basics”: existential security plus some obvious moral goods like individual people no longer dying + extreme suffering having been abolished + some basic level of intelligence enhancement for everybody + etc.
Some people talk about the “long reflection”, which is similar to the concept of viatopia, albeit with more of a “pause everything” vibe that seems less practical for a bunch of reasons.
It seems like it would be pretty useful for somebody to be thinking ahead about the detailed mechanics of different idealization processes (since such processes might not “converge”, and doing things in a slightly different way or a slightly different order might send you to very different ultimate destinations: https://joecarlsmith.com/2021/06/21/on-the-limits-of-idealized-values). That said, this is probably not super tractable until it becomes clearer which kinds of “idealization technologies” will actually exist when, and what their possible uses will be (brain-computer interfaces, nootropic drugs or genetic enhancement procedures, AI advisors, “Jhourney”-esque spiritual-attainment-assistance technologies, improved collective decision-making technologies / institutions, etc.).
CEV is not meant to depend on the state of human society. It is supposed to be derived from “human nature”, e.g. genetically determined needs, dispositions, norms and so forth, that are characteristic of our species as a whole. The quality of the extrapolation process is what matters, not the social initial conditions. You could be in “viatopia”, and if your extrapolation theory is wrong, the output will be wrong. Conversely, you could be in a severe dystopia, and so long as you have the biological facts and the extrapolation method correct, you’re supposed to arrive at the right answer.
I have previously made the related point that the outcome of CEV should not be different, whether you start with a saint or a sinner. So long as the person in question is normal Homo sapiens, that’s supposed to be enough.
Similarly, CEV is not supposed to be about identifying and reconciling all the random things that the people of the world may want at any given time. It is supposed to identify a value system or decision procedure which is the abstract kernel of how a smarter and better-informed version of the human race would want important decisions to be made, regardless of the details of circumstance.
This is, I argue, all consistent with the original intent of CEV. The problem is that neither the relevant facts defining human nature, nor the extrapolation procedure, are known or specified with any rigor. If we look at the broader realm of possible Value Extrapolation Procedures, there are definitely some “VEPs” in which the outcome depends crucially on the state of society, the individuals who are your prototypes, and/or even the whims of those individuals at the moment of extrapolation.
Furthermore, it is likely that individual genotypic variation, and also the state of culture, really can affect the outcome, even if you have identified the “right” VEP. Culture can impact human nature significantly, and so can genetic variation.
I think it’s probably for the best that the original manifesto for CEV was expressed in these idealistic terms—that it was about extrapolating a universal human nature. But if “CEV theory” is ever to get anywhere, it must be able to deal with all these concrete questions.
(For examples of CEV-like alignment proposals that include dependence on neurobiological facts, see PRISM and metaethical.ai.)
CEV, idealized reflection, and viatopia are all obvious and just a can-kicking circlejerk. Anti-realism is true, and this changes nothing. The religious undertone that there is some sort of convergent nirvana once you think hard enough is not true. Cosmopolitanism is only better than heroin-tiled rats if you assume certain axioms. You should listen to smarter, wiser people, duh. We all know this already. How is this profound when applied to normative ethics?
Agreed that the ideas are kind of obvious (from a certain rationalist perspective); nonetheless they are:
1. not widely known outside of rationalist circles, where most people might consider “utopia” to just mean some really mundane thing like “tax billionaires enough to provide subsidized Medicaid for all” rather than defeating death and achieving other assorted transhumanist treasures
2. potentially EXTREMELY important for the long-term future of civilization
In this regard they seem similar to the idea of existential risk, or the idea that AI might be a really important and pivotal technology—really really obvious in retrospect, yet underrated in broader societal discourse and potentially extremely important.
Unlike with AI & x-risk, though, I think people who talk about CEV and viatopia have so far done an unimpressive job of exploring how those philosophical ideas about the far future should be translated into relevant action today. (So many AI safety orgs and billion-dollar companies getting founded, government initiatives launched, lots of useful research and lobbying getting done—there is no similar game plan for promoting “viatopia” as far as I know!)
“The religious undertones that there is some sort of convergent nirvana once you think hard enough is not true.”—can you argue for this in a convincing and detailed way? If so, that would be exciting—you would be contributing a very important step towards making concrete progress in thinking about CEV / etc., the exact tractability problem I was just complaining about!! But if you are just asserting a personal vibe without actual evidence or detailed arguments to back it up, then I’d not baldly assert “...is not true”.
I also want to separately add that part of my frustration here (and the “can-kicking” part I mention) is that I worry this is just going to be weaponized as a reason to keep EA and LW glued together, even as obvious cracks develop. That would be fine—if we had a democracy—but we don’t. So at some point the glue is a weapon for those in the community with de facto control to keep trudging forward without having to account for the increasing differences in moral views of those within.
Actually, they are extremely well known outside of rationalist circles. Many subgroups of the Jewish and Buddhist faiths are pretty much built upon these principles. My parents told me “don’t put all your chips on the table” and to ~keep optionality open. Some might even argue this is the core principle that has led to “democracy”. And yes, as you rightly mentioned, these are clearly foundational principles behind LW and EA. That’s why I use the strong language of “circlejerk”. This is really just unnecessarily reinventing common English phrases. Viatopia perhaps gives the idea a bit of an action-relevant flavor, so I guess it extends a bit beyond the others, but it’s still not particularly new or insightful.
“can you argue for this in a convincing and detailed way?”
I mean, the argument is so underpowered it’s hard to even know where to start. I actually don’t even think the concept is coherent, tbf, but I’ll try.
Assuming you are coming from the view that:
if you take some sentient (or intelligent) being, keep the “essence” of that being but make it smarter and give it more inference time, then all such beings will start whipping, dabbing, and hitting the nae nae in synchronicity.
(Though I would say there is no coherent concept of self-modification/enhancement that preserves the original essence, so the view is already meaningless; but I’ll cast that aside.)
Then sure, take a sentient being whose value function is completely determined. It can never change its mind, tautologically. So it will never hit this convergent nirvana: its values are already fixed.
I must be confused, because I don’t see how this could be any other way. And the funny thing is, even if I’m wrong about this, and somehow jacking up the IQ and inference to the wazoo makes the atoms start vibing out, this still wouldn’t make their goals correct. You still haven’t solved the is-ought problem.