Putting this in a separate comment, because Reign of Terror moderation scares me and I want to compartmentalize. I am still unclear about the following things:
Why do we think memetic evolution will produce complex/powerful results? It seems like the mutation rate is much, much higher than in biological evolution.
Valentine describes these memes as superintelligences, as “noticing” things, and as generally being agents. Are these superintelligences hosted per-instance-of-meme, with many stuffed into each human? Or is something like “QAnon” kind of a distributed intelligence, doing its “thinking” through social interactions? Both of these models seem to have some problems (power/speed), so maybe something else?
Misaligned (digital) AGI doesn’t seem like it’ll be a manifestation of some existing meme and therefore misaligned, it seems more like it’ll just be some new misaligned agent. There is no highly viral meme going around right now about producing tons of paperclips.
I really appreciate your list of claims and unclear points. Your succinct summary is helping me think about these ideas.
There is no highly viral meme going around right now about producing tons of paperclips.
A few examples came to mind: sports paraphernalia, tabletop miniatures, and stuffed animals (which likely outnumber real animals by hundreds or thousands of times).
One might argue that these things give humans joy, so they don’t count (AI paperclips are supposed to be useless to humans). There is some validity to that. On the other hand, one might also argue that it is unsurprising that subsystems repurposed to seek out paperclips would derive some ‘enjoyment’ from the paperclips… but I don’t think that argument holds water for these examples. Looking at it another way, some number of paperclips is indeed useful.
No egregore has turned the entire world into paperclips just yet. But of course that hasn’t happened; if it had, we would have already lost.
Even so: consider paperwork (like the tax forms mentioned in the post), skill certifications in the workplace, and things like slot machines and reality television. A lot of human effort is wasted on things humans don’t directly care about, for non-obvious reasons. Those things could be paperclips.
(And perhaps some humans derive genuine joy out of reality television, paperwork, or giant piles of paperclips. I don’t think that changes my point that there is evidence of egregores wasting resources.)
I think the point under contention isn’t whether current egregores are (in some sense) “optimizing” for things that would score poorly according to human values (they are), but whether the things they’re optimizing for have some (clear, substantive) relation to the things a misaligned AGI will end up optimizing for, such that an intervention on the whole egregore situation would have a substantial probability of impacting the eventual AGI.
To this question I think the answer is a fairly clear “no”, though of course this doesn’t invalidate the possibility that investigating how to deal with egregores may result in some non-trivial insights for the alignment problem.
I agree with you.
I also don’t think it matters whether the AGI will optimize for something current egregores care about.
What matters is whether current egregores will in fact create AGI.
The fear around AI risk is that the answer is “inevitably yes”.
The current egregores are actually no better at making AGI egregore-aligned than humans are at making it human-aligned.
But they’re a hell of a lot better at making AGI accidentally, and probably at all.
So if we don’t sort out how to align egregores, we’re fucked — and so are the egregores.
I think I see what you mean. A new AI won’t be under the control of egregores. It will be misaligned to them as well. That makes sense.
Why do we think memetic evolution will produce complex/powerful results? It seems like the mutation rate is much, much higher than in biological evolution.
Doesn’t the second part answer the first? I mean, the reason biological evolution matters is that its mutation rate massively outstrips the pace of geological and astronomical shifts. Memetic evolution dominates biological evolution for the same reason.
Also, just empirically: memetic evolution produced civilization, social movements, the Crusades, the Nazis, etc.
I wonder if I’m just missing your question.
Are these superintelligences hosted per-instance-of-meme, with many stuffed into each human? Or is something like “QAnon” kind of a distributed intelligence, doing its “thinking” through social interactions?
Both.
I wonder if you’re both (a) blurring levels and (b) intuitively viewing these superintelligences as having some kind of essence that either is or isn’t in someone.
What is or isn’t a “meme” isn’t well defined. A catch phrase (e.g. “Black lives matter!”) is totally a meme. But is a religion a meme? Is it more like a collection of memes? If so, what exactly are its constituent memes? And with catch phrases, most of them can’t survive without a larger memetic context. (Try getting “Black lives matter!” to spread through an isolated Amazonian tribe.) So should we count the larger memetic context as part of the meme?
But if you stop trying to ask what is or isn’t a meme and you just look at the phenomenon, you can see something happening. In the BLM movement, the phrase “Silence is violence” evolved and spread because it was evocative and helped the whole movement combat opposition in a way that supported its egregoric possession.
So… where does the whole BLM superorganism live? In its believers and supporters, sure. But also in its opponents. (Think of how folk who opposed BLM would spread its claims in order to object to them.) Also on webpages. Billboards. Now in Hollywood movies. And it’s always shifting and mutating.
The academic field of memetics died because its practitioners couldn’t formally define “meme”. But that’s backwards. Biology didn’t need to formally define life to recognize that there’s something to study. The act of studying seems to make some definitions more possible.
That’s where we’re at right now. Egregoric zoology, post Darwin but pre Watson & Crick.
Misaligned (digital) AGI doesn’t seem like it’ll be a manifestation of some existing meme and therefore misaligned, it seems more like it’ll just be some new misaligned agent. There is no highly viral meme going around right now about producing tons of paperclips.
I quite agree. I didn’t mean to imply otherwise.
The thing is, unFriendly hypercreatures aren’t thinking about aligning AI to hypercreatures either. They have very little foresight.
(This is an artifact of how most unFriendly egregores do their thing via stupefaction. Most possessed people can’t think about the future because it’s too real and involves things like their personal death. They instead think about symbolic futures and get sideswiped when reality predictably doesn’t go according to their plans. So since unFriendly hypercreatures use stupefied minds to plan, they end up having trouble with long futures, and so are unable to sanely orient to real-world issues that in fact screw them over.)
I think these hypercreatures will get just as shocked as the rest of us when AGI comes online.
The thing is, the pathway by which something like AGI actually destroys us is some combo of (a) getting hold of real-world systems like nukes and (b) hacking human minds to do its bidding. Both of these are already happening via unFriendly hypercreature evolution, and for exactly the same reasons that folk fear AI risk.
The creation of digital AGI just finishes moving the substrate off of humans, at which point the emergent unFriendly superintelligence no longer has any reason to care about human bodies or minds. At that point we lose all leverage.
That’s why I’m looking at the current situation and saying “Hey guys, I think you’re missing what’s actually happening here. We’re already in AI takeoff, and you’re fixated on the moment we lose all control instead of on this moment where we still have some.”
I think of the step to AGI as the final one, when some egregore figures out how to build a memetic nuke but doesn’t realize it’ll burn everything.
So, no magical meme transforming into a digital form.
(Although it’s some company or whatever that will specify to the AGI “Make paperclips” or whatever. God forbid some corporate egregore builds an AGI to “maximize profit”.)
Memetic evolution dominates biological evolution for the same reason.
Faster mutation rate doesn’t just produce faster evolution; it also reduces the steady-state fitness. Complex machinery can’t reliably be evolved if pieces of it are breaking all the time. I’m mostly relying on No Evolutions for Corporations or Nanodevices plus one undergrad course in evolutionary bio here.
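To make the steady-state point concrete, here’s a toy Wright-Fisher sketch I put together (my own illustration, not anything from the post or the linked essay; the all-or-nothing fitness model and all parameter values are made up): a “complex” replicator only gets a fitness bonus while every one of its L pieces is intact, each piece breaks with probability mu per copy, and broken copies never repair themselves.

```python
# Toy error-threshold simulation (assumptions as described above; numbers are
# purely illustrative). The classic result: the intact form persists only if
# roughly mu * L < ln(advantage).
import random

def intact_fraction(L=50, pop_size=1000, advantage=5.0, mu=0.01,
                    generations=200, seed=0):
    rng = random.Random(seed)
    intact = [True] * pop_size  # start with every copy fully functional
    for _ in range(generations):
        # Selection: intact copies reproduce 'advantage' times as often.
        weights = [advantage if ok else 1.0 for ok in intact]
        parents = rng.choices(intact, weights=weights, k=pop_size)
        # Mutation: a child of an intact parent stays intact only if none of
        # its L pieces break; children of broken parents stay broken.
        intact = [ok and all(rng.random() >= mu for _ in range(L))
                  for ok in parents]
    return sum(intact) / pop_size

if __name__ == "__main__":
    # Threshold here is ln(5)/50 ~= 0.032: mu = 0.01 is below it, 0.05 above.
    for mu in (0.01, 0.05):
        print(f"mu = {mu}: intact fraction ~= {intact_fraction(mu=mu):.2f}")
```

Below the threshold the intact form settles at a stable fraction (about half, with these numbers); above it, selection can’t keep up with the breakage and the complex machinery is simply lost. That’s the shape of my worry about very high memetic mutation rates.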
Also, just empirically: memetic evolution produced civilization, social movements, the Crusades, the Nazis, etc.
Thank you for pointing this out. I agree with the empirical observation that we’ve had some very virulent and impactful memes. I’m skeptical about saying that those were produced by evolution rather than something more like genetic drift, because of the mutation-rate argument. But given that observation, I don’t know if it matters whether there’s evolution going on or not. What we’re concerned with is the impact, not the mechanism.
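(As a calibration aside on “evolution vs. drift”, and not something from the thread: the standard population-genetics criterion is that selection dominates drift once the selective advantage s is large compared to 1/(2N). A quick sketch with made-up N and s, using Kimura’s fixation-probability approximation:)

```python
# Fixation probability of a single new variant (Kimura's diffusion
# approximation); N and s below are invented for illustration only.
import math

def p_fix(s, N):
    if s == 0:
        return 1.0 / (2 * N)  # pure drift: a neutral variant fixes by luck alone
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * N * s))

N = 10**6  # "population" of copies/hosts, made up
print(p_fix(0.0, N))    # neutral:       5e-07
print(p_fix(0.01, N))   # 1% advantage: ~0.02, tens of thousands of times larger
```

With numbers like these, even a small transmission advantage swamps drift, so the real crux is the fidelity/mutation-rate question above rather than population size.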
I think at this point I’m mostly just objecting to the aesthetic and some less-rigorous claims that aren’t really important, not the core of what you’re arguing. Does it just come down to something like:
“Ideas can be highly infectious and strongly affect behavior. Before you do anything, check for ideas in your head which affect your behavior in ways you don’t like. And before you try and tackle a global-scale problem with a small-scale effort, see if you can get an idea out into the world to get help.”