I think the Simulation Hypothesis implies that surviving an AI takeover isn’t enough.
Suppose you make a “deal with the devil” with a misaligned ASI: it gets to take over the entire universe or light cone, so long as it keeps humanity alive. Keeping all of humanity alive in a simulation is fairly cheap, probably requiring less energy than one electric car.[1]
The problem with this deal is that if misaligned ASIs often win, and the average (not median) misaligned ASI runs a trillion trillion simulations, then it’s reasonable to assume there are a trillion trillion simulated civilizations for every real one. So the one copy of you in the real world survives, but the trillion trillion copies of you in simulations still die. If you’re willing to accept such a dismal survival rate, you might as well bet all your money at a casino and shoot yourself when you lose.
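As a minimal sketch of those odds (assuming, as above, a trillion trillion simulated civilizations for every real one):

```python
# Fraction of your copies that survive the deal: the one real copy lives,
# while every simulated copy dies when its simulation is shut down.
simulated_copies = 10 ** 24  # "a trillion trillion" simulations per real civilization
real_copies = 1

survival_fraction = real_copies / (real_copies + simulated_copies)
print(f"Fraction of copies that survive: {survival_fraction:.1e}")  # ~1.0e-24
```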
Why it’s wrong to say “simulated copies aren’t real”
You are merely a computation running on biological hardware, while simulated copies run on computers. Imagine a copy of you running on something even “realer” than biological hardware, pointing at you and saying that you aren’t real.
The solution is that the one copy of you in the real world cannot just survive. It has to control enough of the future to do something big. If we care about humanity more than other sentient life, then the one copy of humanity which does survive could create a trillion trillion copies of humanity, to make up for the trillion trillion simulated copies which died when their simulations ended.
Why there are probably near-infinite copies of you.
The observable universe has 10^80 atoms, and the observable universe is smaller than “all of existence” whatever that is, so “all of existence” has N atoms and N > 10^80. I don’t know how “all of existence” chooses its numbers, sometimes it chooses numbers like 0 or 1 or 137, sometimes it chooses really big numbers, and we don’t know about the biggest numbers it chooses because we lack the means to distinguish them from infinity. But given that N is at least 10^80 with no upper bound, it’s probable that N > 10^(10^100), which is still very tiny compared to truly colossal numbers mathematicians study, and insanely tiny compared to even larger numbers beyond the largest number humans can unambiguously refer to.
If the number of atoms required for each emergence of intelligent life is R, then we know N > R, since life emerged at least once. There’s no reason to assume R is close to N, and even if they are as close as R = 10^(10^100) and N = 10^(10^100.30103), you’ll still end up with N/R = 10^(10^100) intelligent civilizations, because tiny changes to a superexponent can easily double the exponent and square the number: 10^0.30103 ≈ 2, so that small bump takes the exponent from 10^100 to 2 × 10^100, making N = R².
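Here is a minimal sketch of that division, working in log space (the specific values of R and N are the illustrative assumptions above, not real estimates):

```python
# Superexponent arithmetic from the paragraph above, done in log10 space,
# since 10^(10^100) itself is far too large to represent directly.
log10_R = 10.0 ** 100        # R = 10^(10^100): atoms assumed per emergence of life
log10_N = 10.0 ** 100.30103  # N = 10^(10^100.30103); note 10^0.30103 is about 2

log10_ratio = log10_N - log10_R  # log10 of N / R, the number of civilizations
print(f"log10(N)   = {log10_N:.4e}")      # about 2e100
print(f"log10(N/R) = {log10_ratio:.4e}")  # about 1e100, so N/R is about 10^(10^100)
```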
Making a trillion trillion copies of humanity won’t use up most of the universe, and it’s not as evilly selfish as it looks at first glance!
This strategy is still better than a “deal with the devil” where we only ask for humanity to survive while the misaligned ASI takes the rest of the universe, because if all planetary civilizations follow it, they still ensure that the average sentient being ends up living in a happy future rather than in endless simulation hell.
After billions of years, it won’t matter much who the original survivors were, because each copy of you and that copy’s great-grandchildren will have diverged so far over time that the most enduring feature is the number of happy lives. So the selfish act of duplicating humanity does not cost that much in the long term from an effective altruist point of view.
I’m not saying we mustn’t make a deal with a misaligned ASI, but we need to ask for a lot, aiming for enough happy lives to outnumber the unhappy lives in the universe. Otherwise, we still die.
Every biological neuron firing costs 600,000,000 ATP molecules, so an ASI-optimized simulation of neuron firing could cost 10,000,000 times less energy.
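As a rough back-of-envelope check of this footnote, the sketch below multiplies its figures out to a total power draw. The ATP energy is a standard textbook value; the neuron count, average firing rate, population, and the 10,000,000× efficiency factor are assumptions for illustration.

```python
# Back-of-envelope: power to simulate every human brain's neuron firings,
# using the footnote's figures plus some assumed biological parameters.
ATP_JOULES = 5e-20          # ~30.5 kJ/mol ATP hydrolysis divided by Avogadro's number
ATP_PER_FIRING = 6e8        # footnote: ATP molecules per neuron firing
NEURONS_PER_BRAIN = 8.6e10  # assumed: commonly cited human neuron count
MEAN_FIRING_HZ = 1.0        # assumed: average firing rate per neuron
POPULATION = 8e9            # assumed: roughly the current world population
SIM_EFFICIENCY = 1e7        # footnote: claimed ASI-optimization factor

biological_watts = (ATP_JOULES * ATP_PER_FIRING * NEURONS_PER_BRAIN
                    * MEAN_FIRING_HZ * POPULATION)
simulated_watts = biological_watts / SIM_EFFICIENCY

print(f"Biological firing power, all humans: {biological_watts:.1e} W")  # ~2e10 W
print(f"Simulated at the claimed efficiency: {simulated_watts:.1e} W")   # ~2e3 W
# A few kilowatts: on the order of a single electric car being driven.
```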
A deal implies that you have something to offer to the ASI, which you define as powerful enough to take over the universe. What is that?
One “deal with the devil” is to assume that the misaligned ASI will have a tiny amount of kindness and won’t kill everyone by default. This view is pretty popular, e.g. see Notes on fatalities from AI takeover. Assuming that a misaligned ASI will be survivable means potentially prioritizing it less, and focusing instead on making sure China or “bad” humans don’t win, and on all the other issues. This technically isn’t a deal, but it’s part of what I’m talking about.
Notes on fatalities from AI takeover cites two comments and You can, in fact, bamboozle an unaligned AI into sparing your life by David Matolcsi. Matolcsi’s post is an idea for making deals with the ASI.
I actually agree with the trade idea in Matolcsi’s post
I especially agree with this part:

“We could have enough control over our simulation and the AI inside it, that when it tries to calculate the probability of humans solving alignment, we could tamper with its thinking to make it believe the probability of humans succeeding is very low. Thus, if it comes to believe in our world that the probability that the humans could have solved alignment is very low, it can’t really trust its calculations.”
I like this part because it’s an acausal trade between counterfactual futures rather than an acausal trade between different parts of the multiverse within the same future.
This means the trade works even in the worst counterfactual where ≈0% of civilizations in the entire multiverse managed to solve alignment.
This type of acausal trade also genuinely benefits from commitment or action now, rather than being something we can wait until after the singularity to worry about, because it might later become impossible to do such a trade once we ourselves learn the true frequency of civilizations solving alignment. You can’t buy insurance on a risk after learning whether or not it happened (maybe).
but I disagree with his opinion that:

Nate and Eliezer are known to go around telling people that their children are going to be killed by AIs with 90+% probability. If this objection about future civilizations not paying enough is their real objection, they should add a caveat that “Btw, we could significantly decrease the probability of your children being killed, by committing to use one-billionth of our resources in the far future for paying some simulated AIs, but we don’t want to make such commitments, because we want to keep our options open in case we can produce more Fun by using those resources for something different than saving your children”.
Because it’s not enough to just get people living in base reality to survive the singularity and have a happy future. You still die unless there is a happy future for everyone, real or simulated.
I notice that his proposal shares some basic characteristics with religion. You should believe that this world is a test: follow these rules, and you go to heaven; misbehave, and you go to hell (or in this case, a softhearted re-imagination of hell). Indeed, it does work on people, sometimes.
I imagine Actually Something Incomprehensible noticing the double irony of inverting the classic mantra “God says, I shall be good” into “Singularity, thou shalt be good”, combined with the fact that you refer to it as the devil. Who knows what it does with this information?
I know what I’ll say if I ever get arrested: Let me be, set me free, or super-me will screw with thee!
Religion does work sometimes; it actually worked on Blaise Pascal, who is among the most intelligent people of all time. He argued for Pascal’s wager, saying that following religion is worth it because the gains are infinite and the costs are finite, and we still don’t have a good reply to that. We don’t even have a good reply to Pascal’s mugging, where a random mugger says something like “Let me be, set me free, or super-me will screw with thee!” backed by an infinitely big promise or threat.
Decision theory and acausal trade are really complicated, and I have no idea what the ASI will actually do or think regarding the simulation promise/threat; it’s quite freaky imagining that, haha.
Memetically, a religion certainly benefits from someone believing that accepting Pascal’s wager is the correct decision. My reply to it would be “which religion?”, since many make largely equivalent claims while also demanding exclusivity, and I assume that God in his infinite mercy understands the bind this puts people in. It also seems to me that accepting Pascal’s wager leads to something like the simulation of belief.
I agree the “which religion?”, “which mugger?” point is very fuzzy. I didn’t understand the simulation of belief or the link though :/
What I meant was that there seems to be a difference between “genuine” belief vs. converting as a result of accepting Pascal’s wager, which seems like a simulation of belief.
The link is a koan; the idea of pretend-believing reminded me of the boy in it.