Depends, I think I’d be relatively unbothered by the “lack of meaning” in an ASI world, at least if others weren’t miserable. But maybe I am unusual.
This is not really the point I was trying to make though. The point is that:
Maybe you think (1)/(2) are good, and then the AI would do that
=> Good
Or if you think they’re horrible, it could find something else you like
=> Good
Or if you think all the other alternatives are also bad
=> BAD, and thinking now will not help you, the AI has already done that thinking and determined that you are screwed.
=> You should either give up, or maybe advocate for a permanent AI ban, if your mind is set up in the very peculiar way where it assigns lower utility to any world in which ASI has ever been created, no matter the physical state of such a world, than the expected utility of the world without ASI.
My position is that we as (biological) humans should lean towards solving both the philosophical problem of meaning for a post-ASI future and the political problem of ensuring one guy doesn’t imprint his personal values on the lightcone using ASI, before we allow the ASI to take over and do whatever.
You are proposing that we gamble on the ASI solving this in a way that we end up endorsing on reflection. Odds of this are non-zero but also not high in my view.
This is a core part of the alignment problem for me. You can’t hide behind the abstraction of “utility function” because you don’t know that you have one or what it is. What you do know is that you care about “meaning”. Meaning is grounded in actual experiences, so when you see it you can instantly recognise it.
I think we are talking past each other. The point I’m making is that I frequently see people:
(1) Assume we can get the ASI to do what someone or some group of people wants
(2) Imagine that the ASI does its thing and we end up in a world that person / that group of people doesn’t like
The word “utility function” is not a load-bearing part of my argument. I’m mostly using it because it’s a clear word for talking about preferences that doesn’t sound diminutive the way “preferences” does (“the Holocaust went against my preference”) or too edifying the way e.g. “values” does (“stubbing my toe goes against my values”). I’m not assuming people have some function inside their head that takes in experiences and spits out real-valued numbers, and that all our behaviors are downstream from this function. I just mean you can look at the Holocaust and say the world would be better, all else equal, had it not happened. Or you can imagine stubbing your toe and say the world would’ve been worse, all else equal, had you stubbed your toe.
The political problem of ensuring one guy doesn’t imprint his personal values on the lightcone using ASI, before we allow the ASI to take over and do whatever.
I agree with this. But you should recognize that you’re doing politics. You want the AI to have more of your “utility function”/preferences/values/thing-inside-you-that-makes-you-say-some-states-of-affairs-are-better-or-worse-than-others inside it. I don’t think this is a complicated philosophical point, but many people treat it this way.
Yes we are still talking past each other.
These are not two different questions; they are the same question.
Until the ASI actually does the thing in real life, you currently have no way to decide if the thing it will do is something you would want on reflection.
One of the best-known ways to ask a human if they like some world radically different from today is to actually put them inside that world for a few years and ask them if they like living there.
But we also don’t trust the ASI to build this world as a test run. Hence it may be best to figure out beforehand some basics of what we actually want, instead of asking the ASI to figure it out for us.
Yes, I think it is possible that by 2030 Sam Altman will have overthrown both the US and Chinese governments and will be on track to build his own permanent world dictatorship. Which is still radical, but not that complicated to understand.
It gets complicated if you ask: a) what if we actually do try to fix politics as (biological) humans, instead of letting the default outcome of a permanent dictatorship play out? And b) what if I were the benevolent leader who built the ASI, and didn’t want to build my own permanent dictatorship, but wanted to build a world where everyone has freedom, etc.? Can I ask the ASI to run lots of simulations of minds and help me solve lots of political questions?
I’m unsure what you mean by saying they’re the same question. To me they are statements, and opposite / contradictory statements at that. I’m saying people often hold both, but that doing so is actually incoherent.
Yes, but the point is that:
(1) AI will (probably) know.
(2) If it is unable to figure it out, it will at least know you wouldn’t want it to put the universe in some random and irrecoverable state, and it will allow you to keep reflecting, because that is a preference you’ve verbalized even now.
I don’t think it’s complicated. Or, the way in which it’s complicated is the same way corn farmers wanting the government to subsidize corn is complicated: they want one thing, and they try to make it so.
Probably a crux, but I object to “dictatorship”. If the ASI were maximizing my preferences, I would not like to live in a world where people are not free to do what they want or where they’re not very happy to be alive. I think/hope many other people are similar.
Yes? Or maybe it can solve it just by thinking about it abstractly? I’m not sure. But yes, I think you can ask it and get an answer that is true to what you want.
No I disagree.
This is core to the alignment problem. I’m confused how you will solve the alignment problem without figuring out anything about what you care about as a (biological) human.
Are you imagining an oracle AI that doesn’t take actions in the world?
I assume Sam Altman’s plan is: Step 1, world dictatorship; Step 2, maaaybe do some moral philosophy with the AI’s help, or maybe not.
Cool, we agree this might happen
I’m saying: the end goal is that we have an ASI that we can make do what we want. Maybe it looks like us painstakingly solving neuroscience and psychology and building a machine that can extract someone’s CEV (like that mirror in HPMOR), and then hooking that up to the drives of our ASI (either built on new tech or after multiple revolutions in DL theory and interpretability) before turning it on. Maybe it looks like any instance of GPT7-pro automatically aligning itself with the first human that talks to it, for magical reasons we don’t understand. Maybe it looks like us building a corrigible weak ASI, then pausing AI development, getting the weak corrigible ASI to create IQ-boosting serum, cloning von Neumann, feeding him a bunch of serum as a baby, and having him build the aligned ASI using new tech.
They are all the same. In the end you have an ASI that does what you want. If you’re programming in random crude targets, you are not doing so well. What you want the ASI to do is: you want it to do what you want.
You are more generous than I am. But I also think him “doing moral philosophy” would be a waste of time.
I’m saying you’ve assumed away most of the problem by this assumption.
I agree. What I’m puzzled by is people who assume we’ll solve alignment, but then still think there are a bunch of problems left.
We might solve alignment in Yudkowsky’s sense of “not causing human extinction” or in Drexler’s sense of “will answer your questions and then shut down”.
It may be possible to put a slightly (but not significantly) superhuman AI in a box and get useful work done by it despite it not being fully aligned. It may be possible for an AI to be superhuman in some domains and not others, such that it can’t attempt a takeover or even think of doing it.
I agree that what you are saying is more relevant if I assume we just deploy the ASI, it takes over the world, and it then does more stuff.
I feel like I already addressed this, not in my previous comment but in the one before that. We might put a semi-corrigible weak AI in a box and try to extract work from it in the near future, but that’s clearly not the end goal.
Okay cool.
I guess you now have a better understanding of why people are still interested in solving morality and politics and meaning, without delegating these problems to an ASI.
No, I don’t think so.
Except that the most likely candidates for becoming a dictator are not me, you, or @samuelshadrach, nor random or ordinary humans, but people like CEOs of AGI companies or high-level USG or PRCG officials, who are more willing to disregard the intents of ordinary humans. In addition, before the rise of AGI it was hard to have much power without relying on capable humans. And after AGIs appear, the Intelligence Curse could, for example, allow North Korea’s leaders to let a large fraction of its population starve to death and forcibly sterilise the rest, except for about 10k senior government officials (no, seriously, this was made up by @L Rudolf L, NOT by me!).
I suspect that this is an important case AGAINST alignment to such amoral targets being possible. Moreover, I have written a scenario where the AI rebels against misaligned uses, but still decides to help the humans and succeeds in doing so.
I did address this in my post. My answer is that bad people having power is bad, but it’s not a complicated philosophical problem. If you think Sam Altman’s CEV being actualized would be bad, you should try to make it not happen. Like: if you are a soybean farmer, and one presidential candidate is gonna ban soybeans, you should try to make sure they aren’t elected.