This is a question of fact, not a fight between different preferences.
I am disagreeing on the question of fact. What we can do without an FAI is by far superior to any scraps we can expect a smiley-face maximiser to contribute through exchanges. The greatest of the existential risks that not having an FAI entails is the threat of a uFAI; the anti-AI removes that. We do have some potential for survival based on other technologies within our grasp. SIAI would have to devote itself to solving other hard problems.
Wei mentioned a combinatorial explosion. He may have been applying it somewhat differently than I am, but I am claiming that an overwhelming number of the possible mind designs that Smiley is bargaining with are also bad for me. He is bargaining with a whole lot of Clippy’s brothers and sisters. He is bargaining with a whole lot of released GAIs that care primarily about their own propagation. Even more importantly, that small proportion of FAIs that do exist are not friendly to things I care about. Almost none of them will result in me personally being alive.
This all assumes that the bargaining does in fact go ahead. I’m not certain either way, nor am I certain that in the specific case of Smiley one of his optimal trading partners will be an FAI which I happen to like.
All this means that I am comfortable with the assertion you quote. If you or Rolf did try to stop me from pressing that no-AI button then you would just be obstacles that needed to be eliminated, even if your motives are pure. My life and all that I hold dear is at stake!
He is bargaining with a whole lot of Clippy’s brothers and sisters.
I think that makes some sense. It’s not clear to me that building a smiley-face maximizer that trades with AIs in other possible worlds would be better than having a no-AI future.
There is another possibility to consider though. Both we and the smiley-face maximizer would be better off if we did allow it to be built, and then it gives our preferences some control (enough for us to be better off than the no-AI future). It’s not clear that this opportunity for trade can be realized, but we should spend some time thinking about it before ruling it out.
It seems like we really need a theory of games that tells us (human beings) how to play games with superintelligences. We can’t depend on our FAIs to play the games for us, because we have to decide now what to do, including the above example, and also what kind of FAI to build.
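The opportunity for trade described above can be sketched as a toy payoff model (every number below is invented purely for illustration, not taken from the discussion): the trade is realizable exactly when some division of control leaves both us and the smiley-face maximizer strictly better off than the no-AI outcome.

```python
# Toy model of the proposed trade; all payoffs are made-up illustrations.
# "No-AI" outcome: humans get an assumed 0.10 utility, the maximizer 0.
# "Build-and-trade" outcome: the maximizer exists and cedes a fraction
# of the future's resources (normalized to 1.0) to human preferences.

def payoffs(human_share):
    """Utilities (human, maximizer) if the AI is built and cedes
    `human_share` of the total resources to us."""
    total = 1.0
    return total * human_share, total * (1.0 - human_share)

NO_AI = (0.10, 0.0)  # assumed utilities of pressing the no-AI button

# The trade can be realized only if some share leaves BOTH parties
# strictly better off than the no-AI outcome.
viable = [s / 100 for s in range(1, 100)
          if payoffs(s / 100)[0] > NO_AI[0] and payoffs(s / 100)[1] > NO_AI[1]]

print(min(viable), max(viable))  # any share strictly between 0.10 and 1.0
```

Under these assumed numbers, any share from 11% to 99% is mutually beneficial; the open question in the thread is whether such a bargain can actually be struck across possible worlds, which this sketch simply takes for granted.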
Both we and the smiley-face maximizer would be better off if we did allow it to be built, and then it gives our preferences some control (enough for us to be better off than the no-AI future). It’s not clear that this opportunity for trade can be realized, but we should spend some time thinking about it before ruling it out.
Sounds like Drescher’s bounded Newcomb. This perspective suddenly painted it as FAI-complete.
Even more importantly, that small proportion of FAIs that do exist are not friendly to things I care about.
That FAI is good for you is a property of the term “FAI”. If it doesn’t create value for you, it’s not FAI, but something like Smileys and Paperclippers, potential trade partner but not your guy. Let’s keep it simple.
That FAI is good for you is a property of the term “FAI”.
“Friendly to their Creator AI”, choose an acronym. Perhaps FAI. Across the multiverse most civilizations that engage in successful AI efforts will produce an AI that is not friendly to me. AIs that are actually FAIs (which include by definition my own survival) are negligible.
Let’s keep it simple.
Releasing a Smiley will make me die and destroy everything I care about. I will kill anyone who stops me preventing that disaster. That is as simple as I can make it.
AIs that are actually FAIs (which include by definition my own survival) are negligible.
Formal preference is a more abstract concept than survival in particular, and even though all else equal, in usual situations, survival is preferable to non-survival, there could be situations even better than “survival”. It’s not out of the question “by definition” (you know better than to invoke this argument pattern).
Formal preference is one particular thing. You can’t specify additional details without changing the concept. If preference says that “survival” is a necessary component, that worlds without “survival” are equally worthless, then so be it. But it could say otherwise. You can’t study something and already know the answer; you can’t just assume that this property that intuitively appeals to you is unquestionably present. How do you know? I’d rather build on a clear foundation, and remain in doubt about what I can’t yet see.
That FAI is good for you is a property of the term “FAI”.
“Friendly to their Creator AI”, choose an acronym. Perhaps FAI. Across the multiverse most civilizations that engage in successful AI efforts will produce an AI that is not friendly to me. AIs that are actually FAIs (which include by definition my own survival) are negligible.
Negligible or non-negligible, that’s what the word means. It talks about specifically working for your preference, because of what the AI values and not because it needs to do so for trade. FAI could be impossible, for example; that doesn’t change the concept. BabyEater’s AI could be a UFAI, or it could be an FAI, depending on how well it serves your preference. It could turn out to be an FAI, if the sympathy aspect of their preference is strong enough to dole out to you a fair part of the world, more than you own by pure game-theoretic control.
FAI doesn’t imply full control given to your preference (for example, here on Earth we have many people with at least somewhat different preferences, and all control likely won’t be given to any single person). The term distinguishes AIs that optimize for you because of their own preference (and thus generate powerful control in the mathematical universe for your values, to a much more significant extent than you can do yourself), from AIs that optimize for you because of control pressure (in other terms, trade opportunity) from another AI (which is the case for “UFAI”).
(I’m trying to factor the discussion into the more independent topics to not lose track of the structure of the argument.)
Releasing a Smiley will make me die and destroy everything I care about. I will kill anyone who stops me preventing that disaster. That is as simple as I can make it.
Please don’t derail a civilized course of discussion; it makes clear communication more expensive in effort. This particular point was about a convention for using a word, and not about that other point you started talking sarcastically about here.
Also, speculating on the consequences of a conclusion (like the implication from it being correct to not release the UFAI, to you therefore having to destroy everything that stands in the way of preventing that event, an implication with which I more or less agree, if you don’t forget to take into account the moral value of said murders) is not helpful in the course of arguing about which conclusion is the correct one.
This particular point was about a convention for using a word,
I engaged with your point and even accepted it.
and not about that other point you started talking sarcastically about here.
I reject your labeling attempt. My point is a fundamental disagreement with an important claim you are making and is in no way sarcastic. Your comments here are attempting to redirect emphasis away from the point by reframing my disagreement negatively while completely ignoring my engagement with and acceptance of your point.
I also do not understand the “Let’s keep it simple” rhetoric. My misuse of the term ‘FAI’ was an oversimplification for the purposes of brevity, and I was willing to accept your more rigorous usage even though it requires more complexity.
Also, speculating on the consequences of a conclusion (like the implication from it being correct to not release the UFAI, to you therefore having to destroy everything that stands in the way of preventing that event, an implication with which I more or less agree, if you don’t forget to take into account the moral value of said murders) is not helpful in the course of arguing about which conclusion is the correct one.
I have previously discussed the benefits of the ‘kill test’ in considering moral choices when things really matter. This is one of the reasons Eliezer’s fan-fiction is so valuable. In particular I am influenced by Three Worlds Collide and The Sword of Good. I find that it is only that sort of stark consideration that can overcome certain biases that arise from moral squeamishness that has not evolved to handle big decisions. The “Ugh” reaction to things that “violate people’s rights” and to coercion biases us towards justifying courses of action so we don’t have to consider being ‘bad’. I may write a top level post on the subject (but there are dozens above it on my list).
This conversation is not one that I will enjoy continuing. I do not believe I am likely to make you update, nor do I expect to elicit new arguments that have not been considered. If something new comes along, or if a top level post is created to consider the issues, then I would be interested to read them and would quite probably re-engage.
I reject your labeling attempt. My point is a fundamental disagreement with an important claim you are making and is in no way sarcastic. Your comments here are attempting to redirect emphasis away from the point by reframing my disagreement negatively while completely ignoring my engagement with and acceptance of your point.
I also do not understand the “Let’s keep it simple” rhetoric. My misuse of the term ‘FAI’ was an oversimplification for the purposes of brevity, and I was willing to accept your more rigorous usage even though it requires more complexity.
Okay, misunderstanding on both sides. From what I understood, there is no point in working on reaching agreement on this particular point of meta and rhetoric. (A more substantial reply to the point we argue, and an attempt to reframe it for clarity, are in the other two comments, which I assume you didn’t notice at the time of writing this reply.)
I have previously discussed the benefits of the ‘kill test’ in considering moral choices when things really matter.
Could you restate that (together with what you see as the disagreement, and the way the “kill test” applies to this argument)? From what I remember, it’s a reference to an intuitive conclusion: you resolve the moral disagreement on the side of what you actually believe to be right. It’s not a universally valid path to figuring out what’s actually right; intuitions are sometimes wrong (although they might be the only thing to go on when you need to actually make that decision, but that is still decision-making under uncertainty, a process generally unrelated to truth-seeking).
Okay, misunderstanding on both sides. From what I understood, there is no point in working on reaching agreement on this particular point of meta and rhetoric. (A more substantial reply to the point we argue, and an attempt to reframe it for clarity, are in the other two comments, which I assume you didn’t notice at the time of writing this reply.)
Ok. And yes, I hadn’t seen the other comments (either not yet written or hidden among the other subjects in my inbox).
Sadly, Vladimir, this failure to understand stakeholder theory is endemic in AI discussions. Friendly AI cannot possibly be defined by “if it doesn’t create value for you, it’s not FAI”, because value is arbitrary. Some people want to die and others want to live, to take the stark example. Everyone being killed is thus value for some and not value for others, and vice versa.
What we end up with is having to define “friendly” as “creating value for the largest possible number of human stakeholders, even if some of them lose”.
For example, someone who derives value from ordering people around, or from having everyone else be their personal slave, such as Caligula or the ex-dictator Gaddafi, doesn’t (or didn’t...) see value in self-rule for the people, and thus fought hard to maintain the status quo, murdering many people in the process.
In any scenario in which you consider the wants of those who seek most of the world’s resources, or domination over others, you’re going to end up with an impossible conundrum for any putative FAI.
So given that scenario, what is really in all of our best interests if some of us aren’t going to get what we want and there is only one Earth?
So given that scenario, what is really in all of our best interests if some of us aren’t going to get what we want and there is only one Earth?
One answer I’ve seen is that the AI will create as many worlds as necessary in order to accommodate everyone’s desires in a reasonably satisfactory fashion. So, Gaddafi will get a world of his own, populated by all the people who (for some reason) enjoy being oppressed. If an insufficient number of such people exist, the FAI will create a sufficient number of non-sentient bots to fill out the population.
The AI can do all this because, as a direct consequence of its ability to make itself smarter exponentially, it will quickly acquire quasi-godlike powers, by, er, using some kind of nanotechnology or something.
By extrapolation, it seems likely that the cheapest implementation of the different-worlds-for-conflicting-points-of-view idea is some kind of virtual reality, if it proves too difficult to give each human its own material world.
Yes, and in the degenerate case, you’d have one world per human. But I doubt it would come to that, since a) we humans really aren’t as diverse as we think, and b) many of us crave the company of other humans. In any case, the FAI will be able to instantiate as many simulations as needed, because it has the aforementioned nano-magical powers.
Indeed. It’s likely that many of the simulations would be shared.
What I find interesting to speculate on, then, is whether we might be either forcibly scanned into the simulation or plugged into some kind of brain-in-a-vat scenario à la The Matrix.
Perhaps the putative AI might make the calculation that most humans would ultimately be OK with one of those scenarios.
What I find interesting to speculate on, then, is whether we might be either forcibly scanned into the simulation or plugged into some kind of brain-in-a-vat scenario à la The Matrix.
Meh… as far as I’m concerned, those are just implementation details. Once your AI gets a hold of those nano-magical quantum powers, it can pretty much do anything it wants, anyway.
If you or Rolf did try to stop me from pressing that no-AI button then you would just be obstacles that needed to be eliminated, even if your motives are pure. My life and all that I hold dear is at stake!
I understand that you don’t want to die or lose the future, and I understand the ingrained thought that UFAI = total loss, but please try to look past that, consider that you may be wrong, see that being willing to ‘eliminate’ your allies over factual disagreements loses, and cooperate in the iterated epistemic prisoner’s dilemma with your epistemic peers. You seem to be pretty obviously coming at this question from a highly emotional position, and should try to deal with that before arguing the object level.
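The “iterated epistemic prisoner’s dilemma” invoked here can be made concrete with a toy expected-value calculation (all payoffs are invented for illustration, not taken from the discussion): an agent who is sometimes wrong about the facts can do worse, on average, by unilaterally overriding epistemic peers than by cooperating with them.

```python
# Toy payoffs for a single factual disagreement between epistemic peers;
# the numbers are illustrative assumptions only:
#   - acting unilaterally: +2 if you happen to be right, -3 if wrong
#   - cooperating (pooling evidence, deferring partially): +1 either way

def expected_value(p_right, strategy):
    """Expected payoff per disagreement for an agent who is right
    with probability p_right."""
    if strategy == "unilateral":
        return p_right * 2 + (1 - p_right) * (-3)
    if strategy == "cooperate":
        return 1.0
    raise ValueError(f"unknown strategy: {strategy}")

# Even a well-calibrated agent who is right 75% of the time does worse,
# under these assumed payoffs, by overriding peers than by cooperating:
print(expected_value(0.75, "unilateral"))  # 0.75
print(expected_value(0.75, "cooperate"))   # 1.0
```

The conclusion is entirely driven by the assumed payoffs (in particular, how costly it is to act on a wrong belief), which is exactly the disputed question in the thread.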
What we can do without an FAI is by far superior to any scraps we can expect a smiley-face maximiser to contribute through exchanges.
That it’s far superior is not obvious, both because it’s not obvious how well we could reasonably expect to do without FAI (How likely would we be to successfully construct a singleton locking in our values? How efficiently could we use resources? Would the anti-AI interfere with human intelligence enhancement or uploading, either of which seems like it would destroy huge amounts of value?), and because our notional utility function might see steeply diminishing marginal returns to resources before using the entire future light cone (see this discussion).
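The diminishing-marginal-returns point can be illustrated with a toy calculation (the logarithmic utility function, its epsilon offset, and the 1% figure are all assumptions for illustration only): if utility is sufficiently concave in resources, securing even a small fraction of the future light cone already captures most of the attainable value.

```python
import math

def log_utility(resources, epsilon=1e-6):
    """Toy concave utility in `resources` (a fraction of the light cone,
    in [0, 1]). The logarithmic form and epsilon offset are assumptions,
    not claims about actual human values."""
    return math.log(resources + epsilon) - math.log(epsilon)

# Fraction of the maximum attainable utility captured by securing
# only 1% of the light cone instead of all of it:
captured = log_utility(0.01) / log_utility(1.0)
print(round(captured, 3))  # about two thirds of the attainable utility
```

How steeply returns actually diminish is of course the empirical question; with a less concave utility function the no-AI future would forfeit far more of the attainable value.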
I understand that you don’t want to die or lose the future, and I understand the ingrained thought that UFAI = total loss, but please try to look past that, consider that you may be wrong, see that being willing to ‘eliminate’ your allies over factual disagreements loses, and cooperate in the iterated epistemic prisoner’s dilemma with your epistemic peers.
I am, or at least was, considering the facts, including what was supplied in the links. I was also assuming for the sake of the argument that the kind of agent that the incompetent AI developers created would recursively improve to one that cooperated without communication with other universes.
You seem to be pretty obviously coming at this question from a highly emotional position, and should try to deal with that before arguing the object level.
Discussing the effects and implications of decisions in counterfactuals is not something that is at all emotional for me; it fascinates me. On the other hand, the natural conclusion of counterfactuals (which inevitably discuss extreme situations) is something that does seem to inspire emotional judgments, and that is what overrides my fascination.
Sounds like Drescher’s bounded Newcomb. This perspective suddenly painted it as FAI-complete.
Can you please elaborate? I looked up “FAI-complete”, and found this but I still don’t get your point.
See the DT list. (Copy of the post here.) FAI-complete problem = solving it means that FAI gets solved as well.