Even more importantly, the small proportion of FAIs that do exist are not friendly to the things I care about.
That FAI is good for you is a property of the term “FAI”. If it doesn’t create value for you, it’s not FAI, but something like Smileys and Paperclippers, potential trade partner but not your guy. Let’s keep it simple.
That FAI is good for you is a property of the term “FAI”.
“Friendly to their Creator AI”, choose an acronym. Perhaps FAI. Across the multiverse most civilizations that engage in successful AI efforts will produce an AI that is not friendly to me. AIs that are actually FAIs (which include by definition my own survival) are negligible.
Let’s keep it simple.
Releasing a Smiley will make me die and destroy everything I care about. I will kill anyone who stops me preventing that disaster. That is as simple as I can make it.
AIs that are actually FAIs (which include by definition my own survival) are negligible.
Formal preference is a more abstract concept than survival in particular, and even though all else equal, in usual situations, survival is preferable to non-survival, there could be situations even better than “survival”. It’s not out of the question “by definition” (you know better than to invoke this argument pattern).
Formal preference is one particular thing. You can’t specify additional details without changing the concept. If preference says that “survival” is a necessary component, that worlds without “survival” are equally worthless, then so be it. But it could say otherwise. You can’t study something and already know the answer; you can’t just assume that this property that intuitively appeals to you is unquestionably present. How do you know? I’d rather build on a clear foundation, and remain in doubt about what I can’t yet see.
That FAI is good for you is a property of the term “FAI”.
“Friendly to their Creator AI”, choose an acronym. Perhaps FAI. Across the multiverse most civilizations that engage in successful AI efforts will produce an AI that is not friendly to me. AIs that are actually FAIs (which include by definition my own survival) are negligible.
Negligible, non-negligible, that’s what the word means. It talks about specifically working for your preference, because of what the AI values and not because it needs to do so for trade. FAI could be impossible, for example; that doesn’t change the concept. BabyEater’s AI could be a UFAI, or it could be an FAI, depending on how well it serves your preference. It could turn out to be an FAI, if the sympathy aspect of their preference is strong enough to dole out to you a fair part of the world, more than you own by pure game-theoretic control.
FAI doesn’t imply full control given to your preference (for example, here on Earth we have many people with at least somewhat different preferences, and all control likely won’t be given to any single person). The term distinguishes AIs that optimize for you because of their own preference (and thus generate powerful control in the mathematical universe for your values, to a much more significant extent than you can do yourself), from AIs that optimize for you because of control pressure (in other terms, trade opportunity) from another AI (which is the case for “UFAI”).
(I’m trying to factor the discussion into the more independent topics to not lose track of the structure of the argument.)
Releasing a Smiley will make me die and destroy everything I care about. I will kill anyone who stops me preventing that disaster. That is as simple as I can make it.
Please don’t derail a civilized course of discussion, this makes clear communication more expensive in effort. This particular point was about a convention for using a word, and not about that other point you started talking sarcastically about here.
Also, speculating on the consequences of a conclusion (like the implication from it being correct to not release the UFAI, to you therefore having to destroy everything that stands in the way of preventing that event, an implication with which I more or less agree, if you don’t forget to take into account the moral value of said murders) is not helpful in the course of arguing about which conclusion is the correct one.
This particular point was about a convention for using a word,
I engaged with your point and even accepted it.
and not about that other point you started talking sarcastically about here.
I reject your labeling attempt. My point is a fundamental disagreement with an important claim you are making and in no way sarcastic. Your comments here are attempting to redirect emphasis away from the point by reframing my disagreement negatively while completely ignoring my engagement with and acceptance of your point.
I also do not understand the “Let’s keep it simple” rhetoric. My misuse of the ‘FAI’ acronym was oversimplifying for the purposes of brevity and I was willing to accept your more rigorous usage even though it requires more complexity.
Also, speculating on the consequences of a conclusion (like the implication from it being correct to not release the UFAI, to you therefore having to destroy everything that stands in the way of preventing that event, an implication with which I more or less agree, if you don’t forget to take into account the moral value of said murders) is not helpful in the course of arguing about which conclusion is the correct one.
I have previously discussed the benefits of the ‘kill test’ in considering moral choices when things really matter. This is one of the reasons Eliezer’s fan-fiction is so valuable. In particular I am influenced by Three Worlds Collide and The Sword of Good. I find that it is only that sort of stark consideration that can overcome certain biases that arise from moral squeamishness that is not evolved to handle big decisions. The “Ugh” reaction to things that “violate people’s rights” and to coercion biases us towards justifying courses of action so we don’t have to consider being ‘bad’. I may write a top level post on the subject (but there are dozens above it on my list).
This conversation is not one that I will enjoy continuing. I do not believe I am likely to make you update, nor do I expect to elicit new arguments that have not been considered. If something new comes along or if a top level post is created to consider the issues then I would be interested to read them and would quite probably re-engage.
I reject your labeling attempt. My point is a fundamental disagreement with an important claim you are making and in no way sarcastic. Your comments here are attempting to redirect emphasis away from the point by reframing my disagreement negatively while completely ignoring my engagement with and acceptance of your point.
I also do not understand the “Let’s keep it simple” rhetoric. My misuse of the ‘FAI’ acronym was oversimplifying for the purposes of brevity and I was willing to accept your more rigorous usage even though it requires more complexity.
Okay, misunderstanding on both sides. From what I understood, there is no point in working on reaching agreement on this particular point of meta and rhetoric. (A more substantial reply to the point we are arguing about, and an attempt to reframe it for clarity, are in the other two comments, which I assume you didn’t notice at the time of writing this reply.)
I have previously discussed the benefits of the ‘kill test’ in considering moral choices when things really matter.
Could you restate that (together with what you see as the disagreement, and the way “kill test” applies to this argument)? From what I remember, it’s a reference to an intuitive conclusion: you resolve the moral disagreement on the side of what you actually believe to be right. It’s not a universally valid path to figuring out what’s actually right; intuitions are sometimes wrong (although it might be the only thing to go on when you need to actually make that decision, but it’s still decision-making under uncertainty, a process generally unrelated to truth-seeking).
Okay, misunderstanding on both sides. From what I understood, there is no point in working on reaching agreement on this particular point of meta and rhetoric. (A more substantial reply to the point we are arguing about, and an attempt to reframe it for clarity, are in the other two comments, which I assume you didn’t notice at the time of writing this reply.)
Ok. And yes, I hadn’t seen the other comments (either not yet written or hidden among the other subjects in my inbox).
Sadly, Vladimir, this failure to understand stakeholder theory is endemic in AI discussions. Friendly AI cannot possibly be defined as being “if it doesn’t create value for you it’s not FAI” because value is arbitrary. Some people want to die and others want to live, to take the starkest example. Everyone being killed is thus value for some and not value for others, and vice versa.
What we end up with is having to define friendly as being “creating value for the largest possible number of human stakeholders even if some of them lose”.
For example, someone who derives value from ordering people around or having everyone else be their personal slaves, such as Caligula or the ex-dictator Gaddafi, doesn’t (didn’t...) see value in self-rule for the people and thus fought hard to maintain the status quo, murdering many people in the process.
In any scenario whereby you consider the wants of those who seek most of the world’s resources or domination over others, you’re going to end up with an impossible conundrum for any putative FAI.
So given that scenario, what is really in all of our best interests if some of us aren’t going to get what we want and there is only one Earth?
So given that scenario, what is really in all of our best interests if some of us aren’t going to get what we want and there is only one Earth?
One answer I’ve seen is that the AI will create as many worlds as necessary in order to accommodate everyone’s desires in a reasonably satisfactory fashion. So, Gaddafi will get a world of his own, populated by all the people who (for some reason) enjoy being oppressed. If an insufficient number of such people exist, the FAI will create a sufficient number of non-sentient bots to fill out the population.
The AI can do all this because, as a direct consequence of its ability to make itself smarter exponentially, it will quickly acquire quasi-godlike powers, by, er, using some kind of nanotechnology or something.
By extrapolation it seems likely that the cheapest implementation of different worlds for conflicting points of view is some kind of virtual reality, if it proves too difficult to give each human its own material world.
Yes, and in the degenerate case, you’d have one world per human. But I doubt it would come to that, since (a) we humans really aren’t as diverse as we think, and (b) many of us crave the company of other humans. In any case, the FAI will be able to instantiate as many simulations as needed, because it has the aforementioned nano-magical powers.
Indeed. It’s likely that many of the simulations would be shared.
What I find interesting to speculate on then is whether we might be either forcibly scanned into the simulation or plugged into some kind of brain-in-a-vat scenario à la The Matrix.
Perhaps the putative AI might make the calculation that most humans would ultimately be OK with one of those scenarios.
What I find interesting to speculate on then is whether we might be either forcibly scanned into the simulation or plugged into some kind of brain-in-a-vat scenario à la The Matrix.
Meh… as far as I’m concerned, those are just implementation details. Once your AI gets a hold of those nano-magical quantum powers, it can pretty much do anything it wants, anyway.