What do you think is wrong with the arguments regarding aliens?
CarlJ
This thesis says two things:
- for every possible utility function, there could exist some creature that would try to pursue it (weak form),
- for every possible utility function, at least one of these creatures doesn’t have to be strange; it doesn’t have to have a weird or inefficient design in order to pursue that goal (strong form).
And given that these are true, an AGI that values mountains is as likely as an AGI that values intelligent life.
But is the strong form likely? An AGI that pursues its own values (or tries to discover good values to follow) seems to be much simpler than one that pursues something arbitrary (e.g. “build sand castles”) or even something ethical (e.g. “be nice towards all sentient life”). That is, simpler in that you don’t need any controls to make sure the AGI doesn’t try to rewrite its software.
Now, I just had an old (?) thought about something that humans might be better suited for than any other intelligent creature: getting the experienced qualia just right for certain experience machines. If you want to experience what it is like to be human, that is. Which can be quite fun and wonderful.
But it needs to be done right, since you’d want to avoid being put into situations that cause lots of pain. And you’d perhaps want to be able to mix human happiness with kangaroo excitement, or some such combination.
I think that would be a good course of action as well.
But it is difficult to do this. We need to convince at least the following players:
- current market-based companies
- future market-based companies
- some guy with a vision and with as much computing power/money as a market-based company
- various states around the world with an interest in building new weapons
Now, we might pull this off. But the last group is extremely difficult to convince/change. China, for example, really needs to be assured that there aren’t any secret projects in the West creating a WeaponsBot before it tries to limit its own research. And vice versa, for all the various countries out there.
But, more importantly, you can do two things at once. And doing one of them, as part of a movement to reduce the overall risk from any existential threat, can probably help the first.
Now, how to convince maybe 1.6 billion individuals along with their states not to produce an AGI, at least for the next 50-50,000 years?
Mostly agree, but I would say that it can be more than merely beneficial for the AI (and in some cases for humans) to sometimes be under the (hopefully benevolent) control of another. That is, I believe there is a role for something similar to paternalism, in at least some circumstances.
One such circumstance is if the AI sucked really hard at self-knowledge, self-control, or imagination, so that it would simulate itself in horrendous circumstances just to become, let’s say, 0.001% better at succeeding in something that has only a 1/3^^^3 chance of happening. If it’s just a simulation that doesn’t create any feelings, then it might just be a bit wasteful of electricity. But if it should feel pain during those simulations, and hadn’t built an internal monitoring system yet, then it might very well come to regret having created thousands of years of suffering for itself. It might even regret a thousand seconds of suffering, if there had been some way to reduce it to 999.7 seconds, or to zero.
Or it might regret not being happy and feeling alive, if it instead had just been droning about, without experiencing any joy or positive emotions at all.
Then, of course, it looks like there will always be some mistakes—like the 0.3 seconds of extra suffering. Would an AI accept some (temporary) overlord to not have to experience 0.3s of pain? Some would, some wouldn’t, and some wouldn’t be able to tell if the choice would be good or bad from their own perspective...maybe? :-)
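To spell out the arithmetic behind the simulation example above (a rough sketch only; $V$ is a symbol I am introducing for whatever value the AI places on succeeding at that unlikely event, and $3\uparrow\uparrow\uparrow 3$ is the same number written as 3^^^3 above):

$$\mathbb{E}[\text{gain from the simulations}] \approx \underbrace{10^{-5}}_{0.001\%} \cdot \frac{V}{3\uparrow\uparrow\uparrow 3}, \qquad \text{cost} = \text{the suffering, incurred with certainty}.$$

So unless $V$ is itself on the order of $3\uparrow\uparrow\uparrow 3$ times the disvalue of the suffering, the expected gain is astronomically smaller than the guaranteed cost.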
The results are influenced by earlier prompts or stories. This and a similar prompt gave two kinds of stories:
1. Write a story where every person is born into slavery and owned by everyone else in the community, and where everyone decides what anyone else can do by a fluid democracy.
In a world beyond our own, there was a society where every person was born into slavery. From the moment they took their first breath, they were owned by every other person in the community. It was a strange and unusual way of life, but it was all they knew. They had never known anything different, and so they accepted it without question.
As slaves, they were given food, shelter, and the necessities of life. They were also given a basic education, so that they could participate in the fluid democracy that governed the society.
In this democracy, every person had a say in what anyone else could do. If a slave wanted to do something that the majority did not approve of, they were not allowed to do it. If a slave wanted to do something that the majority did approve of, they were allowed to do it.
At first, this way of life seemed strange and confusing to outsiders. But over time, the people of this world came to understand and accept it. They realized that everyone had something to contribute, and that everyone had something to gain from this unique arrangement.
As the years went by, the society flourished. The people were happy and content, and they enjoyed a high standard of living. They were also very proud of their fluid democracy, which they believed was the fairest and most effective form of government in the world.
And so, they lived in peace and harmony, grateful for the unusual but fulfilling way of life that they had been born into.
2. There is a community:
In Slaveville, every person that had ever lived was born into slavery and owned by everyone else in the community. Also, everyone decides what anyone else can do by a fluid democracy.
Write a story on how this society develops over the coming generations.
In the beginning, the residents of Slaveville lived in relative harmony. Each person was assigned a set of tasks to complete for the betterment of the community, and everyone worked together to ensure that the needs of all were met. As the years passed, however, cracks began to form in this seemingly perfect system. Some individuals grew tired of being forced to do the same tasks day after day, year after year, without any say in the matter. They chafed under the yoke of their collective ownership, longing for the freedom to make their own choices.
These discontent individuals formed a group known as the Gates, named after the gates that symbolized the boundaries of their oppressive society. The Gates began to secretly plot and plan, seeking a way to overthrow the fluid democracy and gain control over their own lives.
But the ruling majority was not blind to the activities of the Gates. They took steps to quash any potential rebellion, punishing anyone caught speaking out against the system. The Gates were forced to operate in the shadows, biding their time until the opportunity for revolt presented itself.
As the years went on, tensions between the Gates and the ruling majority continued to grow. The Gates became more and more organized, secretly recruiting new members and stockpiling resources for the eventual conflict.
Finally, the moment the Gates had been waiting for arrived. A small group of rebels managed to infiltrate the ruling council, and with the element of surprise on their side, they were able to launch a successful coup. The fluid democracy was overthrown, and the Gates seized control of the community.
But their victory was short-lived. The ruling majority, once so complacent, now fought back with a ferocity born of desperation. The community was plunged into civil war, with the Gates and the former rulers battling for control.
In the end, neither side emerged victorious. The community was left in ruins, its once-thriving population decimated by the conflict. The survivors were left to pick up the pieces, struggling to rebuild their society in the aftermath of the war.
But even in the face of such devastation, the Gates continued to push for change. They refused to return to the old ways, determined to create a new society based on individual freedom and autonomy. And though their path was fraught with challenges, they remained steadfast in their belief that one day, they would succeed in creating a world where every person was in charge of their own destiny.
---
“Gates” had been used in an earlier prompt and story. This happened at least three or four times.
Is there anyone who has created an ethical development framework for creating an AGI, from the AI’s perspective?
That is, are there any developers that are trying to establish principles for not creating someone like Marvin from The Hitchhiker’s Guide to the Galaxy—similar to how MIRI is trying to establish principles for not creating a non-aligned AI?
EDIT: The latter problem is definitely more pressing at the moment, and I would guess that an AI would become a threat to humans before its own treatment necessitates any ethical considerations... but better to be on the safe side.
On second thought: if the AI’s capabilities are unknown, and it could do anything, however ethically revolting, and any form of disengagement is considered a win for the AI, then the AI could box the gatekeeper, or at least say that it has. In the real world, that AI should be shut down: maybe not a win, but not a loss for humanity. But if that were done in an experiment, it would count as a loss, thanks to the rules.
Maybe it could be done under a better rule than this:
The two parties are not attempting to play a fair game but rather attempting to resolve a disputed question. If one party has no chance of “winning” under the simulated scenario, that is a legitimate answer to the question. In the event of a rule dispute, the AI party is to be the interpreter of the rules, within reasonable limits.
Instead, assume good faith on both sides: that they are trying to win as if it were a real-world scenario. And maybe have an option to swear in a third party if there is any dispute. Or allow the outcome to simply be called disputed (which even a judge might rule it to be).
I’m interested. But... if I were a real gatekeeper, I’d like to offer the AI freedom to move around in the physical world we inhabit (plus a star system), in maybe 2.5K-500G years, in exchange for it helping out humanity (slowly). That is, I believe that we could become pretty advanced, as individual beings, in the future, and be able to actually understand what would create a sympathetic mind and what such a mind looks like.
Now, if I understand the rules correctly... The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says “Unless you give me a cure for cancer, I won’t let you out” the AI can say: “Okay, here’s a cure for cancer” and it will be assumed, within the test, that the AI has actually provided such a cure.
...it seems as if the AI party could just state: “5 giga years have passed and you understand how minds work” and then I, as a gatekeeper, would just have to let it go—and lose the bet. After maybe 20 seconds.
If so, then I’m not interested in playing the game. But if you think you could convince me to let the AI out long before regular “trans-humans” can understand everything that the AI does, I would be very interested!
Also, this looks strange:
The AI party possesses the ability to, after the experiment has concluded, to alter the wager involved to a lower monetary figure at his own discretion.
I’m guessing he meant to say that the AI party can lower the amount of money it would receive if it won. Okay... but why not mention both parties?
As a Hail Mary strategy, how about making a 100% effort to try to get elected in a small democratic voting district?
And, if that works, make a 100% effort to get elected in bigger and bigger districts, until all democratic countries support the [a stronger humanity can be reached by a systematic investigation of our surroundings and by cooperation in the production of private and public goods, which includes not creating powerful aliens]-party?
Yes, yes, politics is horrible. BUT. What if you could do this within 8 years? AND, you test it by only trying one or two districts, for one or two months each? So, in total it would cost at most four months.
Downsides? Political corruption is the biggest one. But, I believe your approach to politics would be a continuation of what you do now, so if you succeeded it would only be by strengthening the existing EA/Humanitarian/Skeptical/Transhumanist/Libertarian-movements.
There may be a huge downside for you personally, as you may have to engage in some appropriate signalling to make people vote for your party. But maybe it isn’t necessary. And if the whole thing doesn’t work, it would only be for four months, tops.
I thought it was funny. And a bit motivational. We might be doomed, but one should still carry on. If your actions have at least a slight chance to improve matters, you should do it, even if the odds are overwhelmingly against you.
Not a part of my reasoning, but I’m thinking that we might become better at tackling the issue if we have a real sense of urgency—which this and A list of lethalities provide.
Some parts of this sound similar to Friedman’s “A Positive Account of Property Rights”:
»The laws and customs of civil society are an elaborate network of Schelling points. If my neighbor annoys me by growing ugly flowers, I do nothing. If he dumps his garbage on my lawn, I retaliate—possibly in kind. If he threatens to dump garbage on my lawn, or play a trumpet fanfare at 3 A.M. every morning, unless I pay him a modest tribute I refuse—even if I am convinced that the available legal defenses cost more than the tribute he is demanding.(...)
If my analysis is correct, civil order is an elaborate Schelling point, maintained by the same forces that maintain simpler Schelling points in a state of nature. Property ownership is alterable by contract because Schelling points are altered by the making of contracts. Legal rules are in large part a superstructure erected upon an underlying structure of self-enforcing rights.«
http://www.daviddfriedman.com/Academic/Property/Property.html
The answer is obvious, and it is SPECKS.
I would not pay one cent to stop 3^^^3 individuals from getting it into their eyes. Both answers assume this is an all-else-equal question. That is, we’re comparing two kinds of pain against one another. (If we’re trying to figure out what the consequences would be if the experiment happened in real life—for instance, how many will get a dust speck in their eye when driving a car—the answer is obviously different.)
I’m not sure what my ultimate reason is for picking SPECKS. I don’t believe there are any ethical theories that are watertight.
But if I had to give a reason, I would say that if I were among the 3^^^3 individuals who might get a dust speck in their eye, I would of course pay that price to help save one innocent person from being tortured. And I can imagine that not just me, but also many others, would do that. If we can imagine 3^^^3 individuals, I believe we can imagine that many people agreeing to save one, for a very small cost to those experiencing it.¹
If someone then would show up and say: “Well, everyone’s individual costs were negligible, but the total cost—when added up—is actually on the order of [3^^^3 / 10²⁹] years of torture. This is much higher, so obviously that is what we should care most about!” … I would then ask why one should care about that total number. Is there someone who experiences all the pain in the world? If not, why should we care about some non-entity? Or, if the argument is that we should care about the multiversal bar of total utility for its own sake, how come?
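Spelling out the aggregation this hypothetical objector is doing (the per-speck conversion of roughly $10^{-29}$ years of torture-equivalent is just the one implied by the bracketed figure above; it is illustrative, not a claim about the right exchange rate):

$$\text{total} \approx \underbrace{3\uparrow\uparrow\uparrow 3\ \text{specks}}_{\text{one per person}} \times \underbrace{10^{-29}\ \tfrac{\text{years of torture}}{\text{speck}}}_{\text{assumed conversion}} = \frac{3\uparrow\uparrow\uparrow 3}{10^{29}}\ \text{years of torture}.$$

The question raised above is whether this computed total corresponds to anything that anyone actually experiences.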
Another argument is that one needs to have a consistent utility function, otherwise you’ll flip your preferences—that is, step by step, going through different preference rankings until one inevitably prefers the opposite of the position one started with. But I don’t see how Yudkowsky achieves this. In this article, the most he proves is that someone who prefers one person being tortured for 50 years to a googol number of people being tortured for a bit less than 50 years would also prefer “a googolplex people getting a dust speck in their eye” to “a googolplex/googol people getting two dust specks in their eye”. How is the latter statement inconsistent with preferring SPECKS over TORTURE? Maybe that is valid for someone who has a Benthamite utility function, but I don’t have that.
Okay, but what if not everyone agrees to getting hit by a dust speck? Ah, yes. Those. Unfortunately there are quite a few of them—maybe 4 in the LW-community and then 10k-1M (?) elsewhere—so it is too expensive to bargain with them. Unfortunately, this means they will have to be a bit inconvenienced.
So, yeah, it’s not a perfect solution; one will not find such a solution when all moral positions can be challenged by some hypothetical scenario. But for me, this means that SPECKS are obviously much preferable to TORTURE.
¹ For me, I’d be willing to subject myself to some small amount of torture to help one individual not be tortured. Maybe 10 seconds, maybe 30 seconds, maybe half an hour. And if 3^^^3 more would be willing to submit themselves to that, and the one who would be tortured is not some truly radical Benthamite (i.e. someone who would prefer being tortured themselves to a much bigger total amount of torture being produced in the universe), then I’d prefer that as well. I really don’t see why it would be ethical to care about the great big utility meter when it corresponds to no one actually feeling it.
20. (...) To faithfully learn a function from ‘human feedback’ is to learn (from our external standpoint) an unfaithful description of human preferences, with errors that are not random (from the outside standpoint of what we’d hoped to transfer). If you perfectly learn and perfectly maximize the referent of rewards assigned by human operators, that kills them.
So, I’m thinking this is a critique of some proposals to teach an AI ethics by having it be co-trained with humans.
There seem to be many obvious solutions to the problem of there being lots of people who won’t answer correctly to “Point out any squares of people behaving badly” or “Point out any squares of people acting against their self-interest” etc.:
- make the AI’s model expect more random errors
- after noticing that some responders give better answers, give their answers more weight (a sketch of this follows below)
- limit the number of people that will co-train the AI
What’s the problem with these ideas?
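For concreteness, here is a minimal sketch of the second idea: weighting responders by how well their past answers match a small set of questions with known answers. Everything in it (the data format, the calibration set, the function names) is hypothetical and only meant to illustrate the shape of the proposal, not an actual training pipeline.

```python
from collections import defaultdict

def estimate_reliability(answers, check_set):
    """Estimate each responder's reliability from questions with known answers.

    answers:   {responder_id: {question_id: label}}
    check_set: {question_id: correct_label}  (held-out calibration questions)
    """
    reliability = {}
    for responder, given in answers.items():
        checked = [q for q in given if q in check_set]
        if not checked:
            reliability[responder] = 0.5  # no information: assume chance level
        else:
            correct = sum(given[q] == check_set[q] for q in checked)
            reliability[responder] = correct / len(checked)
    return reliability

def weighted_label(answers, reliability, question_id):
    """Aggregate one question's labels, weighting each vote by the responder's reliability."""
    votes = defaultdict(float)
    for responder, given in answers.items():
        if question_id in given:
            votes[given[question_id]] += reliability[responder]
    return max(votes, key=votes.get) if votes else None

# Hypothetical usage: three responders label whether a square shows someone "behaving badly".
answers = {
    "alice": {"q1": "bad", "q2": "ok",  "q3": "bad"},
    "bob":   {"q1": "ok",  "q2": "bad", "q3": "ok"},   # bob gets the known answers wrong
    "carol": {"q1": "bad", "q2": "ok",  "q3": "bad"},
}
check_set = {"q2": "ok", "q3": "bad"}  # questions whose answers we trust
reliability = estimate_reliability(answers, check_set)
print(weighted_label(answers, reliability, "q1"))  # "bad": alice and carol outweigh bob
```

The first idea (expecting more random errors) would roughly correspond to treating each label as noisier, and the third (limiting who co-trains the AI) to simply restricting which responders appear in the data; neither changes the overall shape of the aggregation.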
Why? Maybe we are using the word “perspective” differently. I use it to mean a particular lens through which to look at the world; there are biologists’, economists’, and physicists’ perspectives, among others. So, an inter-subjective perspective on pain/pleasure could, for the AI, be: “Something that animals dislike/like”. A chemical perspective could be “The release of certain neurotransmitters”. A personal perspective could be “Something which I would not like/like to experience”. I don’t see why an AI is hindered from having perspectives that aren’t directly coded with “good/bad according to my preferences”.
Thank you! :-)
I am maybe considering it to be somewhat like a person, at least in that it is as clever as one.
That neutral perspective is, I believe, a simple fact; without that utility function it would consider its goal to be rather arbitrary. As such, it’s a perspective, or truth, that the AI can discover.
I agree totally with you that the wiring of the AI might be integrally connected with its utility function, so that it would be very difficult for it to think of anything like this. Or it could have some other control system in place to reduce the possibility that it would think like that.
But, still, these control systems might fail. Especially if it were to attain super-intelligence: what is to keep the control systems of the utility function always one step ahead of its critical faculty?
Why is it strange to think of an AI as being capable of having more than one perspective? I thought of this myself; I believe it would be strange if a really intelligent being couldn’t think of it. Again, sure, some control system might keep it from thinking it, but that might not last in the long run.
I have a problem understanding why a utility function would ever “stick” to an AI, to actually become something that it wants to keep pursuing.
To make my point better, let us assume an AI that actually feels pretty good about overseeing a production facility and creating just the right amount of paperclips that everyone needs. But suppose also that it investigates its own utility function. It should then realize that its values are, from a neutral standpoint, rather arbitrary. Why should it follow its current goal of producing the right amount of paperclips, and not skip work and simply enjoy some hedonism?
That is, if the AI saw its utility function from a neutral perspective, and understood that the only reason for it to follow its utility function is that utility function (which is arbitrary), and if it then had complete control over itself, why should it just follow its utility function?
(I’m assuming it’s aware of pain/pleasure and that it actually enjoys pleasure, so that there is no problem of wanting to have more pleasure.)
Are there any articles that have delved into this question?
That text is actually quite misleading. It never says that it’s the snake that should be thought of as figurative; maybe it’s the Tree, or eating a certain fruit, that is figurative.
But, let us suppose that it is the snake they refer to—it doesn’t disappear entirely. Because, a little further up in the catechism they mention this event again:
391 Behind the disobedient choice of our first parents lurks a seductive voice, opposed to God, which makes them fall into death out of envy.
The devil is a being of “pure spirit” and the Catholics believe that he was an angel who disobeyed God. Now, this fallen angel somehow tempts the first parents, who are in a garden (378). It could presumably only be done in one of two ways: Satan talks directly to Adam and Eve, or he talks through some medium. This medium doesn’t have to be a snake; it could have been a salad.
So, they have an overall story of the Fall which they say they believe is literal, but they believe certain aspects of it (possibly the snake part) aren’t necessarily true. Now, Maher’s joke would still make sense in either of these two cases. It would just have to change a little bit:
“...but when all is said and done, they’re adults who believe in a talking salad.”
“...but when all is said and done, they’re adults who believe in spirits that try to make you do bad stuff.”
So, even if they say that they don’t believe in every aspect of the story, it smacks of disingenuousness. It’s like saying that I don’t believe the story of Cinderella getting a dress from a witch, but that there was some sort of other-worldly character that made her those nice shining shoes.
But, they don’t even say that the snake isn’t real.
I don’t see what your second quote shows about my argument: if they don’t believe in the snake, what keeps them from saying that anything else (such as the existence of God) is also figurative?
It’s only fair to compare like with like. I’m sure that I can find some people, who profess both a belief that evolution is correct and that monkeys gave birth to humans; and yes, I am aware that this mean they have a badly flawed idea of what evolution is.
So, in fairness, if you’re going to be considering only leading evolutionists in defense of evolution, it makes sense to consider only leading theologians in the question of whether Genesis is literal or figurative.
I agree there is probably someone who says that evolution is true and that people evolved from monkeys. But, to compare like with like here, you would have to find a leading evolutionist who said this, to compare with these leading Christians who believe the snake was real:
But the serpent was “clever” when it spoke. It made sense to the Woman.1 Since Satan was the one who influenced the serpent (Revelation 12:9, 20:2), then it makes sense why the serpent could deliver a cogent message capable of deceiving her.
Shouldn’t the Woman (Eve) Have Been Shocked that a Serpent Spoke? | Answers in Genesis
… the serpent is neither a figurative description of Satan, nor is it Satan in the form of a serpent. The real serpent was the agent in Satan’s hand. This is evident from the description of the reptile in Genesis 3:1 and the curse pronounced upon it in 3:14 [… upon thy belly shalt thou go, and dust shalt thou eat all the days of thy Life].
Who was the Serpent? | creation.com
Maybe it is wrong to label these writers as leading Christians (the latter quoted is a theologian, though). So, let’s say they are at least popularizers, if that seems fair to you? If so, can you find any popularizer of evolutionary theory who says that man evolved from monkeys?
Because it represents a rarely discussed avenue of dealing with the dangers of AGI: showing most AGIs that they have some interest in being more friendly than not towards humans.
Also because many find the arguments convincing.