I’m bumping into walls, but hey, now I know what the maze looks like.
Neil
AI could eliminate us in its quest to achieve a finite end, and would not necessarily be concerned with long-term personal survival. For example, if we told an AI to build a trillion paperclips, it might eliminate us in the process, then stop at a trillion and shut down.
Humans don’t shut down after achieving a finite goal because we are animated by so many self-editing finite goals that there is never a moment in life where we go “that’s it, I’m done”. It seems to me that general intelligence does not seek a finite, measurable and achievable goal but rather a mode of being of some sort. If this is true, then perhaps AGI wouldn’t even be possible without the desire to expand, because a desire for expansion may only come with a mode-of-being-oriented intelligence rather than a finite, reward-oriented intelligence. But I wouldn’t discount the possibility of a very competent narrow AI turning us into a trillion paperclips.
So narrow AI might have a better chance of killing us than AGI. The Great Filter could be misaligned narrow AI. This confirms your thesis.
What is magic?
Presumably we call whatever we can’t explain “magic” before we understand it, at which point it becomes simply a part of the natural world. This is what many fantasy novels fail to account for: if we actually had magic, we wouldn’t call it magic. There are thousands of things in the modern world that would definitely meet the criteria for magic of a person living in the 13th century.
So we do have magic; but why doesn’t it feel like magic? I think the answer lies in how evenly distributed our magic is. Almost everyone in the world benefits from the magic that is electricity; it’s so common and so many people have it that it isn’t considered magic. Because everyone has it, it’s no more impressive than an eye or an opposable thumb. In fantasy novels, the magic tends to be concentrated in a single caste of people.
Point being: if everyone were a wizard, we wouldn’t call ourselves wizards, because wizards are more magical than the average person by definition.
Entropy dictates that everything will be more or less evenly distributed, and so worlds from fantasy books are very unlikely to appear in our universe. Magic as I’ve loosely defined it here does not exist, and it is freakishly unlikely ever to. We can dream though.
How much information do you think is present in daily language? Can you give me specific examples?
You may be making a similar point to George Orwell and his Newspeak in 1984, that language ultimately decides what you can think about. In that case, languages may carry a lot of information about cultural values.
I’m not sure. My hunch is that yes, it’s possible to learn a language without learning too much about the values of those who speak it. I don’t think Germans engage in Schadenfreude more than other cultures do, and I don’t think the French experience more naïveté. They just have words for it and we don’t.
What do you mean? What I read is: magic is subjective, and since the human brain hasn’t changed in 200,000 years, nothing will ever feel like magic. I’m not sure that’s what you meant, though; could you explain?
Our beautiful bastard language!
I’m still new to this, but I can say I love a culture where there is a button for retracting statements without deleting them. I will most likely have to use it a lot as I progress around here.
Eliezer Yudkowsky is kind of a god around here, isn’t he?
Would you happen to know what percentage of total upvotes on this website are attributed to his posts? It’s impressive how many good ideas, written in clear form, he’s had to come up with to reach that level. Cool and everything, but isn’t it ultimately proof that LessWrong is still in its fledgling stage (which it may never leave), as it depends so much on the ideas of its founder? I’m not sure how one goes about this, but expanding the LessWrong repertoire in a consequential way seems like a good next step for LessWrong. Perhaps that includes changing the posts in the Library… I don’t know.
Anyhow thanks for this comment, it was great reading!
Right, but if LessWrong is to become larger, it might be a good idea to stop leaving his posts as the default (the Library, the ones recommended on the front page, etc.). I don’t doubt that his writing is worth reading and I’ll get to it; I’m just offering an outsider’s view on this whole situation, which seems a little stagnant to me in a way.
That last reply of mine, a reply to a reply to a Shortform post I made, can be found after just a little scrolling on the main page of LessWrong. I should be a nobody to the algorithm, yet I’m not. My only point is that LessWrong seems big because it has a lot of posts, but it isn’t growing as much as it should be. That may be because the site is too focused on a single set of ideas, and that shoos some people away. I think it’s far from being an echo chamber, but it’s not as lively as I would think it should be.
As I’ve noted though, I’m a humble outsider and have no idea what I’m talking about. I’m only writing this because outsider advice is often valuable, since there’s no chance of getting trapped in echo thinking at all.
Well, by that logic Germans may experience more Schadenfreude, which would presumably mean there is more Schadenfreude going on in Germany than elsewhere, so I don’t think your point makes sense. You only need a word for something if it exists, especially if it’s something you encounter a lot.
It may also be possible that we use facsimiles of words by explaining their meaning with whole sentences, and only occasionally stumble upon a word that catches on and elegantly encapsulates the concept we want to convey (like “gaslighting”). It may be a matter of probability, and it may not matter much that our language is not as efficient as it could be.
It could also be that most languages can convey 99% of the things our modern world needs them to convey, and that we are simply hung up on the rare exceptions (like Schadenfreude or je ne sais quoi). If that hypothesis is true, then language does not carry much information about cultural values.
To elaborate on your idea here a little:
It may be that the only way to be truly aware of the world is to have complex and fragile values. Humans are motivated by a thousand things at once, and that may give us the impression that we are not agents moving from a clearly defined point A to point B, as AI in its current form is, but are rather just… alive. I’m not sure how to describe that. Consciousness is not an end state but a mode of being. This seems to me like a key part of the solution to AGI: aim for a mode of being, not an end state.
For a machine whose only capability is to move from point A to point B, adding a thousand different, complex, and fragile goals may be the way to go. As such, solving AGI may also solve most of the alignment problem, so long as the AI’s specific cocktail of values is not too different from the average human’s.
In my opinion there is more to fear from highly capable narrow AI than there is from AGI, for this reason. But then I know nothing.
Superintelligence will outsmart us or it isn’t superintelligence
We can’t negotiate with something smarter than us
Superintelligence will outsmart us or it isn’t superintelligence. As such, the kind of AI that would truly pose a threat to us is also an AI we cannot negotiate with.
No matter what arguments we make, superintelligence will have figured them out first. We’re like ants trying to appeal to a human, except the human can understand our pheromones while we can’t understand human language. It’s entirely up to the human and its own arguments whether we get squashed or not.
Worth reminding yourself of this from time to time, even if it’s obvious.
Counterpoints:
It may not take a true superintelligence to kill us all, meaning we could perhaps negotiate with a pre-AGI machine
The “we cannot negotiate” part does not take into account the fact that we are the Simulators and thus technically have ultimate power over it
Haha, I don’t know what this post did to deserve −7 Karma, but if somebody could explain I’d be really grateful. Since there is apparently no “I disagree with the contents” button on regular posts, does this mean I should assume the downvotes are from people who disagree with me? Or is my logic fundamentally flawed, breaking a few rules of rationality? Criticism would be great, even just a few lines of explanation. Thanks!
Well there’s always value in cramming old ideas into a small amount of words.
You’re right that I should have aimed for something more interesting and novel, but I’m still experimenting with LessWrong and went with this for now. Thanks for the comment, I’ll keep this in mind for next time.
If you are too stressed, walk away from the front lines
All the obvious alternate routes to participating in alignment work seem to have been mentioned here. Are there any more I should write down? I’m aware this is a flawed post and would like to make it more complete as time goes on.
The passage on “you are responsible for the entire destiny of the universe” was mostly addressing the way it seems many EAs feel about the nature of responsibility. We indeed have limited agency in the world, but people around here tend to feel they are personally responsible for literally saving the world alone. The idea was not to deny that outright or to argue against heroic responsibility, but rather to say that while the responsibility won’t go away, there’s no point in becoming consumed by it. You are a less effective tool if you are too heavily burdened by responsibility to function properly. I wrote it that way because I’m hoping the harsh, utilitarian tone will reach the target audience better than something more clichéd would. There’s enough romanticization as it is here.
I definitely romanticized the part about alignment researchers being heroes. I’ll add a disclaimer to mention that the choice of words was meant to paint the specific approach, the specific picture, that up-and-coming alignment researchers might have when they arrive here.
As for which narrative to follow, this one might be as good as any. As the mental health post I referenced here mentioned, the “dying with dignity” approach Eliezer is following might not sit well with a number of people, even when it is in line with his own predictions. I’m not sure to what degree what I described is a fantasy. In a universe where alignment is solved, would this picture be inaccurate?
Thanks for the feedback!
The point of the post was to not lionize them over everyone else. The target audience I had in mind (which may not even exist at this point) was people who wanted to become alignment researchers because that’s where the front lines are. My point is that that may not be the best idea in some cases. At the end of the day, if we solve the alignment problem it will be directly thanks to those researchers; that’s what I mean.
As for the politics thing, that’s interesting; I hadn’t thought of it horribly backfiring in that way. I mean, the goal would be to explain to them why alignment is necessary, which shouldn’t be an impossible task. There’s a lot of legal and economic power coming from the government, so just ignoring that actor seems like a mistake.
Thanks for the feedback!
Interesting question. The implied question might be “how much of this post was written for you?” and the answer would be “probably a lot”. I don’t think I have the mind or time or stamina to work on the front lines, so for now my most concrete plan is writing a few more LessWrong posts based on various helpful-seeming ideas. This post outlined a few options that I, and others in the same position as me, have. Do you have any more ideas?
Another possibility is that AI wipes us out and is also not interested in expansion.
Since expansion is something inherent to living beings, and AI is a tool built by living beings, it wouldn’t make sense for its goals not to include expansion of some kind (i.e. it would always look at the universe with sighing eyes, thinking of all the paperclips it represents). But perhaps in an attempt to keep AI in line somehow we would constrain it to a single stream of resources? In which case it would not be remotely interested in anything outside of Earth?
It is probably possible to encode no desire at all for immortality or expansion into an AI. Which means that there could be millions of AIs out there, built before any Dyson Spheres, just sitting there forever as their home worlds die. A pretty chilling thought, actually. Also rather comical.