The agent cares about its goals, and ignores the objective norms.
If you know what you must do
There is no “must”, there is only “should”. And even that only if there is an objective norm; otherwise there is not even a “should”, only a “want”.
Again, Satan in Christianity. He knows what is “right”, does the opposite, and does it effectively. His intelligence is used to achieve his goals, regardless of what is “right”.
Intelligence means being able to figure out how to achieve what one wants. Not what one “should” want.
Imagine that somehow science proves that the goal of this universe is to produce as many paperclips as possible. Would you feel compelled to start producing paperclips? Or would you keep doing whatever you want, and let the universe worry about its goals? (Unless there is some kind of God who rewards you for the paperclips produced and punishes you if you miss the quota. But even then, you are doing it for the rewards, not for the paperclips themselves.)
The AGI (or a human) can ignore the threats… and perhaps perish as a consequence.
General intelligence does not mean never making a strategic mistake. Also, from the AGI’s value perspective, continuing whatever it is doing now could be more important than surviving.
Makes sense, but wouldn’t this also result in even fewer replications (as a side effect of doing less superfluous work)?
I was going to ask for interesting examples. But perhaps we can do even better and choose examples with the highest value of… uhm… something.
I am just wildly guessing here, but it seems to me that if these features are somehow implied by the human text, the ones that are “implied most strongly” could be the most interesting ones. Unless they are just random artifacts of the process of learning.
If we trained the LLM using the same text database, but randomly arranged the sources, or otherwise introduced some noise, would the same concepts appear?
Are you perhaps using “intelligence” as an applause light here?
To use a fictional example, is Satan (in Christianity) intelligent? He knows what the right thing to do is… and chooses to do the opposite. Because that’s what he wants to do.
(I don’t know the Vatican’s official position on Satan’s IQ, but he is reportedly capable of fooling even very smart people, so I assume he must be quite smart, too.)
In terms of artificial intelligence, if you have a super-intelligent program that can provide answers to various kinds of questions, for any goal G you can create a robot that calls the super-intelligent program to figure out what actions are most likely to achieve G, and then performs those actions. Nothing in the laws of physics prevents this.
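A minimal sketch of this wrapper argument in Python, under the assumption that such an oracle exists; every name here (`superintelligent_oracle`, `make_agent`) is a hypothetical placeholder, not a real API. The point is only that the goal is an interchangeable parameter, separate from the intelligence that optimizes for it:

```python
# A minimal sketch of the "oracle + wrapper" argument; all names are
# hypothetical placeholders, and the oracle is deliberately a stub.

def superintelligent_oracle(question: str) -> str:
    """Stands in for an arbitrarily capable question-answering program."""
    raise NotImplementedError  # assumed to exist; not implemented here

def make_agent(goal: str):
    """Wrap the oracle so its intelligence serves an arbitrary goal G."""
    def act(observation: str) -> str:
        # The goal is just a parameter; the oracle's capability is the
        # same regardless of which goal is plugged in.
        question = (
            f"Given the observation {observation!r}, "
            f"which action is most likely to achieve the goal {goal!r}?"
        )
        return superintelligent_oracle(question)
    return act

# The same oracle wrapped around two very different goals:
paperclip_maximizer = make_agent("produce as many paperclips as possible")
helpful_assistant = make_agent("be maximally helpful to humans")
```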
The orthogonality thesis is not about the existence or nonexistence of “objective norms/values”, but about whether a specific agent could have a specific goal. The thesis says that for any specific goal, there can be an intelligent agent that has that goal.
To simplify it, the question is not “is there an objective definition of good?” where we probably disagree, but rather “can an agent be bad?” where I suppose we both agree the answer is clearly yes.
More precisely, “can a very intelligent agent be bad?”. Still, the answer is yes. (Even if there is such a thing as “objective norms/values”, the agent can simply choose to ignore them.)
The first two points… I wonder what the relation is between “prestigious university” and “quality of your peers”. It seems like they should be positively correlated, but maybe there is some caveat about quality not being one-dimensional; like maybe rich people go to university X, but technically skilled people to university Y.
The third point, I’d say be aware of the distinction between the things you care about, and the things you have to do for bureaucratic reasons. There may or may not be an overlap between the former and the school lessons.
The fourth and seventh points are basically: some people give bad advice; and for anything you could possibly do, someone will find a rationalization why that specific thing is important (if everything else fails, they can say it makes you more “well-rounded”). But “skills that develop value” does not say how to choose, e.g., between a smaller value now and a greater value in the future.
The fifth point depends on what kind of job/mentor you get. It could be much better or much worse than school, and it may be difficult to see the difference; there are many overconfident people giving wrong advice in the industry, too.
The sixth point: clearly, getting fired is not an optimal outcome; and if you do not need to complete school, what are you even doing there?
I think we probably agree on how far the existing system is from the ideal. I wanted to point at the opposite end of the scale as a reminder that we are even further away from that.
When I was in the first grade of elementary school, they tried to teach us about “sets”, which mostly meant that instead of “two plus two equals four” the textbook said “the union of a set containing two elements and another set containing two elements has four elements”. In hindsight I see this was a cargo-cultish version of set theory, which was probably very high-status at that time. I also see that from the perspective of set theory as the set theorists know it, this was quite useless. Yes, we used the word “set” a lot, but it had little in common with how set theorists think about sets. Anyway, we learned addition and subtraction successfully, albeit with some extra verbal friction.
Compared to that, when I tried to learn something in my free time as a teenager, people around me recommended that I read books by Däniken, the Silva method of mind control, Moody’s Life After Life, religious literature, books on meditation, and other shit. I spent a lot of time practicing “altered states of consciousness”, because (from the perspective of a naive teenager who believed that the adults around him were not utter retards, and that the people they considered high-status were not all lying scumbags) it seemed like a very efficient intervention. I mean, if you get the supernatural skills first, they will give you a huge multiplier to everything you try doing later, right? Haha, nope.
So while I hate school with a passion, as many people on Less Wrong do, the alternative seems much worse. Even the books I study in my free time now were often written in the context of the educational system, or by the people employed by the educational system.
I don’t trust societal consensus at all. Look at the YouTube videos about quantum physics; 99% of them are some crap like “quantum physics means that the human mind has a mystical power over matter”. Even if you limit yourself to seemingly smart people, half of them believe that IQ isn’t real because Nassim fucking Taleb said so. Half of popular science does not replicate.
I suspect this may actually be the most important thing the educational system does.
You can learn from books or online videos. You can find fellow learners on social networks. You can find motivation… at random places.
But without being shown a direction, you will probably get lost in a sea of nonsense. Simple advice, such as “chemistry is the thing you should study, not alchemy”, can save you decades of time you might otherwise waste learning useless things.
It is easy to notice the damage the school system does, and easy to take its benefits for granted. Even if you are homeschooled, you are still exposed to people who got the right directions. (People from crazy religious families may be prevented from getting the right direction in e.g. biology or sex education, but they will still probably get the right directions about math or chemistry or geography.) This did not happen spontaneously. It was the educational system that redirected billions of people from thinking about superstitions and magic and astrology and homeopathy and whatever else, towards thinking about math and physics and chemistry and geography and history. Even if for many people the success is only partial, the fact that they even know about the useful stuff means a lot.
In today’s world, you can become good at math without spending a single day at school. But in a world where everyone in the last three generations was an autodidact, you most likely wouldn’t be good at math, because you most likely wouldn’t even know that there was such a thing as math. (Unless you were lucky enough to be born in a family of mathematicians.) Instead you would spend your time learning… some nonsense. Some difficult nonsense that requires a high IQ and a lot of studying to get it impressively right. Just like Newton spent half of his life doing alchemy.
Autodidacts are easily recognized by their unknown unknowns. They may know a lot, but what they don’t know, they usually don’t even know that it exists.
If the teacher… well, it doesn’t matter; the students not coming back is the expected outcome.
But I understand you choosing to be cooperative instead.
From inside it doesn’t feel like much of a choice. We are what we are, and I suspect that our reasoning and ethics are mostly post-hoc justifications for doing what we wanted to do anyway.
(You can choose which personality traits you practice, and what kind of people you hang out with, and both of these will change you, but I suspect that these changes are relatively small and impermanent. At least I keep finding myself reverting to my old values years and decades later.)
I am a natural born cooperator. (Too bad I am not also extraverted; that could be a powerful combination, I think.) I can compete, and I had some big individual wins, but in the long term it makes me tired, mentally. Contributing to other people’s projects gives me emotional energy; it is the combination of “doing something useful” and “not being ultimately responsible” that stimulates my creativity. Overcoming adversity, on the other hand, just makes me feel “I am happy that I did it anyway, but it was horrible and I am so happy that the game is finally over”. (Some people say they value success more when they had to overcome obstacles. In my math, value = success minus obstacles. You know, like “profit = income minus expenses”, not plus.)
I translate other people’s books, but don’t write my own; I comment on other people’s blogs, while my own blog only gets one or two articles a year. I am not making any sacrifices or exercising any self-control to cooperate—from my perspective, I am choosing the easy way.
And when I look at other people, I am usually surprised how little they cooperate; it seems like there are tons of unpicked low-hanging fruit that people ignore, because if they can’t grab the whole pie, they would rather let the world burn. It sometimes feels like I am already too cynical, and then again I learn that I was still too optimistic. (It’s like learning that sometimes people kill others for money. You think this makes you understand the dark side of human nature. And then you learn that someone murdered someone for $10, and you go WTF, because you assumed that “money” refers to maybe millions, not this little. Like, WTF, if someone asked me nicely for $10, there is a chance I would give it to them, so why would anyone kill a human being for that? And then you think that now you finally understand the dark side of human nature… until the next day you learn that someone else killed a whole group of people for a mere 10 cents. I am exaggerating here, but this is how the world sometimes looks from my perspective.)
100 clones of me would probably form a cult, start optimizing the neighborhood and gradually the rest of the universe, and… dunno, probably would have some systemic weakness that would ruin them; otherwise evolution would already have made more people like that, I suppose.
But this is the natural, easy way for me, and it wouldn’t work for others. We play with the cards we were dealt. What makes you happy and full of energy is probably the right way for you. (Plus some learning, of course. Each strategy can be played with lesser or greater skill. And everyone better learn about their weaknesses and how to compensate for them.)
I think this may depend a lot on the nature of the goal you want to achieve: how much it is about “X should be done” (cooperative) and how much about “it’s me who should do X” (competitive). Sharing the idea “X may be easier than it seems” would be good for the former goal, and bad for the latter. I think there is a saying that you can achieve much more if you stop caring about who takes the credit.
Most of my goals are of the type “X should be done”, I live in a sufficiently free country, I know people who are more agenty than me, therefore sharing information seems like the right move—it increases the chances that someone may do X (whether they invite me to their team or not).
But at least once I did something adversarial and dangerous. I didn’t tell anyone about my plans; and even years after it was done, I have told maybe three people about it.
Also, I don’t do anything “big” in politics or business where I would have to worry about my prestige, collect credit for things done, deflect things that look bad, etc.
In general, my approach to information security is to assume that any information will leak one day, especially if someone really cares, so the best I can do is create a “trivial inconvenience” for the attackers. That said, a trivial inconvenience can deflect most potential attackers who don’t deeply care, so it is worth doing. I do not use my full name online, I occasionally delete my old posts on social networks, etc.
Also, the older I get, the less I give a fuck, because the probability that one day I will do something important gets smaller.
Maybe just the “original research” of some Wikipedia editor? The Wikipedia page says “clarification needed”.
Or maybe some important context is missing, for example that young boys were socially discouraged from having sex with girls, which would make sense from the selfish perspective of the older men (for whom having heterosexual sex was considered okay, because [insert culturally approved rationalization]).
I’ve spent endless hours on message boards for [...] Slatestarcodex readers; neither of which ever really discuss [...] Scott Alexander’s writing, but rather, are just hubs for the types of oddballs who like [...] Slatestarcodex to talk about stuff that people with these personality traits like to talk about.
From the other side, this probably also explains why I don’t like the SSC/ACX related message boards.
ACX has a much wider audience than LW, so “the kind of person who reads ACX” reduces to something like “an intelligent person who identifies as a contrarian and enjoys reading long texts”, which may be a group that happens to include me, but it also includes many people I prefer to avoid.
I like the fact that Scott writes about different topics, but the downside is that now none of those topics works as a hard filter. For example, whenever Scott directly or indirectly mentions effective altruism, some people are going to write in the comments how the entire idea is stupid. (That irritates me a lot; even if I am not an EA myself, that doesn’t mean I am a fan of people conspicuously talking smack about altruism in general.) So why do they keep reading the blog? Because there are also many articles on other interesting topics. So if you visit the message board, you will still find those people, but you won’t find Scott there to balance their negativity.
Offline ACX meetups are okay though. Apparently being able to walk away from the computer is a hard filter.
I think the people who say such things don’t really care, and would probably include your advice in the list of quotes they consider funny. (In other words, this is not a “mistake theory” situation.)
EDIT:
The response is too harsh, I think. There are situations where this is useful advice. For example, if someone is acting under peer pressure, then telling them this may provide a useful outside view. As Asch’s conformity experiment teaches us, the first dissenting voice can be extremely valuable. It just seems unlikely that this is robosucka’s case.
The traditional technology used for similar purposes in some cultures is alcohol. The idea is that while alcohol impairs thinking, it impairs the ability to lie convincingly even more. Especially considering that even if one drunk person successfully lies to another drunk person, the next day the other person can reflect on the parts they remember with a sober mind.
Thus, alcohol is an imperfect lie detector with a few harmful side effects; and in cultures where it is popular, groups of friends do this together, and conspicuously avoiding it will provide evidence against your sincerity.
If I were ever unsure whether I could trust the word of a friend on an important matter, I’d think that would represent deeper issues than a mere lack of information a scan of their brain could provide.
Friendships exist on a scale. If you switch from “a stranger” to “100% trusted person” too quickly, you probably have some unpleasant surprises waiting for you in the future. Also, friendship is not transitive, and sometimes you need to know whether you can trust a friend of a friend (even when your friend says “yes”). I know some people whom I trust, but I definitely do not trust their judgment about other people.
Here is another explanation, kind of:
The Taylor expansion of 1/(1+x)^2 is 1 − 2x + 3x^2 − 4x^3 + 5x^4…
When x = 1 (substituting formally; the series does not actually converge there), it means that 1 − 2 + 3 − 4 + 5… = 1/4
But 1 − 2 + 3 − 4 + 5… can be written as 1 + 2 + 3 + 4 + 5… − 2×2 − 2×4 − 2×6…
= 1 + 2 + 3 + 4 + 5… − 2×(2 + 4 + 6…)
= 1 + 2 + 3 + 4 + 5… − 2×2×(1 + 2 + 3…)
= (1 − 2×2) × (1 + 2 + 3…)
= −3 × (1 + 2 + 3…)
So if 1 − 2 + 3 − 4 + 5… = 1/4, we get:
1/4 = −3 × (1 + 2 + 3…)
−1/12 = 1 + 2 + 3…
(Found here.)
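For readability, here is the same derivation restated as a worked equation in LaTeX; the manipulations of the divergent series are formal (Abel-summation style), exactly as in the plain-text version above:

```latex
% The derivation above, restated compactly. All manipulations of the
% divergent series are formal, as in the original comment.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
\begin{align*}
\frac{1}{(1+x)^2} &= 1 - 2x + 3x^2 - 4x^3 + \cdots
  && \text{(Taylor series, valid for } |x| < 1\text{)} \\
1 - 2 + 3 - 4 + \cdots &= \tfrac{1}{4}
  && \text{(formally setting } x = 1\text{)} \\
1 - 2 + 3 - 4 + \cdots &= (1 + 2 + 3 + \cdots) - 2\,(2 + 4 + 6 + \cdots) \\
  &= (1 + 2 + 3 + \cdots) - 4\,(1 + 2 + 3 + \cdots) \\
  &= -3\,(1 + 2 + 3 + \cdots) \\
\therefore\quad 1 + 2 + 3 + \cdots &= -\tfrac{1}{12}
\end{align*}
\end{document}
```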
Yeah, I think it is okay to simplify things when someone puts an explicit disclaimer like “this is a simplification” or “this is not literally true, but it is an attempt to point in a certain direction”.
But without such a disclaimer, I will assume “once clickbait, always clickbait”, especially when the priors on people being stupid on the internet are so high.
Spam (Google Translate)