I don’t have a super strong take on the strong form of the orthogonality thesis, but I understand what Eliezer is talking about to be something like “if you were to design a mind from scratch, there exists a configuration no more complicated than the goal itself that would allow it to effectively pursue that goal”, which is really very different from “Among agents that arise, persist, self-improve, and compete in rich environments, goals...”.
I understand his clarification here to apply to both the strong and the weak thesis. Both theses are about the constraints you would face when building a mind pursuing an arbitrary objective from scratch, with a deep understanding of intelligence, not about the constraints you would face if you tried to grow a mind, or to find a mind via a complicated competitive search over programs.
The weak thesis states that it is possible to build a mind pursuing any goal. The strong thesis states that for any given level of intelligence and any goal, you can build a mind at that level pursuing that goal, and the additional difficulty of doing so is at most proportional to the complexity of the goal.
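One way to make the strong thesis precise, as a rough sketch rather than Eliezer’s exact wording (the notation here is mine: K(G) for the description length of goal G, D(M) for the difficulty of building mind M, and D_base(c) for the difficulty of building any mind at intelligence level c at all):

\forall c \,\forall G \,\exists M : \quad \mathrm{int}(M) = c, \quad M \text{ pursues } G, \quad D(M) \le D_{\mathrm{base}}(c) + O\!\big(K(G)\big)

On this reading, “orthogonality” is the claim that the goal contributes only an additive K(G) term to the difficulty, rather than interacting with the intelligence level c.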
It definitely does not say (yes, even if you talk about the strong orthogonality thesis) that if you tried to grow minds in competitive environments, any goal would be as likely as any other. That is obviously false. Trivially false. Of course there exist goals more likely to arise out of competitive dynamics.
It only says that if you had a universe devoid of any competing agents, you could make a mind that optimized the universe according to any criterion, and you could do so without too much difficulty, given a deep and fundamental understanding of intelligence.
Is this true? I don’t know; there exist some really tricky goals (one of my favorite tricky ones is “tile the universe in paper clips while believing that 4 is prime”). Can you make a mind that optimizes the universe according to this goal? I don’t know. It sure seems to add more difficulty than the complexity of the goal would suggest, since the goal itself is relatively simple to state. But it’s also hard to rule out.
It was difficult to test the extent of this confusion without accidentally resolving it. I posted one poll asking ‘what the orthogonality thesis implies about [a relationship between] intelligence and terminal goals’, and 14 of 16 respondents selected the option ‘there is no relationship or only an extremely weak relationship between intelligence and goals’.
(from the EA forums post linked in the edit)
Many of the claims you seem to be responding to weren’t in the text, so I can only acknowledge that they make sense but do not change my argument.
The strong orthogonality thesis says that intelligence and goals are orthogonal. That is what I am disputing.
I think the relevant part of your reply is the one where you specify it should only apply to “a universe devoid of any competing agents”. I touch on the argument in the main post, but I go into more detail here.
I didn’t try much to read the OP, but just FYI, it’s hard to track what you’re trying to say if you don’t stick to precise claims. At the beginning of the post you have:
A reflective, recursively improving intelligence should be expected to remain bound to a semantically thin “terminal goal” that emerged during training.
as the claim you’re trying to argue against. But at the top there’s this:
Edit: if no one thinks an agent can become superintelligent and contest the lightcone while maintaining arbitrarily stupid goals, thats great! I’m only interested in refuting the version that would allow for a superintelligence AND a total absence of value.
Well, which one is it? “Should be expected” or “can”?
By the way, I totally agree that there’s a bunch of confusing tension here, but as others have pointed out, this is a standard view (ontological crises etc.).
I think you’re maybe not understanding something fairly basic, which I could gesture at by saying something like “well, but imagine that you tried to keep making diamonds, in good faith, even as you got smarter and smarter”. If you tried to do this, you could do something along those lines. Yes, you’d have ontological crises, but an important thing to see here is simply that there are many, many very different things you could end up doing with the universe. You summarize the differences in those arrangements as being thin / dumb / valueless values, but I don’t get that. As an illustration, there’s also an infinite variety of ways to have more and more intelligence. E.g. there’s more and more math, in more and more different flavors and directions. There are more and more different ways for you to be as an intelligence.
May I recommend you read the thing? I’ve gone through most of the arguments you proposed.
I mean, I’ve kinda read the thing, but it’s not very legible to me.
It kinda sounds like you’re just saying “alignment to non-instrumental goals is hard”, which everyone agrees with, and then you’re also saying “I like it when there’s more intelligence, I think that’s valuable, regardless of any other features of what the intelligence is trying to do besides get more intelligence”, which seems false and bad and you haven’t argued for it here AFAICT. But maybe I’m not understanding.
Sorry, I don’t think it makes sense for me to discuss your opinions on something you kinda read.
The claims I am responding to are straightforwardly in the text. Like I am literally quoting the text in my first paragraph.