[I’ll keep going since this seems important, though sort of obscured/slippery; but feel free to duck out.]
Mmm, there’s an ambiguity in the word “our” that I think is creating confusion.
I think there’s also ambiguity in “job”. I think it makes sense for it to be “up to” Eliezer in the sense of being Eliezer’s role (role as in task allocation, not as in Should-field, and not as in performance in a play).
is responsible for sorting out AI risk
Like, I think I heard the OP as maybe saying “giving and taking responsibility for AI alignment is acting at the wrong level”, which is ambiguous because “responsibility” is ambiguous; who is taking whom to be responsible, and how are they doing that? Are they threatening punishment? Are they making plans on that assumption? Are they telling other people to make plans on that assumption? Etc.
I think we agree that:
Research goes vastly or even infinitely better when motivated by concrete considerations about likely futures, or by what is called curiosity.
Doubling down on Shoulds (whether intra- or inter-personal) is rarely helpful and usually harmful.
Participating in Shoulds (giving or receiving) is very prone to be or become part of an egregore.
I don’t know whether we agree that:
There is a kind of “mental ambition” which is the only thing that has a chance at crossing the gap to real power from where any of us is, however utterly fucking gargantuan that gap may be.
There is a way of being scared about the problem (including how big and important it is, though not primarily in those words) that is healthy and a part of at least one correct way of orienting.
Sometimes “Damn, I should exercise” is what someone says when they feel bloopiness in their body and want to move it, but haven’t found a fun way to move their body.
It’s not correct that “Sorting out AI alignment in computers is focusing entirely on the endgame. That’s not where the causal power is.”, because ideas are to a great extent had by small numbers of people, and ideas have a large causal effect on what sort of control ends up being exercised. I could interpret this statement as a true proposition, though, if it’s said to someone (and implicitly, about just that person) who is sufficiently embedded in an egregore that they just can’t feasibly aim at the important technical problems (which I think we’d agree is very common).
If the whole world were only exactly 90% unified on AI alignment being an issue, it would NOT just be a problem to solve. That is, it would still probably spell doom, if the other 10% are still incentivized to go full steam ahead on AGI, and the technical problem turns out to be really hard, and the technical problem isn’t something that can be solved just by throwing money and people at it.
A top priority in free-willing into existence “The kind of math/programming/etc. needed to solve it [which] is literally superhuman”, is to actually work on it.
…I think I heard the OP as maybe saying “giving and taking responsibility for AI alignment is acting at the wrong level”, which is ambiguous because “responsibility” is ambiguous; who is taking whom to be responsible, and how are they doing that? Are they threatening punishment? Are they making plans on that assumption? Are they telling other people to make plans on that assumption? Etc.
Ah!
No, for me, responsibility is a fact. Like asking who has admin powers over these posts.
This isn’t a precise definition. It’s a very practical one. I’m responsible for what’s in range of my capacity to choose. I’m responsible for how my fingers move. I’m not responsible for who gets elected POTUS.
In practice people seem to add the emotional tone of blame or shame or something to “responsibility”. Like “You’re responsible for your credit score.” Blame is a horrid organizing principle and it obfuscates the question of who can affect what. Who is capable of responding (response-able), as opposed to forced to mechanically react?
Stupefaction encourages this weird thing where people pretend they’re responsible for some things they in fact cannot control (and vice versa). My comment about exercise is pointing at this. It’s not that using inner pain can’t work sometimes for some people. It’s that whether it can or can’t work seems to have close to zero effect on whether people try and continue to try. This is just bonkers.
So, like, if you want to be healthy but you can’t seem to do the things you think make sense for you to do, “be healthy” isn’t your responsibility. Because it can’t be. Out of your range. Not your job.
Likewise, it’s not a toddler’s job to calm themselves down while they’re having a meltdown. They can’t. This falls on the adults around the toddler — unless those adults haven’t learned the skill. In which case they can’t be responsible for the toddler. Not as a judgment. As a fact.
Does that clarify?
I think we agree that:
Research goes vastly or even infinitely better when motivated by concrete considerations about likely futures, or by what is called curiosity.
Doubling down on Shoulds (whether intra- or inter-personal) is rarely helpful and usually harmful.
Participating in Shoulds (giving or receiving) is very prone to be or become part of an egregore.
Basically yes.
Nuance: Participating in “should”s is very prone to feeding stupefaction and often comes from a stupefying egregore. More precise than “be or become part of an egregore”.
But the underlying tone of “Participating in ‘should’s is a bad idea” is there for sure.
There is a kind of “mental ambition” which is the only thing that has a chance at crossing the gap to real power from where any of us is, however utterly fucking gargantuan that gap may be.
Depends on what you mean by “ambition”.
I do think there’s a thing that extends influence (and in extreme cases lets individuals operate on the god level — see e.g. Putin), and this works through the mind for sure. Sort of like working through a telescope extends your senses.
There is a way of being scared about the problem (including how big and important it is, though not primarily in those words) that is healthy and a part of at least one correct way of orienting.
Yes, as literally stated, I agree.
I don’t think most people have reliable access to this way of being scared in practice though. Most fear becomes food for unFriendly hypercreatures.
Sometimes “Damn, I should exercise” is what someone says when they feel bloopiness in their body and want to move it, but haven’t found a fun way to move their body.
Agreed, and also irrelevant. Did my spelling out of responsibility up above clarify why?
It’s not correct that “Sorting out AI alignment in computers is focusing entirely on the endgame. That’s not where the causal power is.”, because ideas are to a great extent had by small numbers of people, and ideas have a large causal effect on what sort of control ends up being exercised. I could interpret this statement as a true proposition, though, if it’s said to someone (and implicitly, about just that person) who is sufficiently embedded in an egregore that they just can’t feasibly aim at the important technical problems (which I think we’d agree is very common).
I don’t quite understand this objection. I think you’re saying it’s possible for one or a few individuals to have a key technical idea that outwits all the egregores…? Sure, that’s possible, but that doesn’t seem like the winning strategy to aim for here by a long shot. It seemed worth trying 20 years ago, and I’m glad someone tried. Now it’s way, way more obvious (at least to me) that that path just isn’t a viable one. Now we know.
(I think we knew this five years ago too. We just didn’t know what else to do and so kind of ignored this point.)
If the whole world were only exactly 90% unified on AI alignment being an issue, it would NOT just be a problem to solve. That is, it would still probably spell doom, if the other 10% are still incentivized to go full steam ahead on AGI, and the technical problem turns out to be really hard, and the technical problem isn’t something that can be solved just by throwing money and people at it.
Yeah, I think we just disagree here. Where are those 10% getting their resources from? How are they operating without any effects that the 90% can notice? What was the process by which the 90% got aligned? I have a hard time imagining a plausible world here that doesn’t just pull the plug on that 10% and either persuade them or make them irrelevant.
Also, I do think that 90% would work on the technical problem. I don’t mean to say no one would. I mean that the technical problem is downstream of the social one.
A top priority in free-willing into existence “The kind of math/programming/etc. needed to solve it [which] is literally superhuman”, is to actually work on it.
Sure. I’m not saying no one should work on this. I’m saying that these calls to collective action to work on it without addressing the current hostile superintelligences hacking our minds and cultures are just ludicrous.
It clarifies some of your statements, yeah. (I think it’s not the normal usage; related to but not equal to blame, there are roles, and causal fault routing through people’s expectations, like “So-and-so took responsibility for calming down the toddler, so we left, but they weren’t able; that’s why there wasn’t anyone there who successfully calmed them down”.)
I don’t think most people have reliable access to this way of being scared in practice though. Most fear becomes food for unFriendly hypercreatures.
Agreed; possibly I’d be more optimistic than you about some instances of fear, on the margin, but whatever. Someone ~~Should~~ would be helping others if they were to write about healthy fear...
Agreed, and also irrelevant. Did my spelling out of responsibility up above clarify why?
Not exactly? I think you’re saying, the point is, they can’t make themselves exercise, so they can’t be responsible, and it doesn’t help to bang their head against a non-motivating wall.
What’s important to me here is something like: there’s (usually? often?) some things “right there inside” the Should which are very worth saving. Like, it’s obviously not a coincidence which Shoulds people have, and the practice of Shoulding oneself isn’t only there because of egregores. I think that the Shoulds often have to do with what people really care for, and that their caring shows itself (obscurely, mediatedly, and cooptably/fakeably) in the application of “external” willpower. (I think of Dua Lipa’s song “New Rules”.)
So I want to avoid people being sort of gaslit into not trusting their reason—not trusting that when they reach an abstract conclusion about what would have consequences they like, it’s worth putting weight on—by bluntly pressuring them to treat their explicit/symbolic “decisions” as suspect. (I mean, they are suspect, and as you argue, they aren’t exactly “decisions” if you then have to try and fail to make yourself carry them out, and clearly all is not well with the supposed practice of being motivated by abstract conclusions. Nevertheless, you maybe thought they were decisions and were intending to make that decision, and your intention to make the decision to exercise / remove X-risk was likely connected to real care.)
Now it’s way, way more obvious (at least to me) that that path just isn’t a viable one.
Huh. Are you saying that you’ve updated to think that solving technical AI alignment is so extremely difficult that there’s just no chance? That doesn’t sound like your other statements. Maybe you’re saying that we / roughly all people can’t even really work on alignment, because being in egregores messes with one’s ability to access what an AI is supposed to be aligned to (and therefore to analyze the hypothetical situation of alignedness), so “purely technical” alignment work is doomed?
I’m saying that technical alignment seems (1) necessary and (2) difficult and (3) maaaaaybe feasible. So there’s causal power there. If you’re saying, people can’t decide to really try solving alignment, so there’s no causal power there… Well, I think that’s mostly right in some sense, but not the right way to use the concept of causal power. There’s still causal power in the node “technical alignment theory”. For most people there’s no causal power in “decide to solve alignment, and then beat yourself about not doing it”. You have to track these separately! Otherwise you say
Sorting out AI alignment in computers is focusing entirely on the endgame. That’s not where the causal power is.
Instead of saying what I think you mean(??), which is “you (almost all readers) can’t decide to help with technical AI alignment, so pressing the button in your head labeled ‘solve alignment’ just hurts yourself and makes you good egregore food, and if you want to solve AI alignment you have to first sort that out”. Maybe I’m missing you though!
I have a hard time imagining a plausible world here that doesn’t just pull the plug on that 10% and either persuade them or make them irrelevant.
Maybe we have different ideas of “unified”? I was responding to
If the whole world were unified on AI alignment being an issue, it’d just be a problem to solve.
The problem that’s upstream of this is the lack of will. Same thing with cryonics really. Or aging. [...] The problem is that people’s minds aren’t clear enough to look at the problem for real.
I agree with:
If 90% of the world were unified on X being an issue, it’d just be a problem to solve.
if X is aging or cryonics, because aging and cryonics aren’t things that have terrible deadlines imposed by a smallish, unilaterally acting, highly economically incentivized research field.
Where are those 10% getting their resources from?
Investors who don’t particularly have to be in the public eye.
How are they operating without any effects that the 90% can notice?
By camouflaging their activities. Generally, governmentally imposed restrictions can be routed around, I think, given enough incentive (cf. tax evasion)? Especially in a realm where everything is totally ethereal electrical signals that most people don’t understand (except the server farms).
What was the process by which the 90% got aligned?
I don’t know. Are you perhaps suggesting that your vision of human alignedness implies that the remaining 10% would also become aligned, e.g. because everyone else is so much happier and alive, or can offer arguments that are very persuasive to their aligned souls? Or, it implies competence to really tactically prevent the 10% from doing mad science? Something like that is vaguely plausible, and therefore indeed promising, but not obviously the case!
I’m saying that these calls to collective action to work on it without addressing the current hostile superintelligences hacking our minds and cultures are just ludicrous.
Agreed… I think… though I’d maybe admit a lot more than you would as “just stating propositions” and therefore fine. IDK. Examples could be interesting (and the OP might have been less confusing to me with some examples of what you’re responding to).