I am somewhat concerned that e.g. Putin’s CEV is mediocre/bad, but I am more concerned that he wouldn’t reflect well at all. I think the default outcome of “Putin has all of the power” is not “Putin’s CEV gets implemented”; it’s much dumber.
Yep, I agree that this is sensitive to the context in which Putin (or whoever) might end up in power. I am using him as a foil for a context in which he would have access to some substantially aligned/powerful AI system, and so wouldn’t make dumb mistakes. But most worlds where he ends up in power would probably be much dumber than that, and this is not an argument against those.
I am interested in figuring out whether this is true; I don’t have a strong view:
Even if Putin controls operator-aligned superintelligence, it’s very likely that he doesn’t reflect or change his mind on crucial stuff, nor do helpful meta stuff like intelligence augmentation. People don’t like changing their mind. They might not listen to AIs saying stuff that [is weird / contradicts their convictions / implies they’re bad]. Maybe Putin doesn’t even launch the von Neumann probes, much less do acausal trade (assuming that works out).
I really have trouble imagining this happening, at least for someone like Putin (there are other people where I would find this more plausible).
Like, Putin clearly understands the value of greater intelligence. He understands the strategic usefulness of having access to more information and understanding more about the world. He is not an incompetent man!
And the future is long, and he probably doesn’t want to die, so this could potentially play out over many decades if not centuries or millennia. He could do this all as slowly as he wanted to, and I doubt he would wake up one day and say “I would like to be dumber than the day before”.
I feel like in order to arrive at stagnation over those time periods, you would need to actively optimize for stagnation.
Though he could get greater intelligence and more information/understanding about the world without doing any reflection on his values. This seems fairly likely to me. People tend to be not that interested in reflecting on their values. He might even want to lock in his current values, since that’s rational according to his current values.
Most coarse grainings of this post are very stupid points, and you should expect that much of your point is lost in transmission. I think you should be a lot more careful that if you’re going to write something, the oversimplifications of it are not easily misinterpreted, so that it’s harder for various forms of adversary or merely-stupid reader to distort what you mean. I’m posting this despite anticipating pushback of the form “talking to people who misinterpret is a waste of time and should not be done”, and I think that’s wrong.
You… seem triggered in a way that doesn’t seem very helpful. Please comment very differently, or not on this post, or I’ll ban you from my posts.
I don’t want to make my writing robust to adversarial readers; that way lies the death of the joy of writing, as well as the path to boring writing. I am not that worried about people distorting what I mean, and if they do, I am pretty good at showing up and clarifying what I mean.
I agree that if a reader ends up skimming my post, I would like them not to end up with wrong beliefs, so that part is a virtue I aspire to.
(Edit: I originally said a dumber thing here, sorry about that)
Hmm, noted. Rereading it now, the tone comes across as harsher than I intended. Apologies for that. Fwiw, my other comments are also not meant in a harsh tone, and I hope they don’t read as such; I’m just trying to be correct here.
I do think there’s something that you’re missing about the effects of your posts based on the recent pattern of them, and that some increased adversarial robustness would reduce the severity of politically impactful misinterpretations. But it seems I’m not the best person to communicate this to you given my emotional dynamics, so again, apologies.
I’ve only posted two posts recently before this one, only one of which was controversial, which feels a bit ambitious to try to draw a pattern from. The most recent one did have some “politically impactful misrepresentations”, but I knew that that one would be controversial/tricky going in, and it overall still looks like it’s been well-received.
We will see whether you are correct in predicting a pattern, but my guess is there won’t be much of one.