A sane response would be to slow down the race and build a trustworthy international framework that ensures everyone can benefit from AGI. Promises would not be enough; you would need actual institutions with real authority. That’s hard, but possible. Instead, we’ve chosen a strategy of cutting China off from GPUs and hoping we can beat them to AGI before they scale up domestic production.
‘But I have so little of any of these things! You are wise and powerful. Will you not take the Ring?’
‘No!’ cried Gandalf, springing to his feet. ‘With that power I should have power too great and terrible. And over me the Ring would gain a power still greater and more deadly.’ His eyes flashed and his face was lit as by a fire within. ‘Do not tempt me! For I do not wish to become like the Dark Lord himself. Yet the way of the Ring to my heart is by pity, pity for weakness and the desire of strength to do good. Do not tempt me! I dare not take it, not even to keep it safe, unused. The wish to wield it would be too great for my strength. I shall have such need of it. Great perils lie before me.’
To be honest, I never found this very convincing as a reason beyond “Tolkien needed this not to happen for the story he has to work”, which is fine. But the issue is that in LOTR you are already dependent on what are essentially benevolent dictators: if the king is unaligned with your values or is incompetent, the country and you will be ruined. So taking the Ring and trying to stay aligned with your own values while using it can only have positive expected value under your value system.
And critically, this doesn’t change after Sauron’s defeat.
Another point is that if you believe the long-term future matters most, then democracy/preventing dictatorships are very intractable even on a scale of centuries, so your solution to AI safety can’t rely on democracy working, and must instead assume something about the values of whoever ends up holding the power.
So outside of the logic of “we need to keep the story on rails”, I just flat-out disagree with Gandalf here.
I don’t think it’s a matter of agreeing or disagreeing; call it author fiat, but the way the One Ring works in LOTR is that it alters your mind and your goals. So Gandalf is simply refusing to come near something he knows will impair his cognition and eventually twist his goals. If I knew a power-giving artefact would also make me want things that I don’t want now, why would I take it? I would just be creating a future me that is the enemy of my present goals, while making my current self even more powerless.
But of course at a metaphorical level Tolkien is saying that power warps your morals. Which seemed an appropriate reference to me in context, because I think it’s exactly what happened with many of those companies (certainly with Sam Altman! I’m not that soured on Anthropic yet, though maybe that’s just because I lack enough information). The inevitable grind of keeping the power becomes so all-consuming that it ends up eroding your other values, or at least it very often does. And you go from “I want to build safe ASI for everyone’s sake!” to “ok well I guess I’ll build ASI that just obeys me, which I can do faster, but that’s good enough since I’m not that evil, and also it’ll have a 95% chance of killing everyone, but that’s better than the competition’s 96%, which I just estimated out of my own ass”. Race to the bottom, and everything ends up thrown under the bus except the tiniest, most unrecognizable sliver of your original goals.
Isn’t this… the entire “Rat/EA” platform?
Yes, and yet from whence OpenAI and Anthropic?