I don’t think it’s a matter of agreeing or disagreeing—call it author fiat, but the way the One Ring works in LOTR is that it alters your mind and your goals. So Gandalf is just refusing to come close to something he knows will impair his cognition and eventually twist his goals. If I knew a power-giving artefact would also make me want things that I don’t want now, why would I take it? I would just be creating a future me that is the enemy of my present goals, while making my current self even more powerless.
But of course at a metaphorical level Tolkien is saying that power warps your morals. Which seemed an appropriate reference to me in context, because I think it’s exactly what happened with many of those companies (certainly with Sam Altman! I’m not too soured on Anthropic yet, though maybe it’s just that I lack enough information). The inevitable grind of keeping the power becomes so all-consuming that it ends up eroding your other values, or at least it very often does. And you go from “I want to build safe ASI for everyone’s sake!” to “ok well I guess I’ll build ASI that just obeys me, which I can do faster, but good enough since I’m not that evil, and also it’ll have a 95% chance of killing everyone, but that’s better than the competition’s 96%, which I just estimated out of my own ass”. Race to the bottom, and everything ends up thrown under the bus except the tiniest, most unrecognizable sliver of your original goals.