If Anyone Builds It could have been an explanation for why the MIRI worldview is still relevant nearly two decades later, in a world where we know so much more about AI. Instead, the authors spend all their time shadowboxing against opponents they’ve been bored of for decades, and fail to make their own case in the process.
Hm. I’m torn between thinking this is a sensible criticism and thinking that this is missing the point.
In my view, the core MIRI complaint about ‘gradualist’ approaches is that they are concrete solutions to abstract problems. When someone has misdiagnosed the problem, their solutions will almost certainly not work, and the question is just where they’ve swept the difficulty under the rug. We know so much more about AI as an engineering challenge while having made no progress on alignment as an abstract problem; the relevance of the MIRI worldview is obvious. “It’s hard, and if you think it’s easy you’re making a mistake.”
People attempting to solve AI alignment seem overly optimistic about their chances of solving it, in a way consonant with not understanding the problem they’re trying to solve, and not consonant with having a solution that they’ve simply failed to explain to us. The book does talk about examples of this, and though you might not like the examples (see, for example, Buck’s complaint that the book responds to the safety sketches of prominent figures like Musk and LeCun instead of the most thoughtful versions of those plans), I think it’s not obvious that they’re the wrong ones to be talking about. Musk is directing much more funding than Ryan Greenblatt is.
The arguments for why recent changes in AI have alignment implications have, I think, mostly failed. You may recall how excited people were about an advanced AI paradigm that didn’t involve RL. Of course, top-of-the-line LLMs are now trained in part using RL, because obviously they would be? It was always cope to think they wouldn’t be? I think the version of this book that was written two years ago, and so spent a chapter on oracle AI because that would have been timely, would have been worse than the book that tried to be timeless and focused on the easy calls.
But the core issue from the point of view of the New York Times or the man on the street is not “well, which LessWrong poster is right about how accurately we can estimate the danger threshold, and how convincing our control schemes will be as we approach it?”. It’s that the man on the street thinks things that are already happening are decades away, and even if he believed what the ‘optimists’ believe, he would probably want to shut it all down. It’s like when the virologists were having a reasonable debate amongst themselves over whether or not to do gain-of-function research, and the rest of society looked in for a moment and said “what? Make diseases deadlier? Are you insane?”.