What are your thoughts on the argument that advancing capabilities could help make us safer?
In order to do alignment research, we need to understand how AGI works; and we currently don’t understand how AGI works, so we need to have more capabilities research so that we would have a chance of figuring it out. Doing capabilities research now is good because it’s likely to be slower now than it might be in some future where we had even more computing power, neuroscience understanding, etc. than we do now. If we successfully delayed capabilities research until a later time, then we might get a sudden spurt of it and wouldn’t have the time to turn our increased capabilities understanding into alignment progress. Thus by doing capabilities research now, we buy ourselves a longer time period in which it’s possible to do more effective alignment research.
Aside: More Sensible Cousin of Bad Argument 1 [but I still strongly disagree with it]: Three steps: “(A) Endgame safety is really the main thing we should be thinking about; (B) Endgame safety will be likelier to succeed if it happens sooner than if it happens later [for one of various possible reasons]; (C) Therefore let’s make the endgame happen ASAP.”
I mainly don’t like this argument because of (A), as elaborated in the next section.
But I also think (B) is problematic. First, why might someone believe (B)? The main argument I’ve seen revolves around the idea that making the endgame happen ASAP minimizes computing overhang, and thus slow takeoff will be likelier, which in turn might make the endgame less of a time-crunch.
I mostly don’t buy this argument because I think it relies on guessing which bottlenecks will hit, and in what order, on the path to AGI, not only for capabilities but also for safety. That kind of guessing seems very difficult. For example, Kaj mentions “more neuroscience understanding” as bad because it shortens the endgame, but couldn’t Kaj have equally well said that “more neuroscience understanding” is exactly the thing we want right now to make the endgame start sooner?
As another example (and this is controversial), I don’t expect compute to be a bottleneck to fast takeoff, so I don’t expect future advances in compute to meaningfully speed up the endgame. I think we already have a massive compute overhang with current technology, one that will become apparent once we know how to make AGI at all. The horse has already left the barn, IMO. See here.
So anyway, I see this whole (B) argument as quite fragile, even leaving aside the (A) issues. And speaking of (A), that brings us to the next item: …
I’m not Nate, but I have a response here.
I don’t want to speak for Nate but I’m guessing that he would at least share my objection to (A) on the basis of his DTD post.
(confirmed)