I think he somewhat answers your point here:
And as a result, our sort of meta strategy involves essentially roughly forecasting, maybe not formally forecasting, but roughly in our heads having some sense of what things are going to be potential problems over the next some time period — call it three months, maybe longer, maybe shorter, who knows — and identifying what sorts of considerations might become quite important during that time given the capabilities that we expect to have, and making sure that we’re prepared for those. And if we’re not prepared for that, then possibly slowing down, or pausing development, or talking to governments, trying to do advocacy.
But importantly, we’re not really trying to forecast arbitrarily far into the future all the problems that are going to arise with AI development. The goal isn’t “Know how to align ASI, or else do nothing.” Usually I think of us as looking at a time horizon of, it depends on which particular thing we’re doing, but often somewhere between three months and five years.
I interpret this as: we are focusing on short-term planning and are ready to support a pause if we see imminent danger.
In other words, he probably is concerned about the alignment worries you mention, which may emerge in the future, but doesn’t consider them proximate enough to warrant present-day commitments.
I’ve always had an inkling that frontier labs would make increasingly rigorous safety communications as the threat became more salient. After all, most lab leaders seem to have recognized x-risk way before the AI boom and are presumably interested in continuing to exist.
And in general, I think a lot of the safety community tends to extrapolate from current lab safety attitudes and not price in the changes to communications and concerns that seem likely as capabilities, and the accompanying unease, grow.
But I agree, this doesn’t feel satisfying for some reason. Was it necessary to downplay or gloss over safety and scare the shit out of everyone concerned about this sort of thing? And if it was, for political or financial reasons, that means there are incentive structures working against transparency of knowledge and belief, which seems bad both overall and in the long run.
Side note: your link requires a login. Not sure if that’s intended.