I agree that safety people have lots of ideas more interesting than stack more layers, but they mostly seem irrelevant to progress. People working in AI capabilities also have plenty of such ideas, and one of the most surprising and persistent inefficiencies of the field is how consistently it overweights clever ideas relative to just spending the money to stack more layers. (I think this is largely down to sociological and institutional factors.)
Indeed, to the extent that AI safety people have plausibly accelerated AI capabilities, I think it's almost entirely by correcting that inefficiency faster than might have happened otherwise, especially via OpenAI's training of GPT-3. But this isn't a case of safety people incidentally benefiting capabilities as a byproduct of their work; it was a case of some people who care about safety deliberately doing something they thought would be a big capabilities advance. I think cases like that are much more plausible as a source of acceleration!
(I would describe RLHF as pretty prototypical: “Don’t be clever, just stack layers and optimize the thing you care about.” I feel like people on LW are being overly mystical about it.)