That would only help with current architectures. RL-first architectures won’t give a crap about what the language pretraining had to say; they’re going to experiment with how to get what they want, and they’ll notice that being shut down gets in the way.