gwern comments on Is training data going to be diluted by AI-generated content?