aog comments on Paper: Large Language Models Can Self-improve [Linkpost]

aog 2 Oct 2022 17:43 UTC
18 points
0
This paper rebrands the decades-old method of semi-supervised learning as "self-improvement." Semi-supervised learning does enable self-improvement, but by not mentioning the hundreds of thousands of papers that previously used similar methods, the paper obscures what its own contribution is.
A characteristic example from the field is Berthelot et al., 2019, who train an image classifier on the sharpened average of its own predictions across multiple data augmentations of an unlabeled input. Another is Cho et al., 2019, who train a language model to paraphrase an input sequence by generating many candidate paraphrases and keeping only those whose confidence passes a specified threshold. As wunan mentioned, AlphaFold is also semi-supervised. Google Scholar lists 174,000 results for semi-supervised learning.
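For readers unfamiliar with these techniques, here is a minimal Python sketch of the two selection rules just described. It is illustrative only and not code from either paper: the function names `guess_label` and `keep_confident`, the temperature of 0.5, and the threshold of 0.9 are all assumptions chosen for the example.

```python
def guess_label(aug_probs, temperature=0.5):
    """Sharpened average of a model's predictions for one unlabeled input.

    aug_probs: list of K probability vectors, one per augmentation of the
    same input (hypothetical model outputs). temperature < 1 sharpens.
    """
    num_classes = len(aug_probs[0])
    # Average the predicted distribution over the K augmentations.
    mean = [sum(p[c] for p in aug_probs) / len(aug_probs)
            for c in range(num_classes)]
    # Sharpen: raise each probability to 1/T, then renormalize.
    powered = [m ** (1.0 / temperature) for m in mean]
    total = sum(powered)
    return [v / total for v in powered]


def keep_confident(candidates, confidences, threshold=0.9):
    """Keep only generated candidates whose confidence passes the threshold."""
    return [c for c, conf in zip(candidates, confidences) if conf >= threshold]


# Two augmentations of one unlabeled image, two classes.
pseudo_label = guess_label([[0.6, 0.4], [0.8, 0.2]])

# Three hypothetical candidate paraphrases with model confidences.
kept = keep_confident(["para A", "para B", "para C"], [0.95, 0.50, 0.91])
```

In both cases the model's own outputs become training targets for unlabeled inputs, which is the sense in which these methods are "self-improving."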
The authors might protest that this isn’t semi-supervised learning because it’s training on explanations that are known to generate the correct answer as determined by ground truth labels, rather than training on high confidence answers of unknown ground truth status. That’s an important distinction, and I don’t know if anybody has used this particular method before. But this work seems much more similar to previous work on semi-supervised learning than they’re letting on, and I think it’s important to appreciate the context.
- Quintin Pope 3 Oct 2022 15:18 UTC
  5 points
  2
  Parent
  This is not quite correct. The approach in the paper isn’t actually semi-supervised, because they do not use any ground truth labels at all. In contrast, semi-supervised approaches use a small initial set of ground truth labels, then use a larger collection of unlabeled data to further improve the model’s capabilities.