Meta’s previous LLM, OPT-175B, seemed good on benchmarks but was widely agreed to be much, much worse than GPT-3 (not even necessarily better than GPT-NeoX-20B). It’s an informed guess, not a random dunk, and does leave open the possibility that they’ve turned it around and have a great model this time rather than something which Goodharts the benchmarks.
Just a note, I googled a bit and couldn’t find anything regarding poor OPT-175B performance.
To back up plex a bit:
It is indeed prevailing wisdom that OPT isn’t very good, despite being decent on benchmarks, though generally the baseline comparison is to code-davinci-002-derived models (which do way better on benchmarks) or to smaller models like UL2 that were trained with comparable compute and significantly more data.
OpenAI noted in the original InstructGPT paper that performance on benchmarks can be uncorrelated with human rater preference during finetuning.
But yeah, I do think Eliezer is at most directionally correct; I suspect that LLaMA will see significant use among researchers and within Meta AI, at the very least.