When I played with this on the website, it seemed to have a few major bugs. At one point I was 20% ahead of the best language model score, made one prediction at 30%, and for some reason lost 400 points; the feedback displayed wasn't even for my sentence!
I'm sorry the feedback wasn't displayed! None of the players complained about this during the measurements, so it's probably an uncommon bug. In any case, it shouldn't have happened.
It's not surprising that you lost 400 points on a single question (even with p = 30%). If the generative language model thought the correct token was very likely, you lose a lot of points when you fail to select it; otherwise you wouldn't be incentivized to give your true probability estimates (see Appendix B).
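For intuition, here is a minimal sketch of why one question can cost so much, assuming a relative logarithmic scoring rule; the scale constant and exact formula below are illustrative assumptions, and the actual rule is specified in Appendix B:

```python
import math

def points(p_player: float, p_model: float, scale: float = 100.0) -> float:
    """Points for one question under a relative log score: how much
    probability the player put on the realized token, compared in log
    space to the language model's probability on it. Log scoring is
    "proper": reporting your true probability estimate maximizes your
    expected points. (`scale` and the exact formula are illustrative
    assumptions; the game's actual rule is given in Appendix B.)"""
    return scale * (math.log2(p_player) - math.log2(p_model))

# Putting 30% on the token that turns out correct loses only modestly
# against a confident model...
print(points(0.30, 0.95))  # about -166 points
# ...but putting ~6% on a token the model was 95% sure of costs
# roughly 400 points in a single question.
print(points(0.06, 0.95))  # about -399 points
```

Because the score is logarithmic in your stated probability, assigning near-zero probability to the realized token is punished very heavily, which is exactly the property that makes truthful reporting optimal.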
The feedback was displayed, but it wasn’t for my sentence. It was for a different sentence, with different word choices.