That wouldn’t have been appropriate for the work we’re discussing, though, since the whole point was to determine whether a transformer trained only on the moves will learn to have an internal representation of the board state, which in turn is suggestive of whether a much larger transformer trained only on text will learn to have an internal representation of the world that the text is about.
Sure, I’m not saying they should’ve done that instead. They could have done it in addition, but probably they didn’t have the time/energy. My point is just that the illegal-move error rate is ambiguous if you (gjm) are interested in whether it has perfectly learned the rules (which is different from what the authors are going after), because there are sources of error beyond “it has failed to learn the rules”, like errors reconstructing the board state leading to misapplication of potentially-perfectly-learned rules. To my eyes, a legal-move error rate as low as 0.01% in this setup, given the burden of reconstructing state in an unnatural and difficult way, strongly suggests it’s actually doing a great job of learning the rules. I predict that if you set it up in a way which more narrowly targeted rule learning (eg behavior cloning: just mapping full game state->expert-action, no history at all), you would find that its illegal-move rate would approach 0% much more closely, and you’d have to find some really strange edge-cases like my chess promotion examples to trip it up (at which point one would be satisfied, because how would one ever learn those unobserved things offline without priors).
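To make the metric concrete: here is a minimal sketch of how the illegal-move rate would be scored in the behavior-cloning setup described above, where the policy sees the full game state rather than move history. The `policy` and `legal_moves` functions are hypothetical stand-ins, not anything from the paper’s code.

```python
# Hypothetical sketch: scoring a state->action policy by how often its
# top-predicted move is illegal, as in the behavior-cloning proposal above.
# `positions`, `policy`, and `legal_moves` are illustrative stand-ins.

def illegal_move_rate(positions, policy, legal_moves):
    """Fraction of positions where the policy's top move is illegal."""
    errors = 0
    for state in positions:
        top_move = policy(state)  # argmax over the move vocabulary
        if top_move not in legal_moves(state):
            errors += 1
    return errors / len(positions)

# Toy example: three positions; the policy errs on exactly one of them.
positions = ["p1", "p2", "p3"]
legal = {"p1": {"a"}, "p2": {"b"}, "p3": {"c"}}
policy = {"p1": "a", "p2": "b", "p3": "x"}.get
rate = illegal_move_rate(positions, policy, lambda s: legal[s])
print(rate)  # one of three top predictions is illegal
```

Under a setup like this, a full board state removes the reconstruction burden, so any remaining errors would more cleanly reflect failures to learn the rules themselves.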
I agree that the network trained on the large random-game dataset shows every sign of having learned the rules very well, and if I implied otherwise then that was an error. (I don’t think I ever intended to imply otherwise.)
The thing I was more interested in was the difference between that and the network trained on the much smaller championship-game dataset, whose incorrect-move rate is much much higher—about 5%. I’m pretty sure that either (1) having a lot more games of that type would help a lot or (2) having a bigger network would help a lot or (3) both; my original speculation was that 2 was more important but at that point I hadn’t noticed just how big the disparity in game count was. I now think it’s probably mostly 1, and I suspect that the difference between “random games” and “well played games” is not a major factor, and in particular I don’t think it’s likely that seeing only good moves is leading the network to learn a wrong ruleset. (It’s definitely not impossible! It just isn’t how I’d bet.)
Vaniver’s suggestion was that the championship-game-trained network had learned a wrong ruleset on account of some legal moves being very rare. It doesn’t seem likely to me that this (as opposed to 1. not having learned very well because the number of games was too small and/or 2. not having learned very well because the positions in the championship games are unrepresentative) is the explanation for having illegal moves as top prediction 5% of the time.
It looked as if you were disagreeing with that, but the arguments you’ve made in support all seem like cogent arguments against things other than what I was intending to say, which is why I think that at least one of us is misunderstanding the other.
In particular, at no point was I saying anything about the causes of the nonzero but very small error rate (~0.01%) of the network trained on the large random-game dataset, and at no point was I saying that that network had not done an excellent job of learning the rules.