Zac says “Yes, over the course of training AlphaZero learns many concepts (and develops behaviours) which have clear correspondence with human concepts.”
What’s the evidence for this? If AlphaZero worked by learning concepts in a step-wise manner, then we should expect jumps in performance on certain types of puzzles, right? I would guess that a beginning human would exhibit jumps from learning concepts like “control the center” or “castle early, not later”. For instance, the principle “control the center”, once followed, has implications for how to place knights and so on, which greatly affects win probability. Is the claim that they found such jumps? (Eyeballing the results, nothing really stands out in the plots.)
Or is the claim that the NMF somehow proves that AlphaZero works off concepts? That seems suspicious to me, since NMF appears to look at the weight matrices only at a very crude level.
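To make concrete what I mean by "crude", here is a minimal sketch of what NMF does, as I understand it: it approximates a non-negative matrix V as a product W·H of two smaller non-negative matrices. The toy matrix `V`, the choice `k=2`, and the helper names are all made up for illustration, and I use the standard multiplicative-update rule, which is not necessarily the procedure used in the paper.

```python
import random

def matmul(A, B):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    # Reconstruction error ||V - W H||_F.
    WH = matmul(W, H)
    return sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp)) ** 0.5

def nmf(V, k, steps=500, seed=0, eps=1e-9):
    # Multiplicative updates (Lee and Seung style) for V ~ W H with W, H >= 0.
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Hypothetical toy data: 4 rows that are non-negative mixtures of 2 patterns.
V = [
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 0.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 3.0, 3.0, 3.0],
]

W, H = nmf(V, k=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

The rows of `H` are the recovered "factors" and `W` says how strongly each one is present in each row of `V`. Whether such a factor corresponds to a concept is something a person has to read off by inspection; the factorization itself only guarantees a low-rank, non-negative approximation.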
I ask this partly because I went to a meetup talk (not recorded, sadly) where a researcher from MIT showed a Go problem that AlphaGo can’t solve but that even beginner Go players can, which suggests that AlphaGo doesn’t actually understand things the same way humans do. Hopefully they will publish their work soon so I can show you.