Interesting post! I’ve noticed that poker reasoning tends to be terrible, and it’s not totally clear to me why. Pretraining should contain quite a lot of poker discussion, though I guess a lot of it is garbage. I think it could be pretty easily fixed in RL if anyone cared enough, but then it wouldn’t be a good test of general reasoning ability.
One nit: it’s “hole card”, not “whole card”.