The paper does attempt to adjust for this with a complexity metric, although I suspect this doesn’t work perfectly, as it seems to be a linear adjustment in the number of nodes the engine uses to calculate the optimal move.
I have a concern that the paper is comparing tournament play (offline) to match play with 4 games per match (online). In match play, especially with few games, a player who is behind needs to force the game, while the player in the lead can play more conservatively. Tournaments have their own incentives, but overall I would expect short match play to cause bigger deviations from engine-optimal play, as losing players try to force a win in naturally drawn positions.
The calculated effect size is >200 Elo points, which suggests to me that something is amiss.
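For a sense of scale, here is a minimal sketch of what a 200-point gap means under the standard Elo expected-score formula. The function name `elo_expected_score` is my own and not something from the paper, and the paper may well convert move quality to rating points differently.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the higher-rated
    side, given its rating advantage in Elo points.

    Standard Elo logistic formula; the name is illustrative only.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


if __name__ == "__main__":
    # Compare an even matchup with 100- and 200-point advantages.
    for gap in (0, 100, 200):
        print(f"gap={gap:>3}  expected score={elo_expected_score(gap):.3f}")
```

A 200-point advantage corresponds to an expected score of roughly 0.76 per game, which is a large shift to attribute to the playing format alone.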