This fundamental problem generalizes to every resulting position: the tablebase can’t distinguish between reaching a position that a grandmaster would judge as ‘notably better with good winning chances’ and reaching one they would judge as ‘horrible and very hard to hold in practice’ (so long as both positions would end in a draw with two 32-piece tablebases playing against each other).
From this it seems rather obvious that if our tablebase picks at random among drawing moves, it would be unable to win[4] against, say, Stockfish 17 at depth 20 from the starting position (with both colors).
Suppose the tablebase selected randomly from drawing moves when presented with a drawing position, and that the initial position is a drawing position. Then the tablebase either wins or draws. You can see this by thinking about the definitions.
It’s relatively easy to define optimal chess by induction, by the min-max algorithm.
You’re correct that for a suboptimal policy P, the policy Q which scores the best against P might not be an optimal play.
Of course. At no point did I suggest that it could lose. The ‘horrible and very hard to hold in practice’ was referring to the judgement of a hypothetical grandmaster, though I’m not sure if you were referring to that part.
“It’s relatively easy to define optimal chess by induction, by the min-max algorithm.”
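(As an aside, that inductive min-max definition can be made concrete on a toy game. The sketch below uses a subtraction game — take 1 or 2 stones, taking the last stone wins — rather than chess; `value` and `optimal_moves` are illustrative names, and `optimal_moves` makes visible the underdetermination discussed here: minimax often leaves several equally optimal moves to pick among.)

```python
# Minimax by induction on a toy game: take 1 or 2 stones, taking the
# last stone wins. Values are from the side to move: +1 = forced win,
# -1 = forced loss. (Chess adds a third value, 0 = draw, and a vastly
# larger tree, but the induction is the same.)
from functools import lru_cache

@lru_cache(maxsize=None)
def value(n: int) -> int:
    if n == 0:
        return -1  # no stones left: the player to move has lost
    # A position's value is the best reachable child value, negated
    # (the child's value is from the opponent's perspective).
    return max(-value(n - take) for take in (1, 2) if take <= n)

def optimal_moves(n: int) -> list[int]:
    """All moves preserving the minimax value. An 'optimal policy' is
    free to pick among these -- e.g. at random."""
    best = value(n)
    return [take for take in (1, 2) if take <= n and -value(n - take) == best]
```

(For instance, positions that are multiples of 3 are lost for the side to move in this game, and from a lost position every move is ‘optimal’, so the policy is maximally underdetermined there.)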
Once again, I agree. I failed to mention what I see as an obvious implication of my line of reasoning: namely, that optimal play (with random picking among drawing moves) would have a pretty unimpressive Elo[1] (way lower than your estimates/upper bounds), one bounded by the Elo of its opponents.
So:
If we pit it against different engines in a tournament, I would expect the draw rate to be ~100% and the resulting Elo to be (in expectation) ever so slightly higher than the average rating of the engines it’s playing against.
If we pit it against grandmasters I think similar reasoning applies (I’d expect the draw rate to be ~97-99%).
You can extend this further to club players, casual players, and patzers; I would expect the draw rate to drop off, yes, but still remain high. This suggests that optimal play (with random picking among drawing moves) would underperform Stockfish 17 by miles, since Stockfish could probably achieve a win rate of >99% against basically any group of human opponents.
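The Elo arithmetic behind this is simple: a player that draws every game scores 0.5 per game, and under the standard logistic Elo model that pins its performance rating to (approximately) the field’s average rating. A minimal sketch, where `performance_rating` is a hypothetical helper that solves for the consistent rating by bisection:

```python
# Standard logistic Elo model: a 100% draw rate means 0.5 points per
# game, and the rating consistent with that is (approximately) the
# average rating of the opposition -- no matter how 'optimal' the play.

def expected_score(r_player: float, r_opponent: float) -> float:
    """Expected score for the player under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((r_opponent - r_player) / 400))

def performance_rating(opponent_ratings, score_per_game,
                       lo=0.0, hi=4000.0):
    """Rating whose expected total score matches the observed score
    (bisection; expected score is monotone in the player's rating)."""
    target = score_per_game * len(opponent_ratings)
    for _ in range(100):
        mid = (lo + hi) / 2
        total = sum(expected_score(mid, r) for r in opponent_ratings)
        if total < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

So against a field of 3500-rated engines, drawing everything yields a measured Elo of about 3500 — the draws, not the quality of play, set the number.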
There are plenty of algorithms which are provably optimal (minimax-wise), some of which would play very unimpressively in practice (like our 32-piece tablebase picking drawing moves at random) and some of which could get a very high Elo estimate in ~all contexts. For example:
If the position is won, use the 32-piece tablebase.
Same if the position is lost.
If the position is drawn, use Stockfish 17 at depth 25 to pick from the set of drawing moves.
This is optimal too, and would perform way better, but that definition is quite inelegant. And the thing that I was trying to get at by asking about the specific definition is that there is an astronomically large number of optimal-play algorithms, some of which could get a very low Elo in some contexts and some of which could get a very high Elo irrespective of context. So when you write ‘What’s the Elo rating of optimal chess?’, it seems reasonable to ask ‘Which optimal algorithm exactly?’.
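That ‘inelegant but strong’ policy can be sketched as follows. Note that `tablebase` and `engine` are hypothetical interfaces — no 32-piece tablebase exists, and the engine call stands in for something like Stockfish 17 at a fixed depth; the method names are illustrative, not a real API:

```python
# Sketch of the hybrid optimal policy described above. Any policy of
# this shape is minimax-optimal: it never gives up a win, and never
# turns a draw into a loss -- it only differs in HOW it draws.

WIN, DRAW, LOSS = 1, 0, -1

def optimal_move(position, tablebase, engine):
    verdict = tablebase.value(position)  # WIN, DRAW, or LOSS (hypothetical API)
    if verdict in (WIN, LOSS):
        # Won: convert as the tablebase prescribes; lost: resist likewise.
        return tablebase.best_move(position)
    # Drawn: minimax is silent among value-preserving moves, so use a
    # strong engine to pick the draw that is hardest to hold in practice.
    drawing = [m for m in tablebase.moves(position)
               if tablebase.value_after(position, m) == DRAW]
    return engine.pick(position, drawing)
```

The design point is that all the Elo-relevant behaviour lives in the final branch: every choice there is ‘optimal’ by the minimax definition, which is exactly why ‘the Elo of optimal chess’ is underdetermined without specifying it.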
And a very unimpressive level of play in drawn positions.