casens comments on Shortform

casens 19 Oct 2025 18:04 UTC
3 points
0
your entire analysis is broken in that you assume an elo rating is something objective, like an atomic weight or the speed of light. in reality, an elo rating is an estimate of playing strength relative to a particular pool of players.

the problem that elo was trying to solve was: if you have players A and B, who have both played among players C through Q, but A and B have never played each other, can you concretely say whether A is stronger than B? the genius of the system is that you can, and in fact, the difference between the two ratings gives you A's expected score against B (if i recall correctly, a difference of +200 points implies an expected score of about 0.76, where 1.0 is winning, 0 is losing, and 0.5 is a draw).
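to make the arithmetic concrete, here is a minimal sketch of the standard logistic elo expected-score formula, assuming the usual 400-point scale. the function name `expected_score` and the example ratings are just illustrative, not from any particular rating site.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for A against B (1.0 = win, 0.5 = draw, 0.0 = loss).

    Standard logistic Elo curve on the conventional 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Hypothetical ratings; only the difference between them matters.
    for diff in (0, 100, 200, 400):
        print(diff, round(expected_score(1500 + diff, 1500), 3))
```

note that only the rating difference enters the formula, which is exactly why the number means nothing outside the pool it was computed in.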

the elo system does not work, however, if there are two non-overlapping pools of players, like C through M and N through Z, where A has only played in pool 1 and B only in pool 2. i’m fairly certain you could construct a series of ~200 exploitable chess bots, where A always beats B, B always beats C, etc., pushing the top elo ratings almost arbitrarily high.
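a rough sketch of that chain idea, under assumptions i'm making up for illustration: every bot starts at 1500, K is 32, and each bot only ever plays its immediate neighbour in the chain and always beats the one below it. `simulate_chain` and its parameters are hypothetical names, and this is a toy model, not how any real bot ladder is run.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard logistic Elo expected score for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def simulate_chain(n_bots: int = 20, rounds: int = 500,
                   k: float = 32.0, start: float = 1500.0) -> list[float]:
    """Closed pool where bot i always beats bot i + 1 and nobody else plays.

    All parameters are illustrative assumptions, not taken from a real site.
    """
    ratings = [start] * n_bots
    for _ in range(rounds):
        for i in range(n_bots - 1):
            # Bot i wins (score 1.0); apply the usual Elo update to both bots.
            delta = k * (1.0 - expected_score(ratings[i], ratings[i + 1]))
            ratings[i] += delta
            ratings[i + 1] -= delta
    return ratings


if __name__ == "__main__":
    ratings = simulate_chain()
    print("top:", round(ratings[0]), "bottom:", round(ratings[-1]))
```

the top bot never loses, so its rating only ever goes up, even though the pool's average stays fixed at the starting value. nothing in the pool anchors those numbers to players outside it.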

so a major problem with your analysis is that you cited Random as having an elo of 477 and indexed your other answers based on that, when actually that bot had an elo of 477 against other terrible (humorous) bots. if you put Random into FIDE tournaments, i expect its elo would be much lower.