This is the best I’ve got so far. I estimated each rating as the midpoint of a logistic regression fit to the games, i.e. the point where the fitted curve crosses 50%. The first few in particular seem inflated because the data didn’t have enough high-rated players, so the fit had to extrapolate. And they all seem inflated by (I’d guess) a couple of hundred points due to the effects I mentioned in the post. (Edit: Please don’t share the graph alone without this context.)
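For anyone curious what a midpoint estimate like this looks like in practice, here is a rough sketch, not the exact code behind the graph. It assumes the regression is of game score on opponent rating, and it uses synthetic scores from the standard Elo expected-score curve rather than real games. The names `elo_expected_score` and `fit_midpoint` are made up for illustration.

```python
import math


def elo_expected_score(player_rating, opponent_rating):
    """Standard Elo expected score for the player against one opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - player_rating) / 400.0))


def fit_midpoint(opponent_ratings, scores, iterations=50):
    """Fit score ~ sigmoid(a + b * x) by Newton's method and return the midpoint.

    x is the opponent rating, centred and scaled for numerical stability.
    The midpoint is the opponent rating at which the fitted score is 0.5.
    """
    n = len(opponent_ratings)
    mean = sum(opponent_ratings) / n
    scale = 400.0  # assumption: arbitrary scaling constant, any positive value works
    xs = [(r - mean) / scale for r in opponent_ratings]

    a, b = 0.0, 0.0
    for _ in range(iterations):
        # Gradient (g_*) and negated Hessian (h_*) of the log-likelihood.
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(xs, scores):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p
            g_b += (y - p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        if det <= 1e-12:
            break  # degenerate data, e.g. every opponent has the same rating
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det

    if b == 0.0:
        raise ValueError("flat fit: no midpoint exists")
    # sigmoid(a + b * x) = 0.5 when x = -a / b; convert back to rating units.
    return mean + scale * (-a / b)


if __name__ == "__main__":
    # Synthetic example: a 2000-rated engine against opponents rated 1000 to 2600.
    ratings = list(range(1000, 2700, 100))
    scores = [elo_expected_score(2000, r) for r in ratings]
    print(round(fit_midpoint(ratings, scores)))
```

With clean synthetic scores the midpoint lands on the true rating. With real, noisy results and few opponents near the 50% point, the midpoint lies outside the data and depends heavily on the fitted slope, which is the extrapolation problem described above.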
The NN rating in the Blitz data highlights the flaw in this method of estimating the rating.
I haven’t found a way to get similar data on human vs human games.