If you want a more accurate estimate of how often top chess engines pick the theoretically best move, you could compare Leela Chess Zero and Stockfish. They are very close to each other Elo-wise but have very different architectures and styles of play. So you could measure how often they agree on the best move, assume that each one picks its move from some distribution over the true move ranking, and then use the agreement rate to estimate the parameters of that distribution.
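To make the idea concrete, here is a rough sketch under some strong assumptions that are mine, not established facts: both engines are equally strong, they choose independently of each other, and each picks the move of true rank r with probability (1 - q) * q^(r - 1), a geometric distribution with a single parameter q. Under that model the chance they agree is the sum over r of P(r)^2, which works out to (1 - q) / (1 + q), so you can invert a measured agreement rate to get q and the probability of picking the true best move, 1 - q. The function names are hypothetical and the 0.5 in the usage example is a placeholder, not a measured value.

```python
def agreement_from_q(q: float) -> float:
    """Forward model: probability that two independent engines pick the same move.

    Assumes each engine picks the move of true rank r with probability
    (1 - q) * q**(r - 1). Summing the squared probabilities over all ranks
    gives (1 - q) / (1 + q).
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    return (1.0 - q) / (1.0 + q)


def q_from_agreement(agreement: float) -> float:
    """Invert the forward model: recover q from an observed agreement rate."""
    if not 0.0 < agreement <= 1.0:
        raise ValueError("agreement must be in (0, 1]")
    return (1.0 - agreement) / (1.0 + agreement)


def best_move_probability(agreement: float) -> float:
    """Probability of picking the true best move (rank 1) under the assumed model."""
    return 1.0 - q_from_agreement(agreement)


if __name__ == "__main__":
    observed = 0.5  # placeholder agreement rate, NOT real data
    print(f"q = {q_from_agreement(observed):.3f}")
    print(f"P(best move) = {best_move_probability(observed):.3f}")
```

The independence assumption is the weak point: two engines will probably make correlated errors in the same hard positions, in which case this would overestimate how often either one finds the true best move.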