Is it a crazy coincidence that AlphaZero taught itself chess and explosively outperformed humans without any programmed chess knowledge, then asymptoted out at almost exactly 2017 Stockfish's level? I need to look into it more, but it looks like AlphaZero would curbstomp 2012 Stockfish and get curbstomped in turn by 2025 Stockfish.
It almost only makes sense if the entire growth in Stockfish performance since 2017 is causally downstream of the AlphaZero paper.
There is a connection. Stockfish does use Leela Chess Zero (the open-source, distributed-training offspring of AlphaZero) training data for its own evaluation neural network. This NNUE (efficiently updatable neural network) has been a big piece of Stockfish's progress in the last few years.
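For anyone curious what that looks like, here's a minimal PyTorch sketch of an NNUE-style eval net. Everything here is a toy stand-in: the 768-feature piece-square encoding, layer sizes, and random labels are assumptions for illustration. Stockfish's real net uses HalfKA-style king-relative features with incrementally updated accumulators, and its training targets come from evaluated positions (including Lc0 self-play data), not random numbers.

```python
# Toy sketch of an NNUE-style evaluation network. NOT Stockfish's actual
# architecture; it just shows the shape of the idea: sparse binary board
# features -> small dense net -> scalar evaluation.
import torch
import torch.nn as nn

N_FEATURES = 768  # toy encoding: 12 piece types x 64 squares

class ToyNNUE(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.ft = nn.Linear(N_FEATURES, hidden)  # "feature transformer" layer
        self.head = nn.Sequential(
            nn.ReLU(), nn.Linear(hidden, 32),
            nn.ReLU(), nn.Linear(32, 1),         # scalar eval (centipawn-ish)
        )

    def forward(self, x):
        return self.head(self.ft(x))

# In the real pipeline the targets come from engine evals of positions
# (Lc0 self-play games among them); random stand-ins here.
model = ToyNNUE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
positions = (torch.rand(1024, N_FEATURES) < 0.05).float()  # sparse boards
targets = torch.randn(1024, 1)                             # fake eval labels

for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(positions), targets)
    loss.backward()
    opt.step()
```

The "efficiently updatable" part is the point of the design: because the input is a sparse binary vector and the first layer is linear, a single move only flips a few features, so the first-layer accumulator can be patched incrementally instead of recomputed, which is what makes the net cheap enough to run on a CPU inside an alpha-beta search.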
It’s not straightforward to compare AlphaZero and Stockfish though, as the former is heavily GPU-dependent whereas the latter is CPU-optimized. However, DeepMind may have decided to train to a roughly comparable level (under some hardware assumptions) as a proof of concept and not bothered pushing much further.
I guess the team kept iterating on the RL algorithm and network until it beat all existing engines, and then stopped?