Didn’t they train a separate MuZero agent for each game? E.g. the page you link only talks about being able to learn without pre-existing knowledge.
Actually, I think you’re right. I always thought that MuZero was one and the same system for every game, but the Nature paper describes it as an architecture that can be applied to learn different games. I’d like confirmation from someone who has studied it more closely, but it looks like MuZero indeed isn’t the same system for each game.
Yep, they’re different. It’s just an architecture. Among other things, Chess and Go have different input/action spaces, so the same trained network can’t be used on both without some way to handle this.
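To make the input/action-space point concrete, here’s a minimal sketch (illustrative only, not MuZero’s actual code) of why the policy head alone already has to be game-specific. The action-space sizes come from the AlphaZero/MuZero papers: Go has 19×19 board points plus a pass move, while chess uses a 4,672-move encoding.

```python
# Illustrative sketch: the policy head's output width is tied to the
# game's action space, so one set of weights can't serve both games.
ACTION_SPACE = {
    "go": 19 * 19 + 1,   # 361 board points + pass = 362
    "chess": 4672,       # AlphaZero's 8x8x73 move encoding
}

def policy_head_size(game: str) -> int:
    """Number of logits the policy head must output for a given game."""
    return ACTION_SPACE[game]

print(policy_head_size("go"))     # 362
print(policy_head_size("chess"))  # 4672
```

So even with an identical trunk, the output layers (and the board-encoding input planes) differ per game, which is why each game gets its own trained instance of the architecture.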
This paper uses an egocentric input, which allows many different types of tasks to use the same architecture. That would be the equivalent of learning Chess/Go based on pictures of the board.