Thanks for this! I’ve changed the sentence to:
The target network gets to see one more step than the Q-network does, and thus is a better predictor.
Hopefully this prevents others from the same confusion :)
Thanks for this! I’ve changed the sentence to:
The target network gets to see one more step than the Q-network does, and thus is a better predictor.
Hopefully this prevents others from the same confusion :)