mishka comments on Is SGD capabilities research positive?

mishka 13 Nov 2025 4:24 UTC
6 points
RL vs SGD does not seem to be a correct framing.

Very roughly speaking, RL is about what you optimize for (it covers a subclass of the objectives you could optimize for), whereas SGD is one of many optimization methods. The two are not alternatives: SGD and its cousins are highly useful in RL tasks (consider policy gradients and the like).
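To make the policy-gradient point concrete, here is a minimal sketch, assuming a toy two-armed bandit that I made up for illustration (the function name `train_bandit`, the reward probabilities, the learning rate, and the step count are all hypothetical). The objective is an RL one (expected reward), and the optimizer is plain stochastic gradient ascent on a REINFORCE gradient estimate.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(reward_probs=(0.2, 0.8), steps=5000, lr=0.1, seed=0):
    """Learn a softmax policy over bandit arms with REINFORCE + SGD.

    reward_probs[i] is the (assumed) chance that arm i pays reward 1.
    Returns the final action probabilities of the policy.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(reward_probs)  # policy parameters (logits)
    for _ in range(steps):
        probs = softmax(theta)
        # RL part: sample an action from the policy and observe a reward.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # SGD part: one stochastic gradient step on expected reward.
        # For a softmax policy, d/dtheta_i log pi(action) = 1[i == action] - probs[i].
        for i in range(len(theta)):
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            theta[i] += lr * reward * grad_log_pi
    return softmax(theta)


if __name__ == "__main__":
    print(train_bandit())
```

The sampling and the reward define the RL problem; the parameter update is an ordinary stochastic gradient step, and it could be swapped for another optimizer without changing the objective.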