tailcalled comments on Price’s equation for neural networks

tailcalled 21 Dec 2022 13:49 UTC
2 points
0
Actually, upon further thought: for something like policy gradients, in the limit where the probability $p$ is close to $0$, $\operatorname{ntvar}(a)$ would probably be more like $O(p^{2})$? You get a factor of $p$ from the probability, and then an additional factor of $p(1 - p) = O(p)$ from the derivative of the sigmoid/softmax, and the two multiply to give $O(p^{2})$.
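
A quick numerical sanity check of that scaling, as a rough sketch only. It assumes a single-logit sigmoid policy and that the quantity in question scales like the probability times the derivative of the sigmoid; the names `sigmoid` and `scaling_term` are made up for illustration, and this does not compute the actual variance term.

```python
import math

def sigmoid(x):
    """Logistic function, mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def scaling_term(logit):
    """Return (p, p * dp/dlogit) for a sigmoid policy.

    Assumption: the term of interest is proportional to the action
    probability p times the sigmoid derivative p * (1 - p).
    """
    p = sigmoid(logit)
    dp_dlogit = p * (1.0 - p)  # derivative of sigmoid w.r.t. its logit
    return p, p * dp_dlogit

# As the logit goes very negative, p -> 0 and term / p^2 = 1 - p -> 1,
# which is what O(p^2) scaling looks like.
for logit in (-2.0, -4.0, -8.0, -12.0):
    p, term = scaling_term(logit)
    print(f"p={p:.3e}  term={term:.3e}  term/p^2={term / p**2:.6f}")
```

The ratio `term / p**2` equals `1 - p` exactly, so it tends to 1 as $p \to 0$. For softmax the diagonal derivative has the same $p_i (1 - p_i)$ form, so the same argument should carry over.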