I wrote something about this a while back: in short, with a squint, gradient descent and natural selection are the same.
From my point of view, one thing that’s particularly relevant is that they’re both operating locally, with little or no foresight, over a high-dimensional design space. You could look at GD as selecting among all the possible local steps, ‘competing’ them based on the heuristic of their local loss gradient (as approximated by the sampled, dataset-derived estimator).
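To make the squint concrete, here’s a toy sketch (my own illustration, not from the original post): an evolutionary-style procedure that “competes” random local candidate steps and promotes the lowest-loss one, next to an ordinary gradient step on the same objective. The quadratic loss and all parameter values are hypothetical choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w):
    # A simple quadratic bowl as a stand-in objective.
    return float(np.sum(w ** 2))

def select_best_step(w, n_candidates=256, step=0.05):
    """'Compete' random local steps and promote the lowest-loss one.

    A toy selection procedure, illustrating the view of GD as
    'selecting among all the possible local steps'.
    """
    candidates = w + step * rng.standard_normal((n_candidates, w.size))
    losses = [loss(c) for c in candidates]
    return candidates[int(np.argmin(losses))]

def gradient_step(w, lr=0.05):
    # For loss(w) = sum(w^2), the exact gradient is 2w.
    return w - lr * 2 * w

w0 = rng.standard_normal(10)
w_selected = select_best_step(w0)   # selection among sampled local variants
w_gd = gradient_step(w0)            # deterministic step down the gradient
print(loss(w0), loss(w_selected), loss(w_gd))
```

Both procedures move downhill from the same starting point; the gradient step just replaces an explicit competition with the local first-order heuristic.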
This confusion comes about because natural selection has no mechanism to maintain variation. Equivalently, gradient descent can only work with the data it is provided; in other words, it has no “proposal” step of the kind found in MCMC methods such as Gibbs sampling. So the idea that gradient descent and natural selection are the same feels intuitive to me.
It is also known that some models in evolutionary game theory recover Fisher’s fundamental theorem of natural selection by treating the replicator equation (a model of natural selection) as a gradient flow; see this arXiv paper. [Might have bungled the explanation on this one, so take it with some salt.]
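For what it’s worth, the constant-fitness version of this is easy to check numerically (my own sketch, with made-up fitness values): under the replicator equation, the rate of increase of mean fitness equals the variance in fitness, which is Fisher’s fundamental theorem for this simple model.

```python
import numpy as np

# Constant fitnesses for three types; hypothetical numbers for illustration.
f = np.array([1.0, 2.0, 3.5])
x = np.array([0.5, 0.3, 0.2])  # population frequencies, summing to 1

dt = 1e-4
mean_f = lambda x: float(f @ x)
var_f = lambda x: float(f ** 2 @ x - (f @ x) ** 2)

# One Euler step of the replicator equation: dx_i/dt = x_i (f_i - mean fitness).
m0, v0 = mean_f(x), var_f(x)
x_next = x + dt * x * (f - m0)

# Observed rate of change of mean fitness vs. the variance in fitness.
rate = (mean_f(x_next) - m0) / dt
print(rate, v0)
```

Because mean fitness is linear in the frequencies, the discrete rate here matches the variance essentially exactly, not just to first order in `dt`.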
Some key practical differences between different instantiations of GD/NS will lie in the effective ‘proposal’/generation procedures and ‘promotion’/selection heuristics.