I think forward-forward is basically a drop-in replacement for backprop: they’re both approaches to updating a set of adjustable parameters / weights in a supervised-learning setting (i.e. when there’s after-the-fact ground truth for what the output should have been). FF might work better or worse than backprop, FF might be more or less parallelizable than backprop, whatever, I dunno. My guess is that the thing backprop is doing, it’s doing more-or-less optimally, and drop-in replacements for backprop are mainly interesting for better scientific understanding of how the brain works (the brain doesn’t use backprop, but also the brain *can’t* use backprop because of limitations of biological neurons, so that fact provides no evidence either way about whether backprop is better than [whatever backprop-replacement is used by the brain, which is controversial]). But even if FF leads to improvements over backprop, it wouldn’t be the kind of profound change you seem to be implying. It would look like “hey, now the loss goes down faster during training” or whatever. It wouldn’t be progress towards autonomous learning, right?
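(For concreteness, here’s a minimal sketch of the forward-forward idea as I understand it from Hinton’s talk: each layer is trained with a purely *local* objective, pushing its “goodness” — the sum of squared activations — above a threshold for positive (real) data and below it for negative (fake) data, with no gradients flowing between layers. The exact hyperparameters and data here are made up for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

def goodness(h):
    # "goodness" of a hidden vector: sum of squared activations
    return np.sum(h * h, axis=-1)

def train_layer(x_pos, x_neg, dim_out, steps=500, lr=0.03, theta=2.0):
    """Train one layer with a local logistic loss on goodness (no backprop chain)."""
    W = rng.normal(0.0, 0.1, size=(x_pos.shape[1], dim_out))
    n = len(x_pos)
    for _ in range(steps):
        h_pos = np.maximum(x_pos @ W, 0.0)  # ReLU forward pass
        h_neg = np.maximum(x_neg @ W, 0.0)
        # p = sigmoid(goodness - theta): should be ~1 for positives, ~0 for negatives
        p_pos = 1.0 / (1.0 + np.exp(-(goodness(h_pos) - theta)))
        p_neg = 1.0 / (1.0 + np.exp(-(goodness(h_neg) - theta)))
        # hand-derived gradient of the local loss w.r.t. W
        # (d goodness / dW uses only this layer's activations)
        grad = (-2.0 * x_pos.T @ ((1.0 - p_pos)[:, None] * h_pos) / n
                + 2.0 * x_neg.T @ (p_neg[:, None] * h_neg) / n)
        W -= lr * grad
    return W

# toy "positive" data clustered away from the origin, "negative" data near it
x_pos = rng.normal(1.0, 0.3, size=(64, 8))
x_neg = rng.normal(0.0, 0.3, size=(64, 8))
W = train_layer(x_pos, x_neg, dim_out=16)
h_pos = np.maximum(x_pos @ W, 0.0)
h_neg = np.maximum(x_neg @ W, 0.0)
print(goodness(h_pos).mean() > goodness(h_neg).mean())  # → True
```

The key contrast with backprop is that the update for this layer never depends on downstream layers’ errors — stack several of these and each trains on its own, which is exactly why it’s biologically more plausible.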
Tbc, my understanding of FF is “I watched Hinton explain it on YT”. My scary-feeling is just based on feeling like it could get close to mimicking what the brain does during sleep, and that plays a big part in autonomous learning. Sleeping is not just about cycles of encoding and consolidation; it’s also about mysterious tricks for internally reorganising and generalising knowledge. And/or maybe it’s about confabulating sensory input as adversarial training data for learning to discern between real and imagined input. Either way, I expect there to be untapped potential for ANN innovation at the bottom, and “sleep” is part of it.
On the other hand, if they don’t end up cracking the algorithms behind sleep and the like, this could be good wrt safety, given that I’m tentatively pessimistic about the potential of the leading paradigm to generalise far and learn to be “deeply” coherent.