I think we can clearly conclude that cortex doesn’t do what NNs do, because cortex is incapable of learning a conditioned response; that is an uncontested fiefdom of the cerebellum, while for NNs, learning a conditioned response is the simplest thing to do. It also crushes the hypothesis of the Hebbian rule. I think the majority of people in the neurobiology neighbourhood haven’t properly updated on this fact.
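For contrast, here is a minimal sketch (my own toy illustration, not anything from the thread) of how trivially an NN-style learner picks up a conditioned response: a single linear unit trained with the delta rule comes to respond to a conditioned stimulus (CS) after it has been paired with an unconditioned stimulus (US):

```python
# Toy illustration: a single linear "neuron" learns a conditioned
# response via the delta rule. Input x = [US, CS]; target = respond
# (1.0) whenever the US is present. After paired US+CS trials, the
# CS alone comes to elicit the response.

def train(trials, lr=0.2):
    w = [0.0, 0.0]  # weights for [US, CS]
    for x, target in trials:
        y = w[0] * x[0] + w[1] * x[1]  # neuron output
        err = target - y               # prediction error
        w = [w[i] + lr * err * x[i] for i in range(2)]
    return w

# Pairing phase: CS always accompanies the US.
paired = [([1.0, 1.0], 1.0)] * 50
w = train(paired)
cs_alone_response = w[1] * 1.0  # response to the CS presented alone
```

By symmetry the two weights converge to 0.5 each, so the CS alone drives roughly half the full response — which is exactly the "conditioning" the comment describes as trivial for NNs.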
We now have several different architectures that reach parity with, but do not substantially exceed, the transformer: RWKV (an RNN), xLSTM, Mamba, Based, etc. This implies they share a common bottleneck and that most gains come from scaling.
It can also imply that the shared bottleneck is a property of the overall approach.
is simpler than the manifest known mechanisms of self attention, multi-layer perceptron, backprop and gradient descent
I don’t know where you get “simpler”. A description of each thing you mentioned can fit in, what, a paragraph, a page? I don’t think Steven expects a description of the “simple core of intelligence” to be shorter than a paragraph describing backprop.
Beren Millidge is, and he’s written that “it is very clear that ML models have basically cracked many of the secrets of the cortex”
I guess if you look at the brain at a sufficiently coarse-grained level, you would discover that lots of parts of the brain perform something like generalized linear regression. That would be less a fact about the brain and more a fact about reality: generalized linear dependencies are everywhere, and it’s useful to learn them. It’s reasonable that the brain also learns what a transformer learns. That doesn’t mean it’s the only thing the brain learns.
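To make the point concrete, here is a hedged toy example (mine, not the commenter’s): plain least squares, the simplest generalized-linear learner, recovers a linear dependency exactly:

```python
# Toy example: ordinary least squares recovers y = 2x + 1 from data,
# illustrating how easy linear dependencies are to pick up.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]       # data generated by y = 2x + 1
slope, intercept = fit_line(xs, ys)  # recovers (2.0, 1.0)
```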
Sure, “The Cerebellum Is The Seat of Classical Conditioning.” But I’m not sure it’s the only one. Delay eyeblink conditioning is cerebellar-dependent, which we know from lesion studies. This does not generalize to all conditioned responses:
Trace eyeblink conditioning requires the hippocampus and medial prefrontal cortex in addition to the cerebellum (Takehara 2003).
Fear conditioning is driven by the amygdala, not the cerebellum.
Hebbian plasticity isn’t crushed by cerebellar learning. Cerebellar long-term depression is a timing-sensitive variant of Hebb’s rule (van Beugen et al. 2013).
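As one illustration of what a timing-sensitive Hebbian variant looks like (this is a generic STDP-style rule, assumed here for illustration — not the specific cerebellar-LTD model of van Beugen et al.):

```python
# Generic spike-timing-dependent plasticity (STDP) sketch: the weight
# change depends on the relative timing of pre- and postsynaptic
# spikes. Pre-before-post potentiates; post-before-pre depresses.

import math

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12, tau=20.0):
    dt = t_post - t_pre  # spike-time difference in ms
    if dt > 0:           # pre fired first -> potentiation
        return a_plus * math.exp(-dt / tau)
    else:                # post fired first (or together) -> depression
        return -a_minus * math.exp(dt / tau)

dw_pot = stdp_dw(t_pre=0.0, t_post=5.0)  # positive weight change
dw_dep = stdp_dw(t_pre=5.0, t_post=0.0)  # negative weight change
```

The point is only that correlation-based plasticity and timing sensitivity are compatible; timing-dependent rules are refinements of Hebb’s idea, not refutations of it.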
what! big if true. what papers originated this claim for you?
Here are lots of links.
What? This isn’t my understanding at all, and a quick check with an LLM also disputes this.