I have a pet theory (e.g. here) that the cerebellum is a subsystem that exists for the sole purpose of mitigating how slow other parts of the brain (especially cortex) are. Basically, it memorizes patterns like “under such-and-such circumstance, the cortex is going to send a signal down this particular output line”. A.k.a. supervised learning. Then when it sees that “such-and-such circumstance”, it preemptively sends the same signal itself. There are lots of cortical output lines (including both cortex-to-cortex signals and cortex-to-brainstem/muscles signals) so the cerebellum winds up being pretty big. Also, the way supervised learning works is, the more different contextual information you feed as an input, the more accurate a mimicker the cerebellum can be. If an elephant brain is unusually slow, that would seem to call for an unusually accurate and comprehensive and fast-learning cerebellum, I guess. For example, if it screws up the preemption for a leg motion, then the leg will be moving incorrectly for some substantial amount of time before the motor cortex can belatedly send a better signal, and the animal is liable to trip and fall in the meantime. Or maybe it’s cost rather than benefit: i.e., an unusually accurate and comprehensive and fast-learning cerebellum would be beneficial for any animal, but only big animals can afford the extra weight.
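To make the “mimic the cortex via supervised learning” idea concrete, here is a minimal toy sketch, not a model of real circuitry. Everything in it is my own assumption for illustration: the “cortex” is a fixed linear function of four context signals (`cortex_output`), and the “cerebellum” is a delta-rule learner (`train_mimic`) that only sees the first `n_inputs` of them. It just shows the point that feeding the learner more context makes it a more accurate mimic.

```python
import random

def cortex_output(context):
    # Hypothetical stand-in for the slow cortex: a fixed linear function
    # of four context signals (the weights are made up for illustration).
    weights = [0.8, -0.5, 0.3, 0.6]
    return sum(w * x for w, x in zip(weights, context))

def train_mimic(n_inputs, n_steps=5000, lr=0.05, seed=0):
    # Delta-rule (LMS) learner that sees only the first n_inputs context
    # signals and is trained to reproduce the cortex's output.
    rng = random.Random(seed)
    w = [0.0] * n_inputs
    for _ in range(n_steps):
        context = [rng.uniform(-1, 1) for _ in range(4)]
        seen = context[:n_inputs]
        prediction = sum(wi * xi for wi, xi in zip(w, seen))
        error = cortex_output(context) - prediction  # supervised error signal
        for i in range(n_inputs):
            w[i] += lr * error * seen[i]
    return w

def mean_squared_error(w, n_trials=1000, seed=1):
    # Evaluate the trained mimic on fresh contexts.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        context = [rng.uniform(-1, 1) for _ in range(4)]
        prediction = sum(wi * xi for wi, xi in zip(w, context[:len(w)]))
        total += (cortex_output(context) - prediction) ** 2
    return total / n_trials

# Mimicry error when the learner sees 1, 2, or all 4 context signals.
errors = {n: mean_squared_error(train_mimic(n)) for n in (1, 2, 4)}
for n, err in errors.items():
    print(f"{n} context inputs -> mean squared error {err:.4f}")
```

In this toy setup the error shrinks as the learner is given more of the context, and it goes to essentially zero once it sees everything the “cortex” sees.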
I don’t know off the top of my head if an elephant brain is in fact slow. (Seems plausible.) My vague memory is that axons can be faster or slower depending on thickness and myelination.
There also might be some tradeoff between “per-neuron metabolic cost” and how squished together everything is, such that a less-space-constrained animal would benefit from having a physically-larger brain doing the same amount of processing with the same number of neurons. This page suggests that the number of non-cerebellar neurons in the elephant brain is a mere 6 billion…
Your sensory processing theory would be checkable by looking at the relative size of different parts of a blue whale brain etc. I haven’t done that, seems like an interesting thing to look into.
As for the elephant’s oversized cerebellum, I’ve heard it suggested that it’s for controlling the trunk. Elephant trunks are able to manipulate things with extreme dexterity, allowing them to pluck individual leaves, pick up heavy objects, or even paint. Since the cerebellum is known for “smoothing out” fine motor control (basically acting as a giant library of learned reflexes [including cognitive reflexes], as I understand it), it makes sense that elephant cerebellums would become so large as their trunks evolved to act as prehensile appendages.
According to this, the human brain has about 15 billion neurons in the telencephalon (neocortex, etc.), 70 billion in the cerebellum, and 1 billion in the brainstem. So it sounds like we still have much more circuitry dedicated to generalized abstract intelligence than elephants; they just have better dexterity with a more complex appendage than human limbs (minus the fingers but plus a ton of complexly interacting muscles). If we had cerebella closer in size to the elephant’s, all humans would probably be experts in gymnastics, martial arts, painting, and playing musical instruments.
The “cerebellum is for fine motor control” view is now long out of date and has been decisively disproven. I’m not going to link all the relevant articles to back that up in this comment, but I will in a future update to an earlier brain architecture post.
The cerebellum is compartmentalized into feed-forward modules that are not much connected to each other, but instead are tightly connected to corresponding cortical regions through thalamic relay, and thus also to corresponding basal ganglia regions (and perhaps more).
The cerebellum is crucially involved in nearly everything the cortex does, as the two are not even functionally distinct. In general the brain is best understood as a collection of tightly coupled BG-thalamic-cortex-cerebellum recurrent processing modules, each of which has different types of local cross-connectivity across modules in the different brain structures the loops traverse.
While fine motor control is certainly far from all that the cerebellum does, it is also certainly something that it does really do (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4347949/). From my understanding, it learns to smooth out motor trajectories (as well as generalized trajectories in cortical processing) using feed-forward circuitry (with feedback of error signals from the inferior olive acting as just a training signal), which is why I called it “learned reflexes”. And, as you mentioned, this feed-forward “reflexive” trajectory-smoothing extends to all cognitive processes.
I have come to see the basal ganglia as helping to decide what actions to take (or where information is routed to), while the cerebellum handles more how the actions are carried out (or how to adjust the information transferred between cortical regions). And it has all the computational universality of extremely wide feed-forward neural networks (https://arxiv.org/abs/1709.02540) due to its huge number of neurons. Maybe this would play into your idea in your other comment about how cerebellar outputs might also help train the cortex.
Actually I think it’s unclear if the cerebellum does anything on its own, even motor control, because as explained above, from a connectivity standpoint it doesn’t even make sense to talk about the cerebellum outside of tightly coupled cortical-cerebellar-BG-thalamic loop modules. Those are the actual functional connectivity modules of the brain. The brain is essentially a large society of those modules. Humans lacking a cerebellum (or even with sudden cerebellum damage) do not completely lack any specific motor function, but instead show a wide range of mental deficits. But humans lacking motor cortex do show motor deficits (from what I recall).
However, the “cerebellum as training/inversion module” theories do generally predict that the cerebellum is much more important for motor than for sensory cortex, simply because of the nature of the sensor → latent → motor process, which is compressive (and thus mostly UL-based) in the first stage and expansive (and thus mostly RL-based) in the latter stage. Those theories also correctly predict the lack of cerebellar mapping in the earliest sensory cortex (as there is no need to send a training signal to the retina), which is exactly what you find. Other theories do not make these specific, correct, testable predictions.
That being said, yes, agreed: it probably “learns to smooth out motor trajectories” as a subset of learning to backprop fine-grained credit assignment through time for the cortex (which is much more important for motor than for sensory cortex).
And yes agreed on the BG role.
As for ‘huge number of neurons’: this is a red herring. The unit of computation is the synapse, not the neuron. The tiny granule cells only have 3-4 synapses each, and it’s pretty clear they are doing some decompression/recoding of incoming channel/bandwidth-constrained signals. Regardless, their contribution to total compute power is minor: not even in the trillions of synapses.
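As a rough sanity check on the “not even in the trillions” point, here is the back-of-envelope arithmetic using only figures already quoted in this thread (about 70 billion cerebellar neurons in humans, 3-4 synapses per granule cell). Treating every cerebellar neuron as a granule cell is my own simplification, so this is an upper bound under that assumption, not a measured synapse count.

```python
# Figures quoted earlier in the thread; counting every cerebellar neuron
# as a granule cell is a simplifying assumption (it gives an upper bound).
cerebellar_neurons = 70e9
synapses_per_granule_cell = (3, 4)

low, high = (cerebellar_neurons * n for n in synapses_per_granule_cell)
print(f"granule-cell synapses: {low:.2e} to {high:.2e}")
print("below one trillion:", high < 1e12)
```

That comes out to roughly 0.2 to 0.3 trillion, which is consistent with the claim above.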
Whoops. When researching this post I misread “cerebrum” as “cerebellum”. That’s embarrassing. Thank you for the correction. I have made a note in the original post.
I think the “cerebellum as faster feed-forward distillation of recurrent cortex” is an interesting possibility, but the cortex also does distillation itself through hippocampal relay and has fast feed-forward modes. So I recently started putting more likelihood on the idea that the cerebellum is instead part of the learning system that helps train the cortex, in particular assisting with historical credit assignment by learning some approximate inversion, TD-style or otherwise.
There are a number of interesting general proposals in the “how the brain implements backprop through time” literature, and some of the more interesting recent ones involve the combination of diffuse nonspecific reward signals (i.e. dopamine and serotonin projections) and specific learned inversions working together to provide BP-quality credit assignment or possibly even better (as you aren’t constrained to a 1st-order gradient approximation).
All that said, the brain is definitely redundant, and the cortex implements reasonably powerful UL all on its own (e.g. hierarchical sparse coding), but it seems likely that it also employs something at least as good as (or likely better than) gradient backprop, and learned inversions are a leading candidate. And as they are trained through a tight, timing-sensitive distillation process on a large data set, it makes sense to use a big feed-forward layer. This also explains why the lowest sensory cortex modules (i.e. V1) are the only cortical regions that lack supporting cerebellum modules.
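For readers unfamiliar with what “TD-style” credit assignment means, here is a minimal textbook TD(0) sketch on a toy chain of states. This is not a claim about how the cerebellum or cortex does it; the chain, the single reward at the end, and the parameters (`gamma`, `alpha`, `episodes`) are all assumptions picked for illustration. It only shows how a reward that arrives late gets credited backward to earlier states over repeated experience.

```python
def td0_chain(n_states=4, gamma=0.9, alpha=0.5, episodes=200):
    # values[s] estimates the discounted future reward from state s.
    # The extra last entry is the terminal state, which stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            # Reward arrives only on the final step of the chain.
            reward = 1.0 if s == n_states - 1 else 0.0
            # TD error: how much better or worse things went than predicted.
            td_error = reward + gamma * values[s + 1] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]

values = td0_chain()
for s, v in enumerate(values):
    print(f"state {s}: value {v:.3f}")
```

After enough episodes the earliest state carries a discounted estimate of the final reward (gamma cubed in this four-state chain), even though it never sees the reward directly.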
I should point out though that these aren’t even necessarily distinct computations—because both involve learning a form of predictive temporal distillation—whether it’s predicting the output or predicting some training signal of the output.
(The piriform cortex is not in the cerebellum…)