I feel like this is missing the point. When I first heard neural nets described as large hard-to-interpret matrices, it clicked with me. Why? Because I’d spent years studying the brain.
It’s not the weights that make brains more interpretable as systems. Heck no. All the points made about how biological networks are bad for interpretability are quite on point.
I think brains are more interpretable at a system level for two reasons:
Modularity
Look at the substantia nigra in the brain. That’s a module. It performs a critical, specific, hardwired task. The cortex is somewhat rewirable, but not very. And these modules are mostly located in the same places and perform mostly the same operations across people. Even across different species!
Limited rewirability
The patterns of function in a brain get much more locked in than in a neural network. If you were trying to hide your fear from a neuroscientist who had the ability to monitor your brain activity with implanted electrodes, could you do it? I think you couldn’t, even with practice. You could train yourself to feel less fear, but you couldn’t experience the same amount of fear while hiding the activity in a different part of your brain, even if you also got to monitor your own brain activity and practice. I just don’t think the brain is rewirable enough to manage that.
Some things you could move or obfuscate a bit, but it would be really hard and only partially successful.
So when people like Conjecture talk about building Cognitive Emulations that have highly verifiable functionality… I think that’s an awesome idea! I think that will work! (If given enough time and resources, which I fear it won’t be.) And I think they should use big matrices of numbers to build them, but those matrices should be connected up to each other in specific hardcoded ways with deliberate information bottlenecks.
I do not endorse spiking neural nets as inherently more interpretable. I endorse highly modular systems with deliberate restrictions on their ability to change their wiring.
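To make that concrete, here’s a minimal sketch of what I mean by big matrices wired together through hardcoded bottlenecks. This is illustrative PyTorch; the module names, sizes, and bottleneck width are all made up, not a claim about how Conjecture would actually build it:

```python
import torch
import torch.nn as nn

class BottleneckedModularNet(nn.Module):
    """Two learned modules whose only communication path is a fixed,
    narrow channel. The routing itself is hardcoded and never trained,
    so function can't silently migrate between modules."""
    def __init__(self, in_dim=128, hidden=512, bottleneck=8, out_dim=10):
        super().__init__()
        # Module A: free to learn anything internally.
        self.module_a = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden)
        )
        # The only channel from A to B: a low-dimensional projection.
        self.bottleneck = nn.Linear(hidden, bottleneck)
        # Module B only ever sees the bottleneck activations.
        self.module_b = nn.Sequential(
            nn.Linear(bottleneck, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, x):
        a_out = self.module_a(x)
        z = self.bottleneck(a_out)  # everything B knows passes through z
        return self.module_b(z), z  # expose z so it can be probed directly
```

The point is that nothing module B computes can depend on more than the handful of numbers in z, so auditing the A-to-B interface means auditing one small, fixed channel.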
I agree that modularity / information bottlenecks are desirable, compatible with connectionism, and a promising way of making more interpretable AI. Separating agents out into world models and value functions that don’t share gradients, for instance, will likely result in more easily interpretable systems than ones that share gradients through both. Totally support such efforts!
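The “no shared gradients” part is straightforward to enforce in practice. A sketch of what I have in mind (PyTorch, with placeholder shapes, data, and losses rather than a real training setup):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins; shapes, data, and losses are placeholders.
world_model = nn.Linear(16, 16)  # pretend dynamics model
value_fn = nn.Linear(16, 1)      # pretend value head

state = torch.randn(4, 16)
next_state = torch.randn(4, 16)
value_target = torch.randn(4, 1)

# The world model is trained only on its own prediction error.
pred = world_model(state)
wm_loss = ((pred - next_state) ** 2).mean()

# detach() blocks value gradients from flowing into the world model,
# so the value objective can never reshape the world model's
# representations to suit itself.
value_loss = ((value_fn(pred.detach()) - value_target) ** 2).mean()

(wm_loss + value_loss).backward()
# world_model.weight.grad reflects only wm_loss;
# value_fn.weight.grad reflects only value_loss.
```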
I do think that there are likely limits in how modular you can make them, though.
There’s an evolutionary point I thought about bringing up above, which seems like further evidence in favor of the limits of modularity / something I’ll fuzzily point to as “strong connectionism”:
It seems like the smarter animals generally have more of the undifferentiated parts of the brain, which can be repurposed in fairly arbitrary ways. The substantia nigra, right, has been preserved for over half a billion years, and most of the animals that had it were pretty dumb, so it seems unlikely to be key to intelligence, at least higher intelligence. But the animals we tend to think of as smarter have a bunch of things like the cerebral cortex, which is super undifferentiated and very rewirable. My vague recollection is that the smarter birds have a similarly undifferentiated piece of their brain, which is not cortex? And I predict, without having checked, that things like the smarter octopuses will also have bigger undifferentiated zones (even if implemented differently) than stupider animals near them on the evolutionary tree.
I’d be curious if you agree with this? But it seems like further evidence that connectionism is just how intelligence gets implemented: if you want something flexible, which can respond to a whole lot of situations (i.e., intelligence), you’re going to need something super undifferentiated, with some kind of limit on modularity.
I quite disagree. I think that people imagine the cortex to be far more undifferentiated and reconfigurable than it actually is. What humans did to get extra intelligent and able to accumulate cultural knowledge was not just expand their cortex generally. In particular, we expanded our prefrontal cortex, and the lateral areas closely connected to it that are involved in language and abstract reasoning / math. And we developed larger neurons in our prefrontal cortex with more synapses, for better pooling of activity.
Impossible thought experiment: if you took a human infant, removed half their prefrontal cortex, and gave them the same amount of new cortical tissue in their visual cortex… you don’t get a human who rewires their cortex so that the same proportion ends up devoted to prefrontal function (executive function, planning) and the same proportion to vision. What you get is a subtle shift. The remaining prefrontal cortex will expand its influence into nearby regions, and maybe claim about 2% more area than it otherwise would have, but it will still be woefully inadequate. The visual cortex will not shrink much. You’ll have a human that’s terrible at planning and great at visual perception of complex scenes.
When neuroscientists talk about how impressively reconfigurable the cortex is, you have to keep in mind that it’s impressively reconfigurable given that it’s made up of neurons that are mostly locked into place before birth, with very limited ability to change the location of their input zones or output zones.
For example: imagine a neuron is the size of a house. This particular neuron is located in San Diego. Its dendrites stretch throughout the neighborhoods of San Diego. The dendrites have flexibility, in that they can choose which houses in the neighborhood to connect to, but they can’t move more than a few houses in any direction.
Meanwhile, this neuron’s axon travels across the entire United States to end up in New York City. It ends in a particular neighborhood in Manhattan. Again, the axon can move a few buildings in either direction, but it can’t switch all the way from the northern end of Manhattan to the southern end, much less choose to go to Washington DC instead. There is just no way that a neuron with its dendrites in Washington DC is going to end up directly connected to the San Diego neuron.
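If you wanted to impose the same constraint on an artificial network, it might look like a fixed connectivity mask with a small radius around each hardwired target. A toy sketch (the layout, sizes, and radius are all invented for the analogy):

```python
import torch

n_pre, n_post = 100, 100
# "Developmental" wiring: each presynaptic unit's axon is assigned a
# fixed landing spot among the postsynaptic units before any learning.
targets = torch.randint(0, n_post, (n_pre,))

# Connection (i, j) is allowed only if j sits within `radius` of unit
# i's hardwired target -- a few houses in either direction, no more.
radius = 3
post_idx = torch.arange(n_post)
mask = (post_idx.unsqueeze(0) - targets.unsqueeze(1)).abs() <= radius

weights = torch.randn(n_pre, n_post, requires_grad=True)

def effective_weights():
    # Gradient updates can adjust strengths only inside the mask;
    # the long-range wiring itself never moves.
    return weights * mask
```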
When someone loses a chunk of their cortex due to a stroke, they are often able to partially recover. The recovery is due largely to very local rewiring among the surviving, most closely functionally related neurons at the edge of the damaged area. Imagine placing a dime on top of a penny, where the dime is the area lost. The regained function mostly depends on the ring of penny sticking out around the dime. Each neuron in that border region will shift its inputs and outputs more than neurons normally would, traveling maybe ten houses over instead of two. Still not a big change!
And no new neurons are added to most of the brain throughout life. Memory and olfactory systems do a bit of neurogenesis (but those new neurons make only short-range connections, not long-range ones). For the rest of the brain, all of the learning comes from these very subtle rewiring changes, or from changes to the strengths of existing connections. Neurons can be deleted, but not added. That’s a huge restriction that substantially reduces the ability of the network to change the way it works. Especially since the original long-range wiring laid down in fetal development was almost entirely controlled by a small subset of our genetic code. So you don’t get to start with a random configuration; you start with a locked-in, hardwired configuration. This is why we all have a visual cortex at the back of our heads. If the long-range wiring weren’t hardcoded, some people would end up with it someplace else just by chance.
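“Deleted but not added” also has a clean artificial analogue: a prune-only mask, where connections can be switched off but never back on. A toy sketch (names and threshold are illustrative):

```python
import torch

torch.manual_seed(0)
weights = torch.randn(64, 64)
alive = torch.ones_like(weights, dtype=torch.bool)  # fetal wiring: all present

def prune(alive, weights, threshold=0.05):
    """Connections below threshold die. The logical AND means a synapse,
    once gone, can never regrow: the set of reachable wirings only ever
    shrinks over the network's lifetime."""
    return alive & (weights.abs() >= threshold)

alive = prune(alive, weights)
effective = weights * alive  # all later learning happens within this skeleton
```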
I don’t think the fact that neuronal wiring is mostly fixed provides much evidence that the cortex is not reconfigurable in the relevant sense. After all, neural networks have completely fixed wiring and can only change the strength of their connections, but this is enough to support great variability in function.
I have a response, but my response involves diagrams of the “downstream influence” of neurons in various brain circuits vs parameters in transformers. So, I’m working on a post about it. Sorry for the delay.
That’s a very interesting observation. As far as I understand, deep neural networks likewise have completely unlimited rewirability: a particular “function” can exist anywhere in the network, in multiple places at once, or spread out between and within layers. And if you retrain that same network, the function will turn up in another place, in another form. It makes it seem like you’d need something like a CNN to successfully identify functional groups within another model, if it’s even possible.
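A tiny demonstration of that location-independence: permute an MLP’s hidden units and un-permute the next layer’s input columns, and you get a network with identical input-output behavior. Any given “function” can sit at any index:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))
net2 = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))
x = torch.randn(5, 8)

perm = torch.randperm(32)
with torch.no_grad():
    net2[0].weight.copy_(net[0].weight[perm])     # shuffle hidden units
    net2[0].bias.copy_(net[0].bias[perm])
    net2[2].weight.copy_(net[2].weight[:, perm])  # compensate downstream
    net2[2].bias.copy_(net[2].bias)

# Identical function, but every hidden "feature" now lives at a
# different index -- its location carried no intrinsic meaning.
assert torch.allclose(net(x), net2(x), atol=1e-5)
```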