What do we mean when we say the brain computes?

A sequence of posts on frameworks for brain computation

I plan on making a sequence of posts that discusses current frameworks for how brains compute. What do I mean by computation here? Roughly speaking, I want to answer how the activity of neurons implements cognitive function. You might say: well that’s not an issue, neurons fire, they cause other neurons to fire, those knock down some other neural dominoes, and then eventually they cause a motor neuron to fire and you say “Eureka, I have cognated!” This doesn’t seem like a great explanation to me, and I believe there are multiple frameworks starting to come together in neuroscience, artificial intelligence, cognitive science, dynamical systems theory, etc. etc. that provide an alternative to the standard point of view for how brains compute, hereafter called The Standard View. Hopefully some others will find this interesting. In this first post I’m going to talk about The Standard View and handwave about its shortcomings[1]. Ultimately I believe the shortcomings are most clearly understood when directly compared to alternative frameworks.

Everyone seems to think that the brain computes

Neuroscientists often say that “the brain computes.” This strikes most people as such an obvious statement that follow-up questions like “what does compute mean in that phrase, exactly?” are met with a sigh that speaks quite clearly—“Please keep the philosophy aside, we are trying to do actual science here.” But my day job has been in experimental neuroscience for more than a decade now, and I’ve never been able to shake the feeling that a lot of our understanding of the brain hinges on figuring out what we mean by compute when we say the brain computes. That’s what I’m going to discuss here.

What I’m not interested in is whether the brain is a Turing machine or fits some other formal definition of a computer. Instead, I want to take the intuitive notion that the brain computes as a given, figure out what a typical neuroscientist might mean by it, and ask how we can understand natural systems computing more generally.

Single Neuron Computation is a straightforward case of the brain computing

A popular textbook in neuroscience called The Biophysics of Computation by Christof Koch famously starts with the phrase in question. It’s maybe no surprise that the book focuses mostly on the biophysics of single neurons, where the concept of computation has a straightforward interpretation. A single neuron gets some input in the form of synapses, the physiology of that neuron integrates those inputs through some (often quite complicated) nonlinear function, and produces an output in the form of a binary spike. The transformation from synaptic input to spiking output is what the word computation refers to.

We’ve learned a lot about this kind of computation since the textbook was published, and we are still learning more. For instance, a paper came out recently describing how human cortical neurons can use their biophysics to compute an XOR function on their inputs. Single Neuron Computation, the subfield of neuroscience I’m talking about here, has a decades-long history, has clearly contributed to our understanding of the brain, and undoubtedly has more to contribute.
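To give a cartoon of why that finding is interesting: the reported mechanism involves dendritic responses that are strongest for intermediate input, and a unit with a non-monotonic response can compute XOR, which no single threshold unit can. The sketch below is purely illustrative; the response function and all numbers are made up, not taken from the paper.

```python
import math

def dendritic_response(drive, threshold=0.5, width=2.0):
    # Hypothetical non-monotonic response: silent below threshold,
    # strongest just above it, shrinking as the drive grows further.
    if drive < threshold:
        return 0.0
    return math.exp(-(drive - threshold) / width)

def dendritic_xor(a, b):
    # Each active input adds one unit of drive. One active input
    # lands near the response peak; two overshoot and suppress it.
    return dendritic_response(a + b) > 0.5

# Truth table: only exactly one active input evokes a response.
```

A plain threshold unit is monotonic in its total drive, so it can never respond to one input but not to two; the dip in the response curve is doing all the work here.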

The standard view of whole brain computation is from sensory input to behavioral output

A seemingly good idea would be to apply this notion of computation to the brain as a whole. Let’s do that now. Generalizing from how single neurons compute, here is a plan for talking about any system computing:

  1. Identify what counts as an input.

  2. Identify what counts as an output.

  3. Observe, describe, and/​or understand how the input is transformed into output.

In the case of single neurons those steps went as follows:

  1. The input is synaptic signals from upstream neurons.

  2. The output is the action potential (spike) generated by the neuron in question.

  3. The transformation of input to output is dictated by the distribution of nonlinear ion channels in the neuron’s membrane. Different neurons have different distributions of ion channels in their dendrites, which give rise to different input-output functions. Using a number of methods, including patch-clamp and computational modeling techniques, we can describe the input/​output transformation and understand the membrane channel contributions to that transformation.
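The three steps can be sketched with the simplest textbook model, a leaky integrate-and-fire neuron. This is a deliberate cartoon (real input-output functions are far richer, as the XOR example shows), and the parameter values are arbitrary rather than fit to any real neuron:

```python
def lif_neuron(synaptic_input, dt=1.0, tau=10.0, threshold=1.0):
    """Leaky integrate-and-fire: step 1 is the input sequence,
    step 2 is the binary spike train returned, and step 3 is the
    leaky integration plus threshold nonlinearity in between.
    All parameters are illustrative."""
    v = 0.0
    spikes = []
    for drive in synaptic_input:        # step 1: synaptic input
        v += (dt / tau) * (-v + drive)  # leaky integration
        if v >= threshold:              # threshold nonlinearity
            spikes.append(1)            # step 2: output spike
            v = 0.0                     # reset after spiking
        else:
            spikes.append(0)
    return spikes
```

Feed it a constant suprathreshold drive and it fires regularly; a subthreshold drive produces no spikes at all. The whole subfield is, loosely, about replacing this cartoon with the real, channel-by-channel transformation.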

Ok so now what are the inputs for the brain as a whole? The standard response here is sensory input—auditory disturbances of the air near your eardrum caused by a crying baby, the photons hitting your retina after they traveled 93 million miles from the sun and bounced off a tree, etc. etc. What are the outputs? The nerve cell activity that directly controls the state of the muscles and other structures in your body, including perhaps the state of your tongue and vocal cords when you speak, the modulatory signals to your heart and enteric nervous system that kick into gear when you are anxious or nervous, and the signals sent to your spinal cord that allow you to type precisely on a keyboard with little active thought. The transformation of input to output is thought of as essentially a biologically instantiated artificial neural network. If you’re studying vision it’s not that different from a convolutional neural network; more generally you can think of it as a recurrent neural network. We can make this picture more complicated, but this gets at the standard view.
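In code, this whole-brain picture is just the single-neuron recipe scaled up: a recurrent network from sensory input to motor output. Everything below (sizes, weights, time steps) is a random placeholder, a sketch of the shape of the claim rather than a model of any nervous system:

```python
import numpy as np

rng = np.random.default_rng(0)

# The Standard View as a cartoon: one big recurrent network that
# transforms sensory input into motor output. All sizes and weights
# are arbitrary placeholders.
n_sensory, n_hidden, n_motor = 4, 16, 2
W_in = rng.normal(0.0, 0.5, (n_hidden, n_sensory))
W_rec = rng.normal(0.0, 0.3, (n_hidden, n_hidden))
W_out = rng.normal(0.0, 0.5, (n_motor, n_hidden))

def brain_step(h, sensory):
    """One tick: recurrent state update, then a motor readout."""
    h = np.tanh(W_rec @ h + W_in @ sensory)
    return h, W_out @ h

h = np.zeros(n_hidden)
for t in range(50):
    sensory = rng.normal(size=n_sensory)  # photons, air pressure, ...
    h, motor = brain_step(h, sensory)     # behavior falls out the end
```

Note what the framing emphasizes: `sensory` goes in, `motor` comes out, and everything in between is "the transformation." That emphasis is exactly what I want to interrogate below.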

I’m interested here in the limitations of this framework for understanding how the brain computes. So first let’s look at some positive aspects of this view.

The Good of the Standard View

The Standard View allows the brain to be arbitrarily complex. Nothing in it assumes the brain is mainly feedforward, made up of a single type of neuron, or memoryless. This is good, because real brains are extraordinarily complex. It is often noted that each neuron has ~10,000 synapses and we have ~80 billion neurons. But there is also a variety of timescales at play in brain dynamics. A single spike lasts ~1 ms, the integration time constant of a neuron is ~10 ms, short-term plasticity mechanisms can last ~100 ms, and long-term plasticity involves the dynamics of protein expression changes and trafficking on the seconds or even minutes timescales, and that’s not exhausting the relevant mechanisms for brain computation. There are neuromodulators, and neuropeptides, and astrocyte dynamics, and did you know that synapses are modulated by mechanical forces(?!), and entire classes of phenomena that neuroscientists have yet to characterize fully or to understand the cognitive contexts in which they play an important role. It’s a jungle in there. But the point here is that The Standard View allows this jungle, at least in principle.

The Standard View plays nicely with evolution by natural selection. It is often said that nothing in biology makes sense except in the light of evolution, and outward behavior is a very large contributor to an organism’s evolutionary fitness. Brains do a lot to control outward behavior and how the organism interacts with its environment. Hold the transformation from environmental input to behavioral output fixed, but change the brain structure, and the fitness of the organism has likely not changed much. (Things get a little tricky here since metabolic concerns come into play, but let’s leave that consideration aside.) At the very least it makes sense to care about the transformation of sensory input to behavioral output when thinking from an evolutionary perspective, since selection obviously cares about the outward behavior of an organism in its environment.

The Most Astonishing Fact (MAF) about our brains

But is The Standard View a useful frame for thinking about what is most interesting about human brains? Here is, in my opinion, the Most Astonishing Fact (MAF) about human brains: we can sit still in a dark room with hardly any appreciable dynamics in sensory inputs, and practically the entire contents of our mind are flexibly and dynamically available to us. We can think of an elephant in a pink dress dancing the Macarena while reciting the Russian alphabet, we can replace the trunk of the elephant with a snake, etc. etc. Thus, on a timescale of tens of minutes, we can have a panoply of mental contents swirling around, sometimes turbulent, sometimes orderly and laminar, but changing and combining and creating new mental constructs as time goes on. All of this can happen with negligible change in the sensory inputs and behavioral outputs, as conceived by The Standard View. This strikes me as so incredible that if I’m in the right mood I get a pang of numinous awe.
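There is a toy mathematical echo of the MAF worth keeping in mind: a recurrent network with strong enough random coupling generates rich ongoing activity with no input at all. This is a standard observation from random-network theory (for coupling gain above one the quiescent state is unstable); the specific numbers below are illustrative, and the network is of course not a model of a mind:

```python
import numpy as np

rng = np.random.default_rng(1)

# A random recurrent network with coupling gain g. For g > 1 the
# silent state is unstable and the network typically produces rich,
# ongoing internal activity with zero external input: a crude
# stand-in for a mind wandering in a dark, quiet room.
n, g = 100, 1.5
W = rng.normal(0.0, g / np.sqrt(n), (n, n))

h = rng.normal(0.0, 1.0, n)
states = []
for t in range(200):
    h = np.tanh(W @ h)  # note: no input term anywhere
    states.append(h.copy())
```

The point is not that the brain is such a network, only that "no change in input" in no way implies "no interesting internal dynamics"—which is precisely why the input-output frame feels inadequate here.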

How does The Standard View deal with the MAF?

If, for the moment, we take The Standard View, how do we make sense of the MAF? In my experience I mostly get two related types of explanations from neuroscientists:

  1. Sitting still and thinking about things is ultimately for outward behavior.

  2. Sitting still and thinking about things is a byproduct of an optimization procedure for outward behavior.

Sometimes these are followed by something like “this isn’t really up for debate unless you don’t believe in evolution.” I don’t think either of these explanations helps me understand how the brain computes[2]. Worse, they can mislead you into thinking you’ve found an explanation when you’ve just been answering a different question.

The Standard View can be misleading when thinking about human cognition and AGI

I think The Standard View is misleading when it’s invoked to make sense of the most interesting cognitive phenomena. It’s not so much that I think it’s wrong; it just doesn’t seem to do much in the way of explaining how, for instance, I can sit still and close my eyes and think. Conceiving of inward thought as a process that is useful for behavior does not do much to explain how inward thought actually works. It doesn’t allow me to map from neural activity to thought.

What The Standard View is useful for is giving an evolutionary account of how cognition might have come to be, or even of how explicitly behavioral tasks might be learned. For instance, using The Standard View, I can reason that a good way to create an autonomous vehicle is to apply some kind of optimization pressure that is a function of behavior; thus, focus on finding the right loss function. This is all well and good.

But what it doesn’t do is answer how such systems actually solve those tasks, once learning on whatever timescales is done.

Now when I bring that point up with neuroscientists I sometimes get “well how the system computes is not a real question.” Once, a few years ago, when I thought I was having a friendly discussion with an established professor, I brought this up, and his response was “well I’m going to continue publishing in Science while you keep on wondering how to understand the brain. So good luck with that.”[3]

A flavor of neuroscience research using deep learning has become more popular over the last few years. It goes like this: train your animal subject to perform some task while recording from tens to thousands of neurons, then train a deep neural network to perform the same (or a similar) task. Now compare the activations in the deep network to the recorded neurons. If they match, claim that you’ve understood how the brain computes this task.
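The comparison step is often done with something like representational similarity analysis: compute how similar the responses to different task conditions are within each system, then correlate those similarity structures across systems. Here is a minimal sketch of that comparison using random stand-in data; in a real study `brain` and `network` would hold spike counts and layer activations:

```python
import numpy as np

rng = np.random.default_rng(2)

# Fake data: each row is the population response to one task
# condition. Real studies would use recorded spike counts and
# deep-network unit activations.
n_conditions = 20
brain = rng.normal(size=(n_conditions, 50))     # 50 recorded neurons
network = rng.normal(size=(n_conditions, 128))  # 128 model units

def rdm(activity):
    """Representational dissimilarity matrix: one minus the
    correlation between activity patterns for each condition pair."""
    return 1.0 - np.corrcoef(activity)

# Correlate the two RDMs' upper triangles; a high value is what
# gets reported as the network "matching" the brain.
iu = np.triu_indices(n_conditions, k=1)
similarity = np.corrcoef(rdm(brain)[iu], rdm(network)[iu])[0, 1]
```

Notice what the number measures: agreement between two sets of activity patterns. Nothing in the computation says what either pattern means, which is exactly the gap I worry about next.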

The reason this can be misleading for both neuroscience and AGI is that it’s easy, in an approach like this, to think you have a system that computes in a similar way to an intelligent system without doing the hard work of understanding what it even means to compute, in the way that, for instance, we know brains do. The fact that the artificial neural activations in your deep network look (loosely) like biological neural activity sounds cool, but unless we understand the relationship between the artificial neural activity and computation, we haven’t touched that question. The most interesting computations that human brains perform are not directly tied to behavioral output, nor can they be conceived of as a transformation from sensory input to behavioral output.

Increasingly high level cognition is increasingly distant from the transformation between sensory input and behavioral output

To make more concrete how the computations that brains carry out in the service of high-level cognition are not usefully conceived of as transformations from sensory input to behavioral output, let’s bring in some neuroscience. The neocortex is generally thought to be the brain structure responsible for our most human of cognitive functions. We will outline a theory of how the neocortex evolved, and then try to use it, together with The Standard View, to make sense of how the neocortex supports cognition.

Here is a theory of how the neocortex and higher-level cognition evolved. First there was no neocortex. All was lizard brain (subcortical brain structures). All was reflex, mediated by a small number of mostly feedforward synaptic chains connecting sensory input to motor output. These chains allow an organism to, for instance, run in the opposite direction of a loud noise, or close an eyelid when wind blows into it. Slowly, the pathways that mediate reflexes branched off into secondary pathways. The original reflexive pathways were kept intact, but now information was copied and offloaded to another group of neurons (the neocortex), where it could be processed and later reincorporated into the reflex chains. This gives us a simple modulation of a reflex. Eventually these secondary pathways became more complicated, branched themselves, acquired recurrent loops, and came to support long-term memory and flexible learning. And we’ve got ourselves a neocortex.
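The structure of this story can be sketched as a reflex arc plus a side channel that gates it. Everything here is hypothetical: the functions, the gating rule, and the numbers are made up purely to show the shape of "modulation of a reflex":

```python
def reflex(stimulus):
    # The original feedforward chain: loud noise in, running out.
    return 1.0 if stimulus > 0.5 else 0.0

def cortical_gate(stimulus_copy, context):
    # A copy of the stimulus is offloaded to a secondary pathway,
    # combined with context, and fed back. Made-up rule: the more
    # the context says "this is fine", the stronger the suppression.
    return 1.0 / (1.0 + context * stimulus_copy)

def modulated_reflex(stimulus, context):
    # The intact old reflex, scaled by the secondary pathway's verdict.
    return cortical_gate(stimulus, context) * reflex(stimulus)
```

With `context = 0` the old reflex passes through untouched; increasing `context` suppresses the response to the same stimulus. The theory's claim is that cognition grew out of ever-deeper stacks of side channels like `cortical_gate`.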

I like this theory. It conceives of the neocortex not as an independent processing unit of the brain that allows us to play chess, but as a modulator of evolutionarily old reflexes. Generalizing this away from the brain and to our behavior, it’s fun to think of, e.g., opening lines on Tinder as arising from the modulation of an extremely old evolutionary reflex and subcortical drive to have sex.

This sounds like it could be true. It is how The Standard View positions us to think about Tinder messaging—as the consequence of optimizing a transformation between sensory input (visual cues of a conspecific, pheromones, etc.) and behavior, in this case mating behavior. But it doesn’t sound like a useful way to understand how a brain actually comes up with an opening line on Tinder. The reason is that the more cognitive a thing is, the more it is separated from the reflex. At some point, it is no longer useful to think of a modulation of a modulation of a modulation of a modulation… of a reflex as a modulation of a reflex. Instead, all of those layers of modulation are more usefully thought of as a thing in themselves. A reflex is the direct transformation of a sensory input into a behavioral output. Cognition begins when such reflexes are modulated, and high-level human cognition, the stuff we think of when we think of AGI, is quite a long way separated from our reflexes—that is what makes it cognition.

A neuron-centric view of computation is needed

So if I don’t think The Standard View is the right way to approach cognition, what point of view would be better? I don’t have a completely worked out answer; I have a few leads. In coming posts I hope to explore a few proposals and summarize some of the work that’s been done in those directions. But what I think is needed, generally, is a neuron-centric view of what computation means. We need to be able to find the relevant structure in neural activity that explains cognitive function. This means something more than finding neural correlates of human behavior during cognitive tasks in the lab. It means, loosely, being able to map from neural activity to cognition in such a way that we could apply the map to, e.g., an artificial network and explain its cognition. It means finding neural structure that is internally realized. This and more in future posts!

  1. ^

    Before I dig myself into too deep a hole: there’s some amount of strawmanning in this post. A lot of super interesting work has occurred in both neuroscience and artificial intelligence research that at least touches on the issues I bring up here. I’m not going to go through that work here; maybe in a later post.

  2. ^

    Mathematician Misha Gromov has a series of papers and writings on how he thinks the human brain works. He calls his theory ergobrain, and it is eccentric and amazing, and he makes use of all the different fonts and colors that Microsoft Word has to offer to make his points. Here’s how he puts the point I’m trying to make here:

    In other words, focusing on behavior to study the mental is misguided.

  3. ^

    Instead of saying that researchers don’t care about the MAF, I would say that researchers rightly recognize the MAF as an incredibly difficult problem, and so either work on a different problem or consider that the MAF might be a byproduct that falls out of the solution to some simpler problem. And I must admit that, in all the relevant aspects of my choices of projects and what I work on day to day, I am very much part of that cohort. I spend my days studying the brain very much from The Standard View. I publish papers wholly within the confines of The Standard View, when I give talks I talk as if I have no issues with The Standard View, and even when I criticize others’ work I do so from the perspective of The Standard View.