Thank you for that series! I learnt about it from Scott’s book review and decided to read the original.
The first half of this post is the conventional basic knowledge from neuroscience, as I understand it. I was following and nodding along and thinking “yeah, this is cool, makes sense” until section 1.4, where the previously solid logic started breaking down a bit, or at least it seems so to me.
Before that, when you were talking about predicting, you were talking about predicting sensory input. There is some suspiciously car-shaped sensory input on my retina, and then I get engine-and-tires-shaped sensory input in my ears. I would be less surprised to hear “wrrrr” after I see something car-shaped if I develop a “car” concept and learn to invoke it when I see something car-shaped, which is most likely a car.
Then, if I see a road, I would be less surprised when I hear “wrrrr” if I learn to invoke the “car” concept even before seeing a car, in a situation where cars are likely to appear. “Less surprised” in a technical sense, obviously: I assign more probability to hearing “wrrrr” when I see a road, because of the “car” model being active. There is a learned connection between “road-shaped sensory input” → “ ‘road’ concept” → “ ‘car’ concept” → “prediction of car-shaped sensory input” because the car-shaped sensory input just follows the road-shaped sensory input. When I observe one, I actually expect the other.
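If it helps, here’s the toy picture I have in mind, as a minimal sketch (all the numbers and scene labels are invented, just to pin down what “less surprised in a technical sense” means): the prediction of the sound goes through the “car” concept, so anything that activates that concept in advance, a road included, raises the assigned probability of hearing “wrrrr”.

```python
# Toy generative model: the probability of hearing "wrrrr" depends on whether
# the "car" concept is active, not directly on the raw visual input.
P_WRRRR_GIVEN_CAR = 0.7      # invented number
P_WRRRR_GIVEN_NO_CAR = 0.01  # invented number

def p_car_concept_active(visual_input: str) -> float:
    """Learned associations: how strongly each scene activates the 'car' concept."""
    return {
        "car-shaped blob": 0.95,  # exogenous: I'm literally looking at a car
        "road": 0.40,             # anticipatory: cars are likely to show up here
        "kitchen": 0.02,
    }.get(visual_input, 0.02)

def p_hear_wrrrr(visual_input: str) -> float:
    """Predicted probability of the engine sound, marginalizing over the concept."""
    p_car = p_car_concept_active(visual_input)
    return p_car * P_WRRRR_GIVEN_CAR + (1 - p_car) * P_WRRRR_GIVEN_NO_CAR

for scene in ["car-shaped blob", "road", "kitchen"]:
    print(f"P(hear 'wrrrr' | see {scene!r}) = {p_hear_wrrrr(scene):.3f}")
```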
Then you introduce the distinction between the model being active for “exogenous” and “endogenous” reasons, and start talking about how predicting when a model will be active for endogenous reasons is good for predicting… what? It kind of feels like we’ve lost the “sensory data” in “predicting sensory data”, and now predicting when the concept itself will be active has somehow become the goal in its own right. Predicting when the “car” concept will be active for exogenous reasons was good because it’s likely that the sensory data predictable by the car concept will follow. Is that at all the case with the “endogenous” reasons? Let’s look at the examples in the post:
I’m thinking about screws right now [? not sure]
I’m worried about the screws [NO?]
I can never remember where I left the screws [YES? --- you would kind of go looking for screws and then maybe find them?]
Maybe the “car” concept is active in my mind because it spontaneously occurred to me that it would be a good idea to go for a drive right about now [YES]
Or maybe it’s on my mind because I’m anxious about cars [NO]
It kind of goes either way, and if predicting the sensory input were the actual end goal of the predictive algorithm, wouldn’t the distinction between the two cases be very important, and wouldn’t it be worth predicting only one of them?
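To put rough numbers on “it kind of goes either way” (everything here is invented, just to make the point): if the predictor only knows “the ‘car’ concept is active” and can’t tell why, the best it can do for sensory prediction is a pooled average over cases that differ a lot.

```python
# Invented numbers: how often car-shaped sensory input actually follows,
# depending on *why* the "car" concept became active.
p_input_given_reason = {
    "exogenous: I see a car":         0.95,
    "endogenous: planning a drive":   0.60,
    "endogenous: anxious about cars": 0.05,
}
# Invented: how often each reason occurs.
p_reason = {
    "exogenous: I see a car":         0.50,
    "endogenous: planning a drive":   0.25,
    "endogenous: anxious about cars": 0.25,
}

# A predictor that only sees "concept active" gets the pooled average...
pooled = sum(p_input_given_reason[r] * p_reason[r] for r in p_reason)
print(f"P(car-shaped input | concept active, reason unknown) = {pooled:.2f}")

# ...while the cases it is pooling over are wildly different.
for r in p_reason:
    print(f"P(car-shaped input | concept active, {r}) = {p_input_given_reason[r]:.2f}")
```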
I think it would help if you clarified what we are actually predicting by predicting when some concept will be active, and why we are doing that.
We don’t need to explain why/whether predicting “endogenous” activations is good, if we accept the hypothesis that that’s how the brain is wired: it runs prediction learning by default. This makes sense, because the affected cluster of neurons doesn’t know whether a given activation is exo or endo.

Prediction learning for endo activations is conceptually the learning of shortcuts: if the “screw” model’s activation predictably leads, through a chain of intermediate steps, to the “being worried” model, then a good predictor would learn to activate the latter model right away after seeing a screw.
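Concretely, here’s a rough sketch of what I mean (the chain of concepts and the stream of activations are entirely made up): learn one-step transitions between active concepts by counting, then notice that chaining those one-step predictions already tells you that “screw” lands on “worried” a few steps later, which is exactly the shortcut a longer-horizon predictor would learn to take directly.

```python
from collections import defaultdict

# Made-up stream of concept activations: "screw" reliably leads,
# through intermediate concepts, to "worried".
chain = ["screw", "where did I put them?", "I always lose them", "worried"]
stream = (chain + ["coffee", "desk"]) * 200  # plus some unrelated activity

# Learn one-step transition probabilities by counting.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(stream, stream[1:]):
    counts[a][b] += 1
P = {a: {b: n / sum(nxt.values()) for b, n in nxt.items()} for a, nxt in counts.items()}

def p_after(start, steps, target):
    """Probability of being at `target` exactly `steps` transitions after `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        new = defaultdict(float)
        for state, p in dist.items():
            for nxt, q in P.get(state, {}).items():
                new[nxt] += p * q
        dist = new
    return dist.get(target, 0.0)

# Chaining the one-step predictions: "screw" already implies "worried" three steps out,
# so a predictor trained on "what will be active soon" can fire "worried" immediately.
print(p_after("screw", 3, "worried"))  # ~1.0 in this toy stream
```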
That makes sense, thanks!