If they set $\eta_v$ to 1, they converge in a single backward pass¹, since they are then computing precisely backprop. Setting $\eta_v$ to less than that, and perhaps mixing up the pass order, merely obfuscates and delays this process, but it still converges, because any neuron without incorrect children has nowhere to go but towards correctness. And the entire convergence is for a single input! After which they manually do a gradient step on the weights as usual.
[Preliminary edit: I think this was partly wrong. Replicating...]
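For what it's worth, here is a minimal sketch of the check I'd run (not the paper's code): a standard predictive-coding chain where layer $i$ predicts $\hat v_i = W_i f(v_{i-1})$ and the energy is $F = \tfrac12\sum_i \lVert \epsilon_i \rVert^2$. All names, sizes, and the nonlinearity are placeholders. With feedforward initialization, the output clamped to the label, and one backward sweep at $\eta_v = 1$, the predictive-coding weight gradient $\partial F/\partial W_i = -\epsilon_{i+1} f(v_i)^\top$ should coincide with the backprop gradient of $\tfrac12\lVert \hat y - y\rVert^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.tanh
fp = lambda z: 1 - np.tanh(z) ** 2  # f'

# Chain v0 (input) -> v1 (hidden) -> v2 (output); predictions v̂_i = W_i f(v_{i-1}).
sizes = [4, 5, 3]
Ws = [rng.normal(size=(sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
x, y = rng.normal(size=sizes[0]), rng.normal(size=sizes[-1])

# Feedforward initialization: v_i = v̂_i, so every ε is zero except at the output.
v = [x]
for W in Ws:
    v.append(W @ f(v[-1]))
yhat = v[-1].copy()
v[-1] = y                                    # clamp the output to the label
eps = [np.zeros_like(u) for u in v]
eps[-1] = v[-1] - yhat                       # ε_out = y − ŷ

# One backward sweep with η_v = 1 over the hidden layers.
for i in range(len(v) - 2, 0, -1):
    v[i] = v[i] + (-eps[i] + fp(v[i]) * (Ws[i].T @ eps[i + 1]))
    eps[i] = v[i] - Ws[i - 1] @ f(v[i - 1])  # footnote 1: refresh ε_i after the v update

# Backprop on the same forward pass, by hand, for comparison.
a0, z1 = f(x), Ws[0] @ f(x)
delta2 = yhat - y                            # dL/dv2 for L = ½‖ŷ − y‖²
delta1 = fp(z1) * (Ws[1].T @ delta2)         # dL/dv1
print(np.allclose(np.outer(delta1, a0),     # backprop gradient dL/dW_0
                  -np.outer(eps[1], a0)))    # PC gradient −ε_1 f(v_0)ᵀ — expect True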
It’s neat that you can treat activations and parameters with the same update rule, but then you should actually do it. Every “tick”, replace the input and label and have every neuron update its parameters and data in lockstep, where every neuron can only look at its neighbors; a sketch follows after the footnote. Of course, this only has a chance of working if the inputs and labels come from a continuous stream, as they would if the input were the output of another network. The authors also note the possibility of continuous data. And then one could see how its performance degrades as one speeds up the poor brain’s environment :).
1: Which has to be in backward order, and $\epsilon_i \leftarrow v_i - \hat{v}_i$ has to be done once more after the $v$ update line.
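And a minimal sketch of that lockstep scheme, to make the proposal concrete. This is my reading, not anything from the paper: the same chain as above, but every tick clamps a fresh input/label pair from a stream and steps all activations and all weights simultaneously from the current neighbor values. The `stream` function and both step sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
f = np.tanh
fp = lambda z: 1 - np.tanh(z) ** 2
sizes = [4, 8, 2]
Ws = [0.1 * rng.normal(size=(sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
v = [np.zeros(n) for n in sizes]
eta_v, eta_w = 0.1, 1e-3

def stream(t):
    # Placeholder for a continuous sensory stream: a slowly drifting input
    # and a target that is a fixed function of it.
    x = np.sin(0.01 * t + np.arange(sizes[0]))
    return x, np.array([x.sum(), x[0] * x[1]])

for t in range(10_000):
    v[0], v[-1] = stream(t)  # every tick: clamp fresh input and label
    # Local prediction errors, computed from current neighbor values only.
    eps = [None] + [v[i] - Ws[i - 1] @ f(v[i - 1]) for i in range(1, len(v))]
    # Lockstep: every update reads the old state, then all commit together.
    new_v = [v[0]] + [v[i] + eta_v * (-eps[i] + fp(v[i]) * (Ws[i].T @ eps[i + 1]))
                      for i in range(1, len(v) - 1)] + [v[-1]]
    new_Ws = [W + eta_w * np.outer(eps[i + 1], f(v[i])) for i, W in enumerate(Ws)]
    v, Ws = new_v, new_Ws
```

One could then sweep the drift rate (the `0.01` above) to see how performance degrades as the environment speeds up relative to the network’s relaxation.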
Predictive processing is thus well-suited for biological neural networks (BNNs) because the real-time sensory data of a living organism, including sensory data preprocessed by another network, is a continuous stream.