I can imagine, on the one hand, there being a fairly direct link from a neuron being predicted to the prediction output.
Oh, and a question about what I say above:
Is something like this what you were getting at in other comments with respect to the model doing self-prediction by learning the identity matrix? That wasn’t quite clear to me.