[Question] Previous Work on Recreating Neural Network Input from Intermediate Layer Activations

Recently I’ve been experimenting with reconstructing a neural network’s input from its intermediate layer activations.
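To make the idea concrete, here is a minimal sketch of one possible way to frame such a reconstruction: treat the input as the unknown and run gradient descent until its activations match the recorded ones. It is purely illustrative and assumes a toy setting (a single tanh layer with known, randomly drawn weights, and more hidden units than input dimensions). The names `forward` and `reconstruct_input`, the sizes, and the hyperparameters are hypothetical and not taken from any particular experiment or library.

```python
import math
import random


def forward(W, b, x):
    """Hidden activations of one tanh layer: a = tanh(W x + b)."""
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


def reconstruct_input(W, b, target, steps=2000, lr=0.05):
    """Gradient descent on x to minimise 0.5 * ||forward(x) - target||^2.

    Assumption: the weights W and biases b are known, and only the
    input x has to be recovered from the recorded activations.
    """
    n_hidden, n_in = len(W), len(W[0])
    x = [0.0] * n_in  # start from an uninformative guess
    for _ in range(steps):
        a = forward(W, b, x)
        # Gradient w.r.t. the pre-activations (chain rule through tanh).
        delta = [(a_i - t_i) * (1.0 - a_i * a_i) for a_i, t_i in zip(a, target)]
        # Gradient w.r.t. the input: W^T delta.
        grad = [sum(W[i][j] * delta[i] for i in range(n_hidden)) for j in range(n_in)]
        x = [x_j - lr * g_j for x_j, g_j in zip(x, grad)]
    return x


# Toy setup (all values are illustrative assumptions).
random.seed(0)
n_in, n_hidden = 3, 16
W = [[random.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b = [random.gauss(0.0, 0.1) for _ in range(n_hidden)]

x_true = [0.5, -0.3, 0.8]          # the "unknown" input
h = forward(W, b, x_true)          # recorded intermediate activations
x_rec = reconstruct_input(W, b, h)  # input recovered from activations only

error = max(abs(p - q) for p, q in zip(x_rec, x_true))
print("max reconstruction error:", error)
```

In a deeper or narrower network the activations may not determine the input uniquely, so a sketch like this would recover only some input consistent with the activations rather than the original one.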

This possibility has implications for interpretability. For example, if certain neurons consistently activate on a certain type of input, that suggests those neurons are ‘about’ that type of input.

My question is: Does anyone know of prior work/research in this area?

I’d appreciate pointers to even distantly related work. I may write a blog post about my experiments if there is interest and if there isn’t already adequate research in this area.
