Growing Bonsai Networks with RNNs
Link post
This is a linkpost for a writeup on my personal website: https://cprimozic.net/blog/growing-sparse-computational-graphs-with-rnns/
Here's a summary:
This post gives an overview of my research and experiments on growing sparse computational graphs, which I'm calling "Bonsai Networks", by training small RNNs. It describes the architecture, training process, and pruning methods used to create the graphs, and then examines some of the learned solutions to a variety of objectives.
Its main theme is mechanistic interpretability, but it also goes into significant detail on the technical side of the implementation, including the training stack, a custom activation function, a bespoke sparsity-promoting regularizer, and more.
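To give a rough flavor of the sparsify-then-prune idea behind these networks, here is a minimal sketch. It is not the post's actual implementation: the regularizer here is plain L1 (standing in for the bespoke sparsity-promoting one), the weight matrix is random, and the pruning threshold is arbitrary.

```python
import numpy as np

def l1_penalty(weights: np.ndarray, lam: float = 1e-3) -> float:
    # Plain L1 stand-in for the post's bespoke sparsity-promoting
    # regularizer: pushes weights toward exactly zero during training.
    return lam * float(np.abs(weights).sum())

def prune_by_magnitude(weights: np.ndarray, threshold: float = 0.1):
    # Zero out near-zero weights; the surviving nonzero connections
    # define the sparse computational graph that remains.
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Hypothetical 8-neuron recurrent weight matrix for illustration
rng = np.random.default_rng(0)
W = rng.normal(scale=0.2, size=(8, 8))

W_sparse, mask = prune_by_magnitude(W, threshold=0.1)
density = mask.mean()  # fraction of connections kept
```

In the post's actual pipeline, the regularizer runs during training so that most weights are driven to zero before pruning; the sketch above only shows the final thresholding step.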
The site contains a variety of interactive visualizations and other embeds that are important to its content. That's why I chose to make this a linkpost rather than copying its content here directly.
I'd love to receive any feedback you might have on this work. This topic is something I'm very interested in, and I'm eager to hear people's thoughts on it.