I would really love to see this combined with a neuroscope so you can play around with the neurons easily and test your hypotheses about what they mean!
I also find it pretty fun to try to figure out what a neuron is activating for, and it seems plausible that this could be gamified and crowdsourced (a la FoldIt) to great effect, even without using GPT-4 to generate explanations (it would still be used to validate submitted answers). This probably wouldn’t scale to a GPT-3+ sized network, but it might still be helpful for, e.g., surfacing interesting neurons or training an AI to interpret neurons more effectively.