1a3orn comments on Request: stop advancing AI capabilities

1a3orn 27 May 2023 14:00 UTC
26 points
The probably-canonical example at the moment is Hyena Hierarchy, which cites a bunch of interpretability research, including Anthropic’s work on Induction Heads. If Hyena Hierarchy actually delivers what the paper promises, it might enable much longer context windows.

I don’t think you even need to cite that, though. If interpretability is going to be useful someday, I think it ultimately has to be aimed at helping steer and build more reliable DL systems. Like, that’s the whole point, right? Steer a reliable ASI.