Thanks for this post. I am a theoretical physicist and I agree that LLM interpretability is a physics problem. I think more and more physicists are becoming interested in this problem because it is obvious to us that physics methods, ideas, and tools will be useful here. The whole field of physics is just mechanistic interpretability of the universe. Physics is quite broad and there are no fixed rules. The only goal is to increase understanding in any possible way, and in fact one should always change and redefine the rules of the game. LLMs (as a toy model of intelligence) are just another complex system, and one that is very easy to experiment with.
I agree! LLMs are in many ways self-organized complex systems. And they are very easy to experiment on. Which questions about LLM interpretability do you think theoretical physics can help investigate?