Yuxiao

Karma: 12

I’m an AI safety researcher — mostly working on ways to see inside the systems we’ve built and understand what moves them. My background runs through statistical inference, machine learning, and generative models; lately I’ve been in the borderlands between mechanistic interpretability and probabilistic thinking, trying to make large language models a little less opaque.

I’ve moved between academia, industry, and independent research, but the constant thread is the same: bridging abstract mathematics with the hidden structures of deep networks. I view this blog as a place to keep a scientific diary, maintain emotional balance, and make friends :)

From Organized Shelves to Layered Catalogs: Architectural Explorations for Sparse Autoencoders: Crosscoders & Ladder SAEs Towards Hierarchical Data Structure

Yuxiao · 10 Aug 2025 10:12 UTC
2 points
0 comments · 11 min read

From Messy Shelves to Master Librarians: Toy-Model Exploration of Block-Diagonal Geometry in LM Activations

Yuxiao · 19 Jul 2025 12:26 UTC
5 points
1 comment · 4 min read

From Unruly Stacks to Organized Shelves: Toy Model Validation of Structured Priors in Sparse Autoencoders

Yuxiao · 6 Jul 2025 7:03 UTC
8 points
0 comments · 5 min read