I run the Model Transparency workstream at the UK AI Security Institute. We are broadly concerned with oversight, including monitorability, auditing, and the prevention and detection of scheming. Previously, I cofounded Decode Research, an AI safety research infrastructure organisation, and conducted independent research into mechanistic interpretability of decision transformers. I studied computational biology and statistics at the University of Melbourne in Australia.
Joseph Bloom
Karma: 1,967