I’m a former software engineer at Google, now doing independent AI alignment research. My work is currently supported by the Future Fund regranting program.
I’m always happy to connect with other researchers or people interested in AI alignment and effective altruism. Feel free to send me a private message!
Posts and comments I’ve written
Work by other people whose early drafts I reviewed:
Circumventing interpretability: How to defeat mind-readers by Lee Sharkey
Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks by Tony Barrett, Dan Hendrycks, Jessica Newman and Brandie Nonnecke
DeepMind’s generalist AI, Gato: A non-technical explainer by Frances Lorenz, Nora Belrose and Jon Menaster
Potential Alignment mental tool: Keeping track of the types by Donald Hobson
Submissions for the AI Safety Arguments Competition
Ideal Governance by Holden Karnofsky