Joar Skalse

My name is pronounced “YOO-ar SKULL-se” (the “e” is not silent). I’m a PhD student at Oxford University, and I was a member of the Future of Humanity Institute before it shut down. I have worked in several areas of AI safety research. For a few highlights, see:

  1. Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

  2. Misspecification in Inverse Reinforcement Learning

  3. STARC: A General Framework For Quantifying Differences Between Reward Functions

  4. Risks from Learned Optimization in Advanced Machine Learning Systems

  5. Is SGD a Bayesian sampler? Well, almost

Some of my recent research on the theoretical foundations of reward learning is also described in this sequence.

For a full list of my research, see my Google Scholar profile.