Joar Skalse

Karma: 647

My name is pronounced “YOO-ar SKULL-se”. I’m a PhD student at Oxford University, and I have worked in several different areas of AI safety research. For a few highlights, see:

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Misspecification in Inverse Reinforcement Learning

STARC: A General Framework For Quantifying Differences Between Reward Functions

Risks from Learned Optimization in Advanced Machine Learning Systems

Is SGD a Bayesian sampler? Well, almost

For a full list of all my research, see my Google Scholar.

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse, 17 May 2024 19:13 UTC
67 points, 10 comments, 2 min read, LW link