alexr

Karma: 60

I am an effective altruist and AI safety researcher.
I hold a PhD in Physics from the University of Florida, and I am currently a visiting professor of Machine Learning in the Data Science Master’s Program at New College of Florida.

I became interested in AI safety in 2015 when I read ‘Superintelligence’. A few years later I picked up ‘Rationality: From AI to Zombies’ and got interested in “debiasing”. Reading it changed many of my views, and one result was that I became an EA.

Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs

Elliott Thornley (EJT), carissacullen, christosi, alexr, LAThomson and Harry Garland

3 Jun 2026 14:24 UTC

20 points

3 comments, 19 min read, LW link

(arxiv.org)

Towards shutdownable agents via stochastic choice

Elliott Thornley (EJT), alexr, christosi and LAThomson

8 Jul 2024 10:14 UTC

59 points

11 comments, 23 min read, LW link

(arxiv.org)
