
Eyon Jang

Karma: 88

AI safety researcher; MATS 8.0 scholar

Exploration Hacking: Can LLMs Learn to Resist RL Training?

1 May 2026 20:54 UTC
17 points
0 comments · 8 min read · LW link

Helping Friends, Harming Foes: Testing Tribalism in Language Models

11 Mar 2026 12:06 UTC
10 points
0 comments · 9 min read · LW link

A Conceptual Framework for Exploration Hacking

12 Feb 2026 16:33 UTC
26 points
2 comments · 9 min read · LW link

Exploration hacking: can reasoning models subvert RL?

30 Jul 2025 22:02 UTC
25 points
4 comments · 9 min read · LW link

Automating AI Safety: What we can do today

25 Jul 2025 14:49 UTC
38 points
0 comments · 8 min read · LW link