yix

Karma: 254

https://yixiong.dev/

Teaching Models to Dream of Better Monitors through Evaluation Conditioned Training

Alec Harris, Kasey C, Archie Chaudhury and yix

19 Mar 2026 21:01 UTC

49 points

2 comments · 10 min read · LW link

We need a better way to evaluate emergent misalignment

yix and Broyojo

11 Jan 2026 16:21 UTC

86 points

9 comments · 6 min read · LW link

yix’s Shortform

yix

6 Dec 2025 2:27 UTC

2 points

2 comments · 1 min read · LW link

TastyBench: Toward Measuring Research Taste in LLM

Parv Mahajan, Yilin and yix

2 Dec 2025 23:26 UTC

33 points

2 comments · 6 min read · LW link

Lessons from a year of university AI safety field building

yix, afterless, Parv Mahajan, Andersehen, Tuna and neverix

6 Jun 2025 14:35 UTC

35 points

3 comments · 7 min read · LW link

College technical AI safety hackathon retrospective—Georgia Tech

yix

15 Nov 2024 0:22 UTC

44 points

2 comments · 5 min read · LW link

(open.substack.com)