A Ray

Karma: 860

Alex Gray, née Alex Ray; much of my work is under that name. I’m interested in language model alignment, and especially in techniques for getting models to reason out loud.

Steganography in Chain of Thought Reasoning

A Ray · 8 Aug 2022 3:47 UTC
47 points
13 comments · 6 min read

Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

A Ray · 6 Aug 2022 5:46 UTC
31 points
14 comments · 2 min read

My advice on finding your own path

A Ray · 6 Aug 2022 4:57 UTC
34 points
3 comments · 3 min read