A Ray

Karma: 840

I'm Alex Gray (née Alex Ray); much of my work is published under the name Alex Ray. I'm interested in language model alignment, especially techniques for getting models to reason out loud.

Steganography in Chain of Thought Reasoning

A Ray, 8 Aug 2022 3:47 UTC
39 points
13 comments, 6 min read

Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

A Ray, 6 Aug 2022 5:46 UTC
31 points
14 comments, 2 min read

My advice on finding your own path

A Ray, 6 Aug 2022 4:57 UTC
34 points
3 comments, 3 min read