Alignment Stream of Thought

This sequence is for posts that have relatively less polish and a smaller scope, but that describe ideas or thoughts I think may be interesting (or wrong in interesting ways), or that I want to iterate on quickly.

I’m trying this out because I’ve noticed that I normally spend far too much time editing posts (a significant chunk of which is staring at a screen, frowning at framings), and in the process I come across new ideas that I have to compress down to avoid scope creep in unrelated posts. I expect this format to allow smaller posts that focus on isolated ideas before I’ve fully figured them out.

Expect low-to-middling confidence in any conclusions drawn, and occasionally just chains of reasoning without properly contextualized conclusions.

Inspired by Leo Gao.

[ASoT] Finetuning, RL, and GPT’s world prior

[ASoT] Simulators show us behavioural properties by default