“Why Not Just...”

A compendium of rants about alignment proposals, of varying charitability.

Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc

Godzilla Strategies

Rant on Problem Factorization for Alignment

Interpretability/Tool-ness/Alignment/Corrigibility are not Composable

How To Go From Interpretability To Alignment: Just Retarget The Search

Oversight Misses 100% of Thoughts The AI Does Not Think

Human Mimicry Mainly Works When We’re Already Close

Worlds Where Iterative Design Fails

Why Not Just… Build Weak AI Tools For AI Alignment Research?

Why Not Just Outsource Alignment Research To An AI?

OpenAI Launches Superalignment Taskforce