I just want to go on record and say this is a remarkable work of scholarship. I hold it to be one of the best efforts at actually working through the difficulties and problems of alignment as we understand them now in early 2026. It feels like a step toward squaring the stark pessimism of the List of Lethalities with the properties of current AI that make some people guardedly optimistic (and others of us just somewhat less pessimistic).
I have comments on most of the many, many topics you touch on here, but not the time to write in depth.
Thanks! Glad you found it useful — I also found writing it pretty useful: it sparked a couple of research ideas, and it strongly encouraged me to put more effort into mentoring and helping out with field-building.