Alternative frame: I’ve been poking at the idea of quantum resource theories periodically, literally on the strength of a certain word-similarity between quantum stuff and alignment stuff.

The root inspiration for this comes from Scott Aaronson’s Quantum Computing Since Democritus, specifically two things: one, the “certain generalization of probability” lens pretty directly liberates me to throw QM ideas at just about anything, the same way I might with regular probability; two, the introduction of negative amplitudes (loosely, negative probabilities) and, through them, possibilities “cancelling out” is super cool and feels like a useful way to think about certain problems.

So, babbling: can we loot resource theories from quantum thermodynamics as a way to reason more precisely about the constraints we want for alignment?

A Quanta article animating the thought: https://www.quantamagazine.org/physicists-trace-the-rise-in-entropy-to-quantum-information-20220526/

Direct quote -

“A resource theory is a simple model for any situation in which the actions you can perform and the systems you can access are restricted for some reason,” said the physicist Nicole Yunger Halpern of the National Institute of Standards and Technology.

This sounds like a good match for alignment-ish problems on the face of it. In the alignment case the *some reason* for the restrictions is *so it doesn’t kill us.* There are two elements to a resource theory: first, a set of free operations and states that we assume can be reached at no cost; second, valuable resources like entanglement, purity, and asymmetry, which are features of states that can be reached only at a cost (and are therefore limited). The gist is: what if we swapped out words like *entanglement* and *purity* for words like *corrigibility* and *interpretability*?
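To make the two elements concrete, here is a minimal sketch of a toy resource theory of purity in its classical special case. The assumptions are mine, not the article's: states are plain probability distributions, the one free state is the uniform distribution, free operations are random mixtures of permutations, and the resource is measured by the sum of squared probabilities. The names `purity` and `mix_permutations` are made up for illustration. It says nothing about what the analogous measure for *corrigibility* would be; it only shows the shape of the thing.

```python
from itertools import permutations


def purity(p):
    """Resource measure: 1/d for the uniform (free) state, 1 for a sharp state."""
    return sum(x * x for x in p)


def mix_permutations(p, weights):
    """Free operation (assumed for this toy): apply each permutation of the
    outcomes with the given probability. `weights` has one entry per
    permutation of range(len(p)), in itertools.permutations order."""
    perms = list(permutations(range(len(p))))
    if len(weights) != len(perms) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per permutation, summing to 1")
    out = [0.0] * len(p)
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            out[j] += w * p[i]  # probability mass at outcome i moves to j
    return out


# A resourceful state, and the free (uniform) state on three outcomes.
sharp = [0.7, 0.2, 0.1]
uniform = [1 / 3, 1 / 3, 1 / 3]

# An arbitrary free operation: mostly "do nothing", sometimes shuffle.
weights = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]

after = mix_permutations(sharp, weights)
free_after = mix_permutations(uniform, weights)

# Free operations never increase the resource, and keep the free state free.
print(purity(sharp), purity(after))
print(purity(uniform), purity(free_after))
```

The point of the sketch is the pair of constraints at the bottom: whatever you do with free operations, the resource measure cannot go up, and the free state stays free. An alignment version would need its own candidates for both the free operations and the measure.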

I am beginning to think that histories of mathematical struggle and failure are my favorite kind. One that is similarly a tale of challenging and repeated failures on an unintuitive subject is thermodynamics, and an amazing book on this subject is The Tragicomical History of Thermodynamics, 1822–1854 by Clifford Truesdell, himself a mathematical physicist most famous for continuum mechanics.