This section of last year’s shallow review of TAIS is 3 months out of date and maybe too coarse-grained but is a decent starting point?
Agent foundations
Develop philosophical clarity and mathematical formalizations of building blocks that might be useful for plans to align strong superintelligence, such as agency, optimization strength, decision theory, abstractions, and concepts.
Theory of change: Rigorously understand optimization processes and agents, and what it means for them to be aligned in a substrate-independent way → identify impossibility results and necessary conditions for aligned optimizer systems → use this theoretical understanding to eventually design architectures that remain stable and safe under self-reflection