AI safety, risks from global systems, philosophy, engineering physics, complex adaptive systems, values for the future, future of intelligence.
Scale, cascades and cumulative impacts from interconnected systems. AI systems’ impact on human cognitive and moral autonomy. Evaluations, measurement, robustness, standards and monitoring of AI systems.
Interest in alignment approaches at different abstraction levels, e.g., macro-level interpretability, scaffolding/module-level AI system safety, and system-theoretic process analysis (STPA) for safety.
Hey, great stuff—thank you for sharing! I found this especially useful as somebody who has been “out” of alignment for six months and is looking to set up a new research agenda.