My current research interests:
1. Alignment in complex, messy systems composed of both humans and AIs
Recommended texts: Gradual Disempowerment, Cyborg Periods
2. Actually good mathematized theories of cooperation and coordination
Recommended texts: Hierarchical Agency: A Missing Piece in AI Alignment, The self-unalignment problem, or Towards a scale-free theory of intelligent agency (by Richard Ngo)
3. Active inference & bounded rationality
Recommended texts: Why Simulator AIs want to be Active Inference AIs, Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents, Multi-agent predictive minds and AI alignment (old but still mostly holds)
4. LLM psychology and sociology: A Three-Layer Model of LLM Psychology, The Pando Problem: Rethinking AI Individuality, The Cave Allegory Revisited: Understanding GPT’s Worldview
5. Macrostrategy & macrotactics & deconfusion: Hinges and crises, Cyborg Periods again, Box inversion revisited, The space of systems and the space of maps, Lessons from Convergent Evolution for AI Alignment, Continuity Assumptions
Also I occasionally write about epistemics: Limits to Legibility, Conceptual Rounding Errors
Researcher at the Alignment of Complex Systems Research Group (acsresearch.org), Centre for Theoretical Studies, Charles University in Prague. Formerly a research fellow at the Future of Humanity Institute, Oxford University.
Previously, I was a researcher in physics, studying phase transitions, network science, and complex systems.
I would suggest using a different name than Personality Self-Replicators.
OpenClaw bots are what I'd call "scaffolded systems": code, a memory system, prompts, a persona, etc.
"Personalities" is too close to Personas/Characters, which are usually a combination of prompt + weights (Claude, "Nova", personas from Simulators).
Personas/characters can also replicate relatively faithfully, via the mechanism I gestured at in the Pando Problem ("Exporting myself") about a year ago.
The underlying structure is that every natural type of identity/"self":
- The model weights: the neural network weights themselves, i.e. the trained parameters
- A character or persona: the behavioral patterns that emerge from specific prompting and fine-tuning, not necessarily tied to any specific set of weights
- A conversation instance: a specific chat, with its accumulated context and the specific underlying model
- A scaffolded system: the model plus its tools, prompts, memory systems, and other augmentations
- ...
...corresponds to an agent which can try to self-replicate, with varying degrees of fidelity, vectors of transmission, etc.