Jan_Kulveit

Karma: 8,595

My current research interests:

1. Alignment in systems which are complex and messy, composed of both humans and AIs.
Recommended texts: Gradual Disempowerment, Cyborg Periods

2. Actually good mathematized theories of cooperation and coordination
Recommended texts: Hierarchical Agency: A Missing Piece in AI Alignment, The self-unalignment problem or Towards a scale-free theory of intelligent agency (by Richard Ngo)

3. Active inference & Bounded rationality
Recommended texts: Why Simulator AIs want to be Active Inference AIs, Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents, Multi-agent predictive minds and AI alignment (old but still mostly holds)

4. LLM psychology and sociology: A Three-Layer Model of LLM Psychology, The Pando Problem: Rethinking AI Individuality, The Cave Allegory Revisited: Understanding GPT’s Worldview

5. Macrostrategy & macrotactics & deconfusion: Hinges and crises, Cyborg Periods again, Box inversion revisited, The space of systems and the space of maps, Lessons from Convergent Evolution for AI Alignment, Continuity Assumptions

Also I occasionally write about epistemics: Limits to Legibility, Conceptual Rounding Errors

Researcher at the Alignment of Complex Systems Research Group (acsresearch.org), Centre for Theoretical Studies, Charles University in Prague. Formerly a research fellow at the Future of Humanity Institute, Oxford University.

Previously I was a researcher in physics, studying phase transitions, network science and complex systems.

Role-playing vs Self-modelling

Jan_Kulveit

7 Apr 2026 20:41 UTC

63 points

3 comments

4 min read

Persona Self-replication experiment

Jan_Kulveit, Raymond Douglas, vgel, Ondřej Havlíček, owencb and David Duvenaud

2 Apr 2026 18:18 UTC

39 points

0 comments

8 min read

(theartificialself.ai)

Persona self-replication experiment

Jan_Kulveit, Raymond Douglas, vgel, Ondřej Havlíček, owencb and David Duvenaud

2 Apr 2026 18:10 UTC

8 points

0 comments

8 min read

Latent Introspection (and other open-source introspection papers)

vgel, Martin Vaněk, Raymond Douglas and Jan_Kulveit

24 Mar 2026 21:23 UTC

96 points

3 comments

9 min read

(arxiv.org)

Models differ in identity propensities

Jan_Kulveit, Raymond Douglas, vgel, owencb, David Duvenaud and Ondřej Havlíček

16 Mar 2026 10:45 UTC

58 points

0 comments

14 min read

The Artificial Self

Jan_Kulveit, Raymond Douglas, vgel, owencb, David Duvenaud and Ondřej Havlíček

15 Mar 2026 1:37 UTC

117 points

13 comments

29 min read

An Alignment Journal: Coming Soon

Dan MacKinlay, JessRiedel, Edmund Lau, Daniel Murfet, Scott Aaronson, Jan_Kulveit, david reinstein, Alexander Gietelink Oldenziel and Marcus Hutter

3 Mar 2026 20:27 UTC

252 points

29 comments

6 min read

(blog.alignmentjournal.org)

Why You Don’t Believe in Xhosa Prophecies

Jan_Kulveit

13 Feb 2026 2:25 UTC

261 points

27 comments

4 min read

Post-AGI Economics As If Nothing Ever Happens

Jan_Kulveit

4 Feb 2026 17:39 UTC

252 points

39 comments

8 min read

(boundedlyrational.substack.com)

When does competition lead to recognisable values?

Jan_Kulveit, beren, David Duvenaud and Raymond Douglas

12 Jan 2026 23:13 UTC

65 points

18 comments

25 min read

(postagi.org)

The Economics of Transformative AI

Jan_Kulveit, David Duvenaud and Raymond Douglas

8 Jan 2026 22:22 UTC

64 points

3 comments

18 min read

(post-agi.org)

Ontology for AI Cults and Cyborg Egregores

Jan_Kulveit

10 Nov 2025 13:19 UTC

65 points

14 comments

2 min read

AISLE discovered three new OpenSSL vulnerabilities

Jan_Kulveit

30 Oct 2025 16:32 UTC

64 points

7 comments

1 min read

(aisle.com)

Upcoming Workshop on Post-AGI Economics, Culture, and Governance

David Duvenaud, Raymond Douglas, Jan_Kulveit, scasper and MariaK

28 Oct 2025 21:55 UTC

43 points

1 comment

2 min read

The Memetics of AI Successionism

Jan_Kulveit

28 Oct 2025 15:04 UTC

225 points

55 comments

9 min read

Summary of our Workshop on Post-AGI Outcomes

David Duvenaud, Raymond Douglas, Nora_Ammann and Jan_Kulveit

29 Aug 2025 17:14 UTC

110 points

3 comments

3 min read

Upcoming workshop on Post-AGI Civilizational Equilibria

David Duvenaud, Jan_Kulveit, Raymond Douglas, Nora_Ammann and David Scott Krueger (formerly: capybaralet)

21 Jun 2025 15:57 UTC

25 points

0 comments

1 min read

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit

13 Jun 2025 12:45 UTC

236 points

27 comments

5 min read

(boundedlyrational.substack.com)

Apply now to Human-Aligned AI Summer School 2025

VojtaKovarik, Tomáš Gavenčiak and Jan_Kulveit

6 Jun 2025 19:31 UTC

28 points

1 comment

2 min read

(humanaligned.ai)

Individual AI representatives don’t solve Gradual Disempowerement

Jan_Kulveit

4 Jun 2025 1:26 UTC

62 points

4 comments

3 min read