
Jan_Kulveit

Karma: 7,058

My current research interests:

1. Alignment in systems that are complex and messy, composed of both humans and AIs.
Recommended texts: Gradual Disempowerment, Cyborg Periods

2. Actually good mathematized theories of cooperation and coordination
Recommended texts: Hierarchical Agency: A Missing Piece in AI Alignment, The self-unalignment problem, or Towards a scale-free theory of intelligent agency (by Richard Ngo)

3. Active inference & Bounded rationality
Recommended texts: Why Simulator AIs want to be Active Inference AIs, Free-Energy Equilibria: Toward a Theory of Interactions Between Boundedly-Rational Agents, Multi-agent predictive minds and AI alignment (old but still mostly holds)

4. LLM psychology and sociology: A Three-Layer Model of LLM Psychology, The Pando Problem: Rethinking AI Individuality, The Cave Allegory Revisited: Understanding GPT’s Worldview

5. Macrostrategy & macrotactics & deconfusion: Hinges and crises, Cyborg Periods again, Box inversion revisited, The space of systems and the space of maps, Lessons from Convergent Evolution for AI Alignment, Continuity Assumptions

I also occasionally write about epistemics: Limits to Legibility, Conceptual Rounding Errors.

Researcher at the Alignment of Complex Systems Research Group (acsresearch.org), Centre for Theoretical Studies, Charles University in Prague. Formerly a research fellow at the Future of Humanity Institute, Oxford University.

Previously, I was a researcher in physics, studying phase transitions, network science, and complex systems.

Ontology for AI Cults and Cyborg Egregores

Jan_Kulveit · 10 Nov 2025 13:19 UTC
59 points
14 comments · 2 min read · LW link

AISLE discovered three new OpenSSL vulnerabilities

Jan_Kulveit · 30 Oct 2025 16:32 UTC
63 points
7 comments · 1 min read · LW link
(aisle.com)

Upcoming Workshop on Post-AGI Economics, Culture, and Governance

28 Oct 2025 21:55 UTC
37 points
1 comment · 2 min read · LW link

The Memetics of AI Successionism

Jan_Kulveit · 28 Oct 2025 15:04 UTC
212 points
54 comments · 9 min read · LW link

Summary of our Workshop on Post-AGI Outcomes

29 Aug 2025 17:14 UTC
107 points
3 comments · 3 min read · LW link

Upcoming workshop on Post-AGI Civilizational Equilibria

21 Jun 2025 15:57 UTC
25 points
0 comments · 1 min read · LW link

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit · 13 Jun 2025 12:45 UTC
230 points
27 comments · 5 min read · LW link
(boundedlyrational.substack.com)

Apply now to Human-Aligned AI Summer School 2025

6 Jun 2025 19:31 UTC
28 points
1 comment · 2 min read · LW link
(humanaligned.ai)

Individual AI representatives don’t solve Gradual Disempowerment

Jan_Kulveit · 4 Jun 2025 1:26 UTC
60 points
3 comments · 3 min read · LW link

The Pando Problem: Rethinking AI Individuality

Jan_Kulveit · 28 Mar 2025 21:03 UTC
133 points
14 comments · 13 min read · LW link

Conceptual Rounding Errors

Jan_Kulveit · 26 Mar 2025 19:00 UTC
153 points
15 comments · 3 min read · LW link
(boundedlyrational.substack.com)

Announcing EXP: Experimental Summer Workshop on Collective Cognition

15 Mar 2025 20:14 UTC
36 points
2 comments · 4 min read · LW link

AI Control May Increase Existential Risk

Jan_Kulveit · 11 Mar 2025 14:30 UTC
101 points
13 comments · 1 min read · LW link

Gradual Disempowerment, Shell Games and Flinches

Jan_Kulveit · 2 Feb 2025 14:47 UTC
145 points
36 comments · 6 min read · LW link

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

30 Jan 2025 17:03 UTC
167 points
65 comments · 2 min read · LW link
(gradual-disempowerment.ai)

AI Assistants Should Have a Direct Line to Their Developers

Jan_Kulveit · 28 Dec 2024 17:01 UTC
59 points
6 comments · 2 min read · LW link

A Three-Layer Model of LLM Psychology

Jan_Kulveit · 26 Dec 2024 16:49 UTC
248 points
17 comments · 8 min read · LW link · 2 reviews

“Alignment Faking” frame is somewhat fake

Jan_Kulveit · 20 Dec 2024 9:51 UTC
156 points
13 comments · 6 min read · LW link

“Charity” as a conflationary alliance term

Jan_Kulveit · 12 Dec 2024 21:49 UTC
35 points
2 comments · 5 min read · LW link

Hierarchical Agency: A Missing Piece in AI Alignment

Jan_Kulveit · 27 Nov 2024 5:49 UTC
115 points
22 comments · 11 min read · LW link