Experiments in instrumental convergence

This sequence investigates instrumental convergence and power-seeking through a series of experiments in multi-agent RL.

The key question we explore: If humans build AIs that learn faster than we do, will those AIs compete with us by default?

Instrumental convergence in single-agent systems

Misalignment-by-default in multi-agent systems

Instrumental convergence: scale and physical interactions

POWERplay: An open-source toolchain to study AI power-seeking