I think the main challenge to monitoring agents is one of volume/scale, re:
Agents are much cheaper and faster to run than humans, so the amount of information and interactions that humans need to oversee will drastically increase.
But 1, 2, and 4 are issues humans already face when managing other humans.
A decent chunk of people are “RL’d” into doing the bare minimum and playing organizational politics to optimize their effort:reward ratio.
Many employees are treated as fungible and are given limited context into their entire org. Even in a fully transparent org, if it’s large enough, then it’s impossible for everything to be in an individual’s context anyway.
People fail all the time in quite surprising and subtle ways! We’re just really bad at noticing and documenting most of them, since we usually only explore and share really catastrophic failures. If someone loses ~$100, it’s whatever. When someone loses ~$100M, it’s a lawsuit.
This is a long-winded way of saying: I’m optimistic we can address managing agents (with their current capabilities) by drawing analogies to how we already do management in effective organizations, and by finding avenues to scale these management strategies 100x (or however many OOMs one might think we’ll need).