I feel nervous about the terminology here. It seems to presuppose specific assumptions about how this should or will play out, assumptions I don't think are likely.
“automating alignment research” → I know this phrase has been used before, but it sounds very high-level to me. It's like saying that all software used in financial trading workflows is “automating financial trading.” I think it's more accurate to say that software is augmenting financial trading, or similar. There's no one homogeneous thing called “financial trading”; the term typically emphasizes the parts that aren't yet automated. The specific ways software gets integrated sometimes involve replacing entire roles, sometimes involve helping people, and often do both in complex ways.
“Algorithmic trading was not just about creating faster digital traders but about reimagining traders as fleets of bots, quants, engineers, and other specialists.” In software, the word “fleet” sometimes refers to specific deployment strategies. And a whole lot of automation doesn't look like “bots”; rather, it's a lot of regular tools, plug-ins, helpers, etc.
”vast digital fleets of specialized AI agents working in concert” This is one architecture we could choose, but I'm not sure how critical or significant it will actually be. I very much agree that AI will be a big deal, but this phrasing makes it sound like you're assuming one specific way for AI to be used.
All that said, I'm very much in favor of us taking a lot of advantage of AI systems for all the things we want in the world, including AI safety. I imagine that for AI safety, we'll probably use a very eccentric and complex mix of AI technologies. Some will directly replace some existing researchers, we'll have specific scripts for research experiments, maybe agent-like systems that do ongoing oversight, etc.
It's possible that, from the author's perspective, the specific semantic meanings I took from terms like “automated alignment research” and “fleets” weren't implied. But if I made that mistake, I'm sure other readers will as well, so I'd like to encourage changes here before these phrases take off much further (if others agree with my take).
I’m happy this area is getting more attention.