When this question was posted, I asked myself: what would be a “cynical” answer? By that I mean asking, given what I see and know, what would be a realistically awful state of affairs? So, not catastrophizing, but also keeping expectations low.
What my intuition came up with was: less than 10% working on user-centered alignment, and less than 1% on user-independent alignment. But I didn’t have the data to check those estimates against (and I also knew there would be definitional issues).
So let me try to understand your guesses. In my terminology, you seem to be saying:
1000 (600+400) doing AI safety work
600 doing work that relates to alignment
80 doing work on scalable user-centered alignment
80 (40+40) doing work on user-independent alignment
Sure, that’s one interpretation. If people are working on dual-use technology that is mostly being used for profit but might sometimes contribute to alignment, I tend not to count them as “doing AI safety work,” but that’s really a matter of semantics.