It seems to me like one (often obscured) reason for the disagreement between Thomas and Habryka is that they are thinking about different groups of people when they define “the field.”
To assess the % of “the field” that’s doing meaningful work, we’d want to do something like [# of people doing meaningful work]/[total # of people in the field].
Who “counts” in the denominator? Should we count anyone who has received a grant from the LTFF with the word “AI safety” in it? Only the ones who have contributed object-level work? Only the ones who have contributed object-level work that passes some bar? Should we count the Anthropic capabilities folks? Just the EAs who are working there?
My guess is that Thomas was using a more narrowly defined denominator (e.g., not counting most people who got LTFF grants and went off to do PhDs without contributing object-level alignment stuff; not counting most Anthropic capabilities researchers who have never, or only minimally, engaged with the AIS community), whereas Habryka was using a more broadly defined denominator.
I’m not certain about this, and even if it’s true, I don’t think it explains the entire effect size. But I wouldn’t be surprised if roughly 10-30% of the difference between Thomas and Habryka comes from unstated assumptions about who “counts” in the denominator.
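To make the denominator point concrete, here’s a minimal sketch with entirely made-up numbers (the counts and category labels below are my own illustrative assumptions, not anyone’s actual estimates). Holding the numerator fixed, the narrow vs. broad denominator alone swings the headline percentage by a factor of ~3:

```python
# Illustrative only: hypothetical counts showing how the choice of
# denominator alone can change the headline "% doing meaningful work".
meaningful_workers = 100  # assumed numerator both parties might agree on

# Two hypothetical ways of defining "the field" (the denominators):
denominators = {
    "narrow (active, object-level AIS contributors)": 300,
    "broad (anyone with an 'AI safety' LTFF grant, capabilities staff, etc.)": 1000,
}

for label, field_size in denominators.items():
    share = meaningful_workers / field_size
    print(f"{label}: {share:.0%} of the field doing meaningful work")

# narrow: 33% of the field doing meaningful work
# broad:  10% of the field doing meaningful work
```

So two people with identical views about who is doing good work can still report very different percentages.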
(My guess is that this also explains “vibe-level” differences to some extent. I think some people who look out into the community and think “yeah, I think people here are pretty reasonable and actually trying to solve the problem and I’m impressed by some of their work” are often defining “the community” more narrowly than people who look out into the community and think “ugh, the community has so much low-quality work and has a bunch of people who are here to gain influence rather than actually try to solve the problem.”)
To someone totally uninvolved with the Berkeley scene (like me), this sounds like a solid explanation for the difference.
Though I’m surprised there’s no broad consensus on even basic things like this in 2023.
In game terms, if everyone keeps their own score separately, then it’s no wonder that a huge portion of effort will, in aggregate, go towards min-maxing the score-tracking meta-game.