how do people come to agree that “X is a good researcher”[?] [...] Pointing in the same direction involves a lot of agreeing on “this is progress” vs “this is not progress”. (There is also an object-level of “and whatever you agree is progress points in the direction of reality itself”)
To what extent do alignment researchers agree on who is a good researcher, or on what counts as progress? I’d guess there’s a bunch of disagreement there, even amongst researchers who agree the problem is hard, e.g. Eliezer vs Paul vs Steven. And I can think of relatively few cases of progress on alignment in my own view, let alone anyone else’s (TurnTrout’s work on power/instrumental convergence and Stuart’s work on value indifference, in case you’re wondering). Likewise for what the hard parts of the problem are. I’m not confident there’ll be that much disagreement. My reasoning is that a lot of disagreements look strong but aren’t really. Say, whether some probability is 0.1 or 0.9 isn’t that big a difference.
EXPERIMENT: To test whether consensus on progress points in the direction of reality, check which N-year-old results are most commonly considered progress now, and see how much researchers thought those results made progress M years ago. Of course you’d have to use proxy measures in most cases, e.g. karma and citations.
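One way the comparison in this experiment could be scored is with a rank correlation between the early proxy and the current one. The sketch below is only an illustration under assumptions: the result names and scores are made-up placeholders, `early_score` stands in for whatever proxy was available M years ago (e.g. karma), `current_score` for today's proxy (e.g. citations), and Spearman's rank correlation is my choice of agreement measure, not something the experiment specifies.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all ranks are tied")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical placeholder data, NOT real measurements.
# early_score: proxy for how much each result was rated M years ago.
# current_score: proxy for how much it is considered progress now.
early_score = {"result_a": 40, "result_b": 12, "result_c": 75, "result_d": 30}
current_score = {"result_a": 55, "result_b": 20, "result_c": 90, "result_d": 60}

names = sorted(early_score)
rho = spearman(
    [early_score[n] for n in names],
    [current_score[n] for n in names],
)
print(f"rank agreement between early and current assessments: {rho:.2f}")
```

A value near 1 would suggest early consensus tracked what later came to be seen as progress, and a value near 0 would suggest it did not. Since karma and citations are both noisy proxies, any such number is weak evidence at best.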