They don’t need to use this kind of subterfuge; they can just directly hire people to do that. Hiring experts to design benchmark questions is standard practice; this would be no different.
Yeah, my comment was mostly being silly. The grain of validity, I think, is that you probably get a much wider, weirder set of testing by inviting in a larger and more diverse set of people. And for something like ‘finding examples of strange failure cases that you yourself wouldn’t have thought of’, I think diversity of testers matters quite a bit.
The current FrontierMath fracas is a case in point. Did OpenAI have to keep its sponsorship and privileged access secret? No. Surely there was some amount of money that would have paid mathematicians to make hard problems openly, and that amount was probably not much different from what they did pay Epoch AI. Did the secrecy make life easier? Given the number of mathematician-participants saying they would’ve had second thoughts about participating had they known OA was involved, almost surely.