You probably don’t mean dangerous capabilities evals, right? I mean, I do feel hesitant even about those. I would really not want someone using my work on WMDP to increase their model’s ability to make bioweapons.
In Connor Leahy’s recent interview on Trajectory, he argues that scientists making evals are being “used” as tools by the AI corporations, similar to how cancer researchers were used by cigarette companies to sow confusion about the conclusion that cigarettes cause cancer.
With bioweapons evals, at least, the profit motive of AI companies is aligned with the common interest here; a big benefit of your work comes when companies use it to improve their product. I’m not at all confused about why people would think this is useful safety work, even if I haven’t personally hashed out the cost/benefit to any degree of confidence.
I’m mostly confused about ML / SWE / research benchmarks.
I’m not sure but I have a guess.
A lot of “normies” I talk to in the tech industry are anchored hard on the idea that AI is mostly a useless fad and will never get good enough to be useful.
They laugh off any suggestion that the trends point towards rapid improvements that could end up at superhuman abilities. Similarly, they completely dismiss arguments that AI might be used for building better AI: ‘Feed the bots their own slop and they’ll become even dumber than they already are!’
So, people who do believe that the trends are meaningful, and that we are near a dangerous threshold, want some kind of proof to show the doubters. They want people to start taking this seriously before it’s too late.
I do agree that the targeting of benchmarks by capabilities developers is totally a thing. The doubting-Thomases of the world are also standing in the way of the capabilities folks getting the cred and funding they desire. A benchmark designed specifically to convince doubters is a perfect tool for… convincing doubters who might then fund you and respect you.
I’m really getting annoyed by AI safety people making analogies to things that had way more evidence than the AI risk field ever got; the same thing happens with comparisons to climate change.