This is a capabilities game. It is neither alignment nor safety. To the degree it's forecasting, it helps cause the thing it forecasts. This has been the standard pattern in capabilities research for a long time: someone builds a benchmark (say, ImageNet, with its 1.3M images across 1,000 classes), which produces a leaderboard that lets people show how good their learning algorithm is on novel datasets. In some cases this even produced models that were directly and generally useful, but traditionally it was used to show how well an algorithm would work in a new context from scratch. Building benchmarks like this gives teams a new way to brag: they may already have a better source of training data (e.g., Google has always had better training data than ImageNet), but the benchmark lets them brag that they scored well on it, which among other things helps them get funding.
Perhaps it also helps convince people to be concerned; that would trade off against this. Perhaps it's lacking in some way as a bragging-rights challenge; that would also trade off against this.
Thank you for your comment!
I'm not sure I agree with you about which way the tradeoff shakes out. To me it seems valuable that people outside the main labs have a clear picture of the capabilities of the leading models, and of how those capabilities evolve over time, but I see your point that it could also encourage or help capabilities work, which is not my intention.
I’m probably guilty of trying to make the benchmark seem cool and impressive in a way that may not be helpful for what I actually want to achieve with this.
I will think more about this and read what others have written on the question. At the very least I will keep your perspective in mind going forward.