I definitely don’t see a problem with taking lab funding as a safety org. (As long as you don’t claim otherwise.)
I definitely don’t have a problem with this either; it just needs to be much more transparent and carefully thought-out than how it happened here.
If you think they didn’t train on FrontierMath answers, why do you think having the opportunity to validate on it is such a significant advantage for OpenAI?
My concern is that “verbally agreeing to not use it for training” leaves a lot of room to still use it as a significant advantage. For instance, do we know that they did not use it indirectly to validate a process reward model (PRM) that could in turn help a lot? I don’t think making a validation set out of their own training data would be as effective.
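To make the concern concrete, here is a minimal, purely hypothetical sketch (the function names and setup are mine, not anything known about any lab's pipeline) of how a held-out benchmark can confer an advantage through model selection alone, with no training on its answers:

```python
import random

def frontiermath_accuracy(solver) -> float:
    # Stand-in for "run this solver on the held-out benchmark and score it".
    # A random placeholder here so the sketch is runnable end to end.
    return random.random()

def make_solver(base_model, prm):
    # Hypothetical: pair a base model with a process reward model (PRM) that
    # scores or re-ranks its intermediate reasoning steps.
    return (base_model, prm)

def pick_best_prm(base_model, prm_candidates):
    # Model selection against the benchmark is "validation", not "training",
    # but it still routes the benchmark's signal into which system gets shipped.
    return max(
        prm_candidates,
        key=lambda prm: frontiermath_accuracy(make_solver(base_model, prm)),
    )

print(pick_best_prm("base-model", ["prm-a", "prm-b", "prm-c"]))
```

The point is just that choosing between candidate PRMs (or any other components) by their held-out benchmark score is itself an optimization loop, even if no benchmark answer ever enters a training set.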
Re: “maybe it would have took OpenAI a bit more time to contract some mathematicians, but realistically, how much more time?”: Not much; they might have done this independently as well (assuming the mathematicians they’d contact would be equally willing to contribute directly to OpenAI).