Here’s something odd that I noticed in one of the examples in the blog post (https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html).
The question is the one that reads, in part, “the variance of the first n natural numbers is 10”. The model’s output states, without any reasoning, that this variance equals (n^2 − 1)/12, which is correct. Since no derivation is shown, I think it’s safe to assume that the model memorized this formula.
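The formula is easy to sanity-check numerically. Here is a minimal sketch, assuming the population variance (dividing by n rather than n − 1) and taking “the first n natural numbers” to mean 1 through n. The function names are my own, not anything from the blog post.

```python
from fractions import Fraction


def variance_first_n(n: int) -> Fraction:
    """Population variance of 1, 2, ..., n, computed directly from the values."""
    values = [Fraction(k) for k in range(1, n + 1)]
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def closed_form(n: int) -> Fraction:
    """The formula quoted in the model's output: (n^2 - 1) / 12."""
    return Fraction(n * n - 1, 12)


# The direct computation and the closed form agree exactly for every n tried.
assert all(variance_first_n(n) == closed_form(n) for n in range(1, 200))

# Solve "variance of the first n natural numbers is 10" by brute-force search.
n_for_variance_10 = next(n for n in range(1, 200) if closed_form(n) == 10)
print(n_for_variance_10)
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding.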
This is not a formula that a random math student would be expected to have memorized. (Anecdotally, I have a mathematics degree and don’t know it.) Because of that, I’d expect that a typical (human) solver would need to derive the formula on the spot. It also strikes me as the sort of knowledge that would be unlikely to matter outside a contest, exam, etc.
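For reference, deriving it on the spot would look roughly like this. It is a sketch assuming the population variance and the standard formulas for the sum of the first n integers and the sum of their squares; X here denotes a value drawn uniformly from 1, …, n, which is my notation rather than the question’s.

```latex
\begin{align*}
\mathbb{E}[X]   &= \frac{1}{n}\sum_{k=1}^{n} k = \frac{n+1}{2}, \\
\mathbb{E}[X^2] &= \frac{1}{n}\sum_{k=1}^{n} k^2 = \frac{(n+1)(2n+1)}{6}, \\
\operatorname{Var}(X)
  &= \mathbb{E}[X^2] - \mathbb{E}[X]^2
   = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} \\
  &= \frac{(n+1)\bigl[2(2n+1) - 3(n+1)\bigr]}{12}
   = \frac{(n+1)(n-1)}{12}
   = \frac{n^2 - 1}{12}.
\end{align*}
```

Setting (n^2 − 1)/12 = 10 gives n^2 = 121, so n = 11.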
That all leads me to think that the model might be overfitting somewhat to contest- and exam-style questions. By that I mean that it might be memorizing facts that help with such questions but are not useful when doing math more broadly.
To be clear, there are other aspects of the model output, here and in other questions, that seem genuinely impressive in terms of reasoning ability. But the headline accuracy rate might be inflated by memorization.
As someone who is pro-open-source, I do think that “AI isn’t useful for making bioweapons” is ultimately a losing argument, because AI is increasingly helpful at doing many different things, and I see no particular reason that making bioweapons would be an exception. However, that’s also true of many other technologies: good luck making your bioweapon without electric lighting, paper, computers, etc. It wouldn’t be reasonable to ban paper just because it’s handy for keeping a lab notebook in a bioweapons lab.
What would be more persuasive is some evidence that AI is relatively more useful for making bioweapons than it is for doing things in general. It’s a bit hard for me to imagine that being the case, so if it turned out to be true, I’d need to reconsider my viewpoint.