While there may be problems with what I have suggested, I do not think the scenario you describe is a relevant consideration, for the following reasons:
As you describe it, the AI is still required to make a cheesecake; it just makes a poor one.
It should not take more than an hour to make a cheesecake, and the AI is optimizing for time. Also, the person may eat some cheesecake after it is made, so the AI must produce the virus, infect the person, and have the virus alter the person's mind within 1 hour, all while making a poor cheesecake.
Whatever resources the AI expends on the virus must be less than the added cost of making a reasonably good cheesecake rather than a poor one.
The legal system only has to identify a property law violation, which producing a virus and infecting people would be, so the virus must go undetected for more than 1 year.
Since it is of no benefit to the AI if the virus kills people, the virus would have to kill people by random chance, as a totally incidental side effect.
I would not claim that it is completely impossible for this to produce a virus leading to human extinction. Some have declared any probability of human extinction to be effectively of negative infinite utility, but I do not think this is reasonable, since there is always some probability of human extinction. Moreover, I do not think the scenario you describe contributes significantly to that probability.
Theorems are not generally presented in math journals in the way they were discovered, so I am not sure machine learning from journal articles would greatly help in discovery. The issue is really that going from question to answer is a different process from verifying that an answer is correct, or from guiding a reader through such a verification, which is what a proof is.
A perhaps less lofty, but still incredibly useful, goal would be automating a process for simplifying proofs.
Or, alternatively, convincing mathematicians to narrate their own mental process of discovery.
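To make the distinction between finding an answer and verifying one concrete, here is a toy sketch. It uses subset sum as a stand-in for a mathematical question. That choice is mine, and so are the function names `verify` and `discover`; nothing here is meant as a model of real theorem proving. The point is only that the checker reads a finished answer, while the searcher has to explore many candidates that never appear in the final write-up.

```python
from itertools import combinations


def verify(numbers, target, indices):
    """Check a proposed answer: one cheap pass over the given indices."""
    distinct = len(set(indices)) == len(indices)
    in_range = all(0 <= i < len(numbers) for i in indices)
    return distinct and in_range and sum(numbers[i] for i in indices) == target


def discover(numbers, target):
    """Find an answer by brute force: tries subsets of increasing size.

    In the worst case this examines every subset, which is exponential
    in len(numbers). None of the failed attempts show up in the result.
    """
    for size in range(len(numbers) + 1):
        for indices in combinations(range(len(numbers)), size):
            if sum(numbers[i] for i in indices) == target:
                return indices
    return None


if __name__ == "__main__":
    numbers = [3, 34, 4, 12, 5, 2]
    answer = discover(numbers, 9)
    print(answer, verify(numbers, 9, answer))
```

A journal proof is like the returned indices: enough to check the claim, with the search that produced it left out.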