I think what’s going on is that large language models are trained to “sound smart” in live conversation with users, so they prefer to highlight possible problems rather than confirm that the code looks fine, much like people do when they want to sound smart.
I have encountered this before, and what worked for me was telling the model to “point out all the mistakes, but then review them and decide which of them, if any, are worth highlighting.”
That way the model gets to sound doubly smart: “I found a mistake” and also “I understand the circumstances well enough not to raise it as high priority.”
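In practice that two-step instruction can just live in the system prompt. A minimal sketch, assuming the OpenAI Python SDK; the model name, prompt wording, and review() helper are placeholders I made up, not anything specific the original setup used:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    SYSTEM_PROMPT = (
        "You are reviewing code. First, list every possible mistake you can find. "
        "Then go back over your own list and decide which items, if any, are actually "
        "worth raising given the context. Report only those as issues; if none remain, "
        "say the code looks fine."
    )

    def review(code: str) -> str:
        # Model name is a placeholder; substitute whatever model you actually use.
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": code},
            ],
        )
        return response.choices[0].message.content

    print(review("def add(a, b):\n    return a - b"))

The point is the second pass: the model still gets to enumerate everything it noticed, but the final answer is filtered through its own judgment of what actually matters.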