Definitely, and for mypy, where I was having similar issues but where it’s faster to just rerun, I did add it to pre-commit. But my point was about the broader issue: the LLM was perfectly happy to ignore even very strongly worded “this is essential for safety” rules, just for some cavalier expediency, which is obviously a worrying trend, assuming it generalizes. And I think my anecdote was slightly more “real life” than the made-up grid game of the original research (although of course far less systematically investigated).
Yes, you are right. But now I wonder: did you also try lessening its general drive to commit? It might have been trained on “hyper-agile” workflows where everything absolutely must be a commit, or else you are committing the sin of Waterfall or of solo coding or something.
Alongside the safety parts, put in stuff like “it is not essential to commit”, “in my workflow commits are only for finished, verified work”, and the like.
I also don’t know how the environment scaffolds its own prompts, so I’m guessing wildly.