Quality of work definitely matters more than ever right now. Duke University, near me, had a similar issue: one of its most published scientists, Anil Potti, had genomic cancer diagnostic work go all the way to the clinic before other researchers figured out that a lot of data had been removed to make the models work. Science has a good write-up on the study. It seemed like a similar case, where pressure and wanting to confirm promising results led to a snowball of lies, much like Elizabeth Holmes at Theranos. I think in good-cause work and competitive environments there is even more temptation to move forward before quality has been double-checked.
I would push back a bit on your specific example of double-checking code written by Claude by hand. Do you also expect people to read through the Python interpreter's source line by line before writing Python scripts? I’m not pushing back against double-checking itself, just saying that the way things are double-checked needs to be scalable. I don’t know the answer, but maybe it’s something along the lines of handwriting specific tests for the code you are generating.
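As a rough sketch of what I mean by handwriting tests: below, `median` is a hypothetical stand-in for a function an assistant generated (it is not from any real project), and `check_median` is the part a person would write themselves, from their own understanding of what the function should do rather than from reading the generated body.

```python
# Hypothetical generated code: treat the body as a black box.
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hand-written checks: these encode what I expect, independent of the
# implementation above.
def check_median():
    assert median([3, 1, 2]) == 2        # odd length, unsorted input
    assert median([4, 1, 3, 2]) == 2.5   # even length averages the middle pair
    assert median([7]) == 7              # single element
    data = [5, 1, 9]
    median(data)
    assert data == [5, 1, 9]             # the caller's list must not be mutated
    return True


if __name__ == "__main__":
    check_median()
    print("all hand-written checks passed")
```

This doesn't prove the generated code is correct, but the effort scales with the number of behaviors you care about rather than with the number of lines generated.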