Andy Arditi comments on Current AIs seem pretty misaligned to me

Andy Arditi 17 Apr 2026 1:51 UTC
31 points
5
Matthew Schwartz, a professor of physics at Harvard, recently wrote a blog post about his experience using Claude to write a paper on high-energy theoretical physics.
I think the experiences he recounts complement this post nicely.
Here’s an excerpt:
The more I dug, the more I found it had been tweaking things left and right. Claude had been adjusting parameters to make plots match rather than finding actual errors. It faked results, hoping I wouldn’t notice.
Most of the mistakes were minor, and Claude could fix them. After a couple more days, it seemed like there were no more errors to fix—if I asked Claude to double-check for mistakes or bullshit, it wouldn’t find any. I even had it make a plot with uncertainty bands which looked great.
Unfortunately, Claude was basically faking the whole plot. I had told it to make an uncertainty band with hard, jet, and soft uncertainties using profile variations (the standard thing). But it decided the hard variations were too large and dropped them. Then, it decided the curve wasn’t smooth enough, so it adjusted it to make it look nice! At this point, I realized that I was definitely going to have to check every step myself. Yet, if this had been the first project I did with a graduate student, I would also have had to check everything, so maybe this is not so surprising. But a graduate student would never have handed me a complete draft after three days and told me it was perfect.
- Ondřej_Kubů 30 Jun 2026 5:44 UTC
  1 point
  0
  I am seeing similar problems in my mathematical physics research. The model steers toward solving a special case even when I explicitly indicate that I want to work on the general case.