In my experience, Semgrep does not cope well with finding cross-class behavior in dynamically typed codebases that lean heavily on dependency injection. That is why I was trying to get Claude to write code that combined static analysis (reflection or AST parsing) with runtime logic, to gather information that is hard to determine statically but easy to determine at runtime.
For reference, the code I ended up writing for this part was about 40 lines and wasn’t very complicated. Doing it in full generality purely by static analysis would be insanely complex, because PHP has terrible constructs like $$foo = "bar" and $foo->$bar = "baz". This codebase doesn’t use them, and that is trivial to verify, but they would be a nightmare to handle if it did. Fortunately, full generality wasn’t what I needed.
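To make the "trivially verified" part concrete, here is a minimal sketch of such a check. It is not the code I actually used. It is a rough regex scan in Python, and the pattern set and the function name find_dynamic_constructs are made up for illustration. A regex is cruder than PHP's tokenizer or a real AST parser, so it can flag matches inside strings or comments, which is acceptable when the goal is only to confirm there are zero hits.

```python
import re

# Hypothetical patterns for PHP's dynamic-name constructs. This is a crude
# text scan, not a parser: it may also match inside strings or comments.
DYNAMIC_PATTERNS = {
    # $$foo or ${...}
    "variable variable": re.compile(r"\$\$\w|\$\{"),
    # $obj->$name or $obj->{...} (also matches the nullsafe ?-> form)
    "dynamic property": re.compile(r"->\s*[\$\{]"),
}


def find_dynamic_constructs(php_source):
    """Return (line_number, kind, line_text) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        for kind, pattern in DYNAMIC_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, kind, line.strip()))
    return hits


# Small made-up sample: two dynamic constructs, then ordinary accesses.
sample = '''<?php
$$foo = "bar";
$foo->$bar = "baz";
$foo->bar = Foo::$baz;
'''

for hit in find_dynamic_constructs(sample):
    print(hit)
```

Ordinary property access and static properties like Foo::$baz are deliberately not flagged, since those names are fixed in the source. An empty result over the whole codebase is what lets the rest of the analysis ignore these constructs.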
But yeah, I also expected Claude to be able to do this trivially. It trivially handles most tasks that feel, to me, about this difficult or even a bit harder. If anything, this task felt like it should have been easier, since there’s a lot of signal available for self-correcting after a mistake, much more so than for many of the “build and test a feature” style tasks that Claude regularly does with no drama. Which is why I thought it would be a good example for a post along the lines of “many people use LLMs to quickly add sloppy features to their codebase, increasing technical debt, but it’s also possible to use them to resolve technical debt much faster than doing it by hand”. And then I tried it.