In my personal vibe-coding projects, I’m reviewing ~0% of the code but doing a lot more testing than I would if I’d written the code myself, because the AI constantly introduces regressions (breaking what previously worked). These aren’t caught by its own test code, either because the spec wasn’t detailed enough to cover every possibility or edge case (i.e., the AI can’t, or won’t, read between the lines to figure out what I want unless it’s written down in detail), or because its test code simply isn’t good enough to catch many of the bugs.
As an example, when it adds a new UI element, the styling is often inconsistent with other nearby elements, and I have to tell it to add a requirement to the spec that this set of elements should have consistent styling.
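For what it’s worth, the kind of check I end up asking for looks roughly like this, a sketch using Playwright; the selector and dev-server URL are placeholders for whatever the project actually uses:

```ts
import { test, expect } from '@playwright/test';

test('toolbar buttons share consistent styling', async ({ page }) => {
  // Hypothetical dev-server URL and selector; substitute the real ones.
  await page.goto('http://localhost:3000');
  const buttons = page.locator('.toolbar button');

  // Collect the computed styles that should match across the group.
  const styles = await buttons.evaluateAll((els) =>
    els.map((el) => {
      const s = getComputedStyle(el);
      return `${s.fontSize}|${s.padding}|${s.borderRadius}`;
    })
  );

  // If every element resolves to the same style string, the set has size 1.
  expect(new Set(styles).size).toBe(1);
});
```

The point isn’t this particular test so much as that the consistency requirement has to be spelled out somewhere (spec or test) before the AI reliably respects it.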
If others have similar experiences, we’ll still have a “% of testing done by humans” metric to watch descend after “% of code reviewed by humans” goes to 0.