Thanks for the concrete examples. Can you help me understand how these translate from individual productivity to externally observable productivity?
3 days to make a medium sized project
I agree Docker setup can be fiddly. However, what happened with the 50+% savings: did you lower the price for the customer to stay competitive, do you do 2x as many paid projects now, did you postpone hiring another developer who is no longer needed, or do you just have more free time? And is there no change in support & maintenance costs compared to similar projects before LLMs?
processing isn’t more than ~500 lines of code
Oh well, my only paid experience is with multi-year project development & maintenance, and those projects are definitely not in the under-1kloc category 🙈, which might help explain my abysmal experience trying to use any AI tools for work (beyond autocomplete, but IntelliSense also existed before LLMs).
TBH, I am now moving towards the opinion that evals are very unrepresentative of the “real world” (if we exclude LLM wrappers as requested in the OP … though LLM wrappers, including evals, are becoming part of the “real world” too, so I don’t know. It’s like banking bootstrapping wealthy bankers: LLM wrappers might be bootstrapping wealthy LLM startups).
I can do more projects in parallel than I could have before. Which means that I have even more work now… The support and maintenance costs of the code itself are the same, as long as you maintain constant vigilance to make sure nothing bad gets merged. So the costs are moved from development to review. It’s a lot easier to produce thousands of lines of slop, which then have to be reviewed, with loads of suggestions made. It’s also easy for bad taste to be amplified, which is a real cost that might not be noticed that much.
There are some evals which work on large codebases (e.g. “fix this bug in django”), but those are the minority, granted. They can help with the scaffolding, though—those tend to be large projects in which a Claude can help find things.
But yeah, large files are ok if you just want to find something, but somewhere under 500 loc seems to be the limit of what will work well. Though you can get round it somewhat by copying the parts to be changed to a different file then copying them back, or other hacks like that...
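For what it's worth, the copy-out/copy-back hack can be sketched roughly like this in Python. This is only a minimal sketch under assumptions that are not in the comment above: the function names `extract_region` and `splice_region`, the scratch file, and the 1-indexed inclusive line range are all hypothetical, and it assumes nobody touches the rest of the big file while the scratch file is being edited.

```python
from pathlib import Path


def extract_region(source: Path, scratch: Path, start: int, end: int) -> None:
    """Copy lines start..end (1-indexed, inclusive) of source into scratch."""
    lines = source.read_text().splitlines(keepends=True)
    scratch.write_text("".join(lines[start - 1:end]))


def splice_region(source: Path, scratch: Path, start: int, end: int) -> None:
    """Replace lines start..end of source with the edited contents of scratch."""
    lines = source.read_text().splitlines(keepends=True)
    edited = scratch.read_text().splitlines(keepends=True)
    # Make sure the edited region ends with a newline so that the line
    # following it in the source file is not glued onto its last line.
    if edited and not edited[-1].endswith("\n"):
        edited[-1] += "\n"
    source.write_text("".join(lines[:start - 1] + edited + lines[end:]))
```

The idea would be to call `extract_region`, let the model edit only the small scratch file, then call `splice_region` with the same line range and review the resulting diff before merging.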