I have experienced similar problems to yours when building an AI tool: better models did not necessarily lead to better performance, despite what external benchmarks suggested. I believe there are two main reasons for this, both alluded to in your post:
1. Selection bias: when a foundation model company releases its newest model, it showcases performance on the benchmarks most favorable to that model, which may say little about your specific task.
2. Alignment: you mentioned that the AI does not truly understand the instructions you meant. Better prompts can mitigate this, but they do not fully solve the issue.
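One way to guard against both is a small internal eval on your own tasks, so a model swap is judged by your data rather than the vendor's charts. Here is a minimal sketch of that idea; the model names and the complete() wrapper are hypothetical stand-ins for whatever SDK you actually use, not a real API:

```python
# Minimal sketch of a task-specific eval harness. complete() is a
# hypothetical stand-in for a real model call; replace it with your SDK.
# The point: score candidate models on YOUR examples, not the vendor's
# favorite benchmarks.

def complete(model: str, prompt: str) -> str:
    """Hypothetical model call; returns a canned reply so the sketch runs offline."""
    return "Total due: 42.10"

def accuracy(model: str, cases: list[tuple[str, str]]) -> float:
    """Fraction of cases whose expected answer appears in the model's output."""
    hits = sum(
        expected.lower() in complete(model, prompt).lower()
        for prompt, expected in cases
    )
    return hits / len(cases)

if __name__ == "__main__":
    # Small, hand-written cases drawn from your actual workload.
    cases = [
        ("Extract the total from: 'Total due: $42.10'", "42.10"),
        ("Extract the total from: 'Amount payable: $7.50'", "7.50"),
    ]
    for model in ("model-old", "model-new"):  # hypothetical model IDs
        print(f"{model}: {accuracy(model, cases):.0%}")
```

Even a dozen such cases surfaced regressions for me that the published benchmarks never would have, since the harness measures the exact behavior the prompts are supposed to elicit.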