I think we have different viewpoints on what the frontier is. The majority of the 20% improvements mentioned in this post are things I came up with myself, and they are pretty surface level. I have only been looking at LLMs for 6 months, in free time outside work, as something to tinker with, and I obviously don’t consider myself an expert. I would anticipate that the actual research frontier at labs is substantially ahead, such that any moral discussion around this post is akin to debating whether an 11th grade chemistry lab will encourage the creation of nuclear weapons.
I don’t think you’re doing a very good job of understanding capabilities
Part of my hope in posting was to get technical feedback from a crowd that is knowledgeable about AI systems. Curious if you can be more specific about why you believe this.
AI development feels more similar to biology than to chemistry. Bright 11th graders shouldn’t be doing experiments on culturing some previously unculturable pathogen that would be a good bioweapon target and discussing their results, since the field is wide and shallow and it’s not entirely impossible that their experiments are novel. On the other hand, if they’re running basic experiments on culturing some specific common bacterium (e.g. E. coli) better, they probably don’t need to worry about accelerating bioweapon development, even if there is a chance of them making a slight advancement to the field of biology as a whole.
The nanogpt speedrun feels more like developing better methods to culture E. coli at a hobbyist level, and quite unlikely to lead to any substantial advancement applicable to the operational efficiency of well-funded companies at the frontier. Still, it probably is worth keeping track of when the work you’re doing approaches the “this is actually something novel the frontier labs might use” mark, particularly if it’s something more substantial than “here’s how to use the hardware more efficiently to train this particular model”.