ACCount comments on Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study

ACCount 15 Apr 2025 9:21 UTC
−10 points
−10
This post feels at least 80% AI generated. That aside, which part of “AI that was never trained or optimized to do machining makes mistakes in machining” is surprising, exactly?
Mainstream LLMs are not trained to perform physical tasks out in the real world at all. It doesn’t matter how cutting edge they are: you can’t expect to cram an off-the-shelf LLM into a robot body and have it perform well. It took a lot of reinforcement learning elbow grease to get AIs to be any good at math or coding, and robotics companies are now having to do a lot of specialized training to get robot AI that’s competent at tasks on the level of “pick up that can”, let alone physical reverse engineering or complex manufacturing operations.
That doesn’t mean that we won’t get an AI that’s superhuman at machining in a few years. Or a few weeks. It’s just stupid to expect today’s mainstream AIs to be there already.
- Neel Nanda 15 Apr 2025 10:19 UTC
  11 points
  4
  Modern LLMs are good at a lot of things they weren’t explicitly trained for. I was surprised by the negative results in this post.
  - Davidmanheim 15 Apr 2025 13:26 UTC
    7 points
    2
    Yeah, I’m only unsurprised because I’ve been tracking other visual reasoning tasks and already updated towards verbal intelligence of LLMs being pretty much disconnected from spatial and similar reasoning. (But the visual classes of task seem not obviously harder, and visual data generation is very feasible at scale, so I do expect reasonably rapid future progress now that it is a focus, conditional on sufficient attention from developers.)
    - Neel Nanda 15 Apr 2025 20:36 UTC
      2 points
      0
      Ah thanks for clarifying! That’s very reasonable