This is an obvious step, but I’m a bit skeptical for a few reasons.
Current models are just so bad at vision tasks. Even Gemini 2.5 is pretty bad and falls apart if pushed to harder images. Identifying a feature on a part, or telling whether a part is symmetric, really does seem like something that could be addressed by just scaling data, and these vision tasks are much easier than actual manufacturing details.
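For a sense of why these count as "easy" tasks: in the controlled case, a symmetry check doesn't even need a learned model. Here's a minimal classical-CV sketch; the threshold and the assumption of a centered, square-on photo are mine, purely for illustration:

```python
import cv2
import numpy as np

def looks_symmetric(image_path: str, threshold: float = 0.05) -> bool:
    """Crude left-right symmetry check: mirror the image and measure
    how much it differs from the original. Only meaningful if the part
    is centered and photographed square-on (big assumptions)."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(image_path)
    mirrored = cv2.flip(img, 1)  # flip around the vertical axis
    # Mean absolute pixel difference, normalized to [0, 1]
    diff = np.mean(np.abs(img.astype(float) - mirrored.astype(float))) / 255.0
    return diff < threshold  # threshold is arbitrary, for illustration
```

The task is mechanically simple in the controlled case; it's the messy real-world version where current models fall apart.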
A lot of the work in manufacturing / construction would be in tactile details, which could be hard to capture with sensors. For example, a human finger can easily feel a step of 0.001 inches, which would be invisible on video, and I would often use this fine-grained tactile detail when diagnosing problems.
The current reasoning paradigm requires scaling up RL. Where is the reward signal here? The most obvious thing I can think of is creating a bunch of simulated environments. But almost all machinists I've talked to have (completely valid) complaints about engineers who understand textbook formulas and CAD but don't understand real-world manufacturing constraints. Simulation environments seem likely to create AIs with the same shortcomings.
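To make the reward-signal question concrete, the simulated-environment answer would look roughly like the toy below. Everything here is invented (the env, the idealized cutting model, the tolerances); the comments mark exactly what the sim leaves out, which is the machinists' complaint:

```python
import random

class ToySimulatedMachiningEnv:
    """Hypothetical simulated machining task: cut a part down to a
    target dimension. The reward is the kind a sim can grade cheaply:
    did the final dimension land in tolerance?"""

    def __init__(self, target: float = 1.000, tolerance: float = 0.001):
        self.target = target
        self.tolerance = tolerance
        self.dimension = 0.0

    def reset(self) -> float:
        # Start with oversized stock
        self.dimension = self.target + random.uniform(0.010, 0.050)
        return self.dimension

    def step(self, cut_depth: float):
        # Idealized cut: no tool wear, no deflection, no chatter,
        # no thermal growth, no material variation -- everything a
        # real machinist actually has to manage is absent, so the
        # policy gets rewarded for strategies that only work on paper.
        self.dimension -= cut_depth
        done = self.dimension <= self.target + self.tolerance
        in_spec = abs(self.dimension - self.target) <= self.tolerance
        reward = 1.0 if (done and in_spec) else (-1.0 if done else 0.0)
        return self.dimension, reward, done
```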
They’re pretty bad, but they seem about GPT-2 level bad? So plausibly in a couple of years they will be GPT-4 level good, if things go the same way?
This does seem pretty difficult. The only idea I have is having humans wear special gloves with sensors on them, maybe explaining their thoughts aloud as they work, and collecting all of this data.
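If you did collect that kind of data, a single logged timestep might look something like the sketch below. Every field name and sensor channel is invented; the only real claim is that you'd need to sync the touch signals with the think-aloud transcript:

```python
from dataclasses import dataclass, field

@dataclass
class GloveFrame:
    """One hypothetical timestep of instrumented-glove data, synced
    with a think-aloud audio transcript. All fields are illustrative."""
    timestamp_s: float
    # Per-fingertip force readings (e.g., five sensors), in newtons
    fingertip_force_n: list[float] = field(default_factory=list)
    # High-frequency vibration as the finger drags across a surface --
    # plausibly the channel that could capture a 0.001" step that a
    # camera never sees
    vibration_signal: list[float] = field(default_factory=list)
    # Hand pose from IMUs / flex sensors in the glove
    joint_angles_deg: list[float] = field(default_factory=list)
    # What the worker said they were doing or feeling at this moment
    transcript_snippet: str = ""
```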
Before you go to RL, you need to train on prediction with a large amount of data first, and we don't have this yet for blue-collar work. Then, once you have the prediction model, robots, and rudimentary agents, you try to get the robots to do simple tasks in isolated environments, and reward them when they succeed. This feels quite a bit more than 3 years away...
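As a sketch of that sequencing, here's the shape of the pipeline as a toy: fake sensor traces stand in for the prediction data, a single scalar stands in for the policy, and naive hill-climbing stands in for RL. None of this resembles a real training stack; it's just the two-phase structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Phase 1 (the part we don't have data for yet): large-scale prediction
# pretraining. Stand-in: fit a next-step predictor on fake sensor traces.
x = rng.normal(size=1000)
x_next = 0.9 * x + rng.normal(scale=0.1, size=1000)
w = np.sum(x * x_next) / np.sum(x * x)  # least-squares next-step model
# (unused below; a real pipeline would initialize the policy from it)

# Phase 2: RL on a simple isolated task with a sparse success reward.
# Toy task: cut oversized stock down into a 0.001 tolerance band.
def run_episode(cut_depth: float) -> float:
    dimension = 1.0 + rng.uniform(0.010, 0.050)  # oversized stock
    for _ in range(100):  # cap on passes
        if dimension <= 1.001:
            break
        dimension -= cut_depth
    return 1.0 if abs(dimension - 1.0) <= 0.001 else 0.0  # reward only on success

# "Policy improvement" via naive hill-climbing on the episode reward
best_depth, best_reward = 0.010, 0.0
for _ in range(200):
    candidate = abs(best_depth + rng.normal(scale=0.002)) + 1e-4
    avg_reward = float(np.mean([run_episode(candidate) for _ in range(20)]))
    if avg_reward > best_reward:
        best_depth, best_reward = candidate, avg_reward
# What gets learned: take lighter finishing passes to land in tolerance.
```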
In general, I think the idea is that you first get a superhuman coder, then a superhuman AI researcher, then an any-task superhuman researcher, and then you use that researcher to solve all of the problems we have been discussing at lightning speed.
Yeah, I agree. My current feeling is that today's ML approach is going to make very little real-world manufacturing progress, and that any progress will have to come from the automated AI researcher either brute-forcing tons of synthetic data or coming up with new architectures and training procedures.
But, this is a low confidence take, and I wouldn’t be shocked if a couple dumb tricks make a lot of progress.