Ha, and I have been writing up a long-form piece on when AI-coded GOFAI might become effective, one might even say unreasonably effective.
LLMs aren’t very good at learning in environments with very few data samples, such as “learning on the job” or interacting with the slow real world. But there often exist heuristics, ones that are difficult to run on a neural net, that have excellent specificity and can prove their predictive power with a small number of examples. You can try to learn the positions of the planets by feeding 10,000 examples into a neural network, but you’re much better off with Newton’s laws coded into your ensemble. Data-constrained environments (like, again, robots and learning on the job) are domains where the bitter lesson might not have bite.
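A toy sketch of what I mean, using a made-up uniform circular orbit rather than real ephemerides: with ten noisy samples, a two-parameter model with the physics coded in extrapolates fine, while a generic flexible fit (standing in for the learn-it-all-from-data approach) falls apart outside the training window.

```python
# Toy illustration only: a coded prior vs. a flexible fit on 10 samples.
import numpy as np

rng = np.random.default_rng(0)

# "Ground truth": uniform angular motion, theta(t) = theta0 + omega * t
theta0_true, omega_true = 0.3, 2 * np.pi / 365.25    # rad, rad/day (toy values)
t_train = np.linspace(0, 30, 10)                      # 10 samples over one month
theta_train = theta0_true + omega_true * t_train + rng.normal(0, 1e-3, t_train.size)

# Structured model: only (theta0, omega) are learned; the physics is in the code.
A = np.column_stack([np.ones_like(t_train), t_train])
theta0_fit, omega_fit = np.linalg.lstsq(A, theta_train, rcond=None)[0]

# Generic model: a degree-7 polynomial as a stand-in for an over-flexible learner.
poly = np.polynomial.Polynomial.fit(t_train, theta_train, deg=7)

# Extrapolate a year out, far beyond the training window.
t_test = 365.0
print(f"truth:       {theta0_true + omega_true * t_test:10.3f} rad")
print(f"orbit model: {theta0_fit + omega_fit * t_test:10.3f} rad")
print(f"polynomial:  {poly(t_test):10.3f} rad  # usually wildly off")
```

The point isn’t that polynomials are bad; it’s that with ten samples the structure has to come from somewhere, and coding it in is much cheaper than learning it.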
As a former robotics developer, I feel the bitter lesson in my bones. This is actually one of the points I plan to focus on when I write up the longer version of my argument.
High-quality manual dexterity (and real-time visual processing) in a cluttered environment is a heartbreakingly hard problem, using any version of GOFAI techniques I knew at the time. And even the most basic of the viable algorithms quickly turned into a big steaming pile of linear algebra mixed with calculus.
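For anyone who hasn’t lived it, here is roughly what the most basic end of that pile looks like: a single damped-least-squares inverse-kinematics step for a toy planar 3-link arm (my own illustration, not from any particular robot stack), and this is before you touch contact, friction, perception, or clutter.

```python
# Toy planar 3-link arm: forward kinematics, Jacobian, and one DLS step.
import numpy as np

LINKS = np.array([0.3, 0.25, 0.15])   # link lengths in meters (made up)

def fk(q):
    """Forward kinematics: end-effector (x, y) for joint angles q."""
    angles = np.cumsum(q)
    return np.array([np.sum(LINKS * np.cos(angles)),
                     np.sum(LINKS * np.sin(angles))])

def jacobian(q):
    """2x3 Jacobian of fk, from the usual partial derivatives (the calculus part)."""
    angles = np.cumsum(q)
    J = np.zeros((2, 3))
    for i in range(3):
        J[0, i] = -np.sum(LINKS[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(LINKS[i:] * np.cos(angles[i:]))
    return J

def ik_step(q, target, damping=0.01):
    """One damped-least-squares step toward target (the linear algebra part)."""
    J = jacobian(q)
    err = target - fk(q)
    dq = J.T @ np.linalg.solve(J @ J.T + damping * np.eye(2), err)
    return q + dq

q = np.array([0.1, 0.2, 0.3])
for _ in range(50):
    q = ik_step(q, np.array([0.4, 0.3]))
print(fk(q))   # should land near (0.4, 0.3)
```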
As someone who has done robotics demos (and who knows all the things an engineer can do to make sure the demos go smoothly), the Figure AI groceries demo still blows my mind. This demo is well into the “6 impossible things before breakfast” territory for me, and I am sure as hell feeling the imminent AGI when I watch it. And I think this version of Figure was an 8B VLLM connected to an 80M specialized motor control model running at 200 Hz? Even if I assume that this is a very carefully run demo showing Figure under ideal circumstances, it’s still black magic fuckery for me.
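For readers who want the shape of that kind of stack: I don’t know Figure’s actual implementation, so treat this as a generic cartoon of a two-rate architecture, a slow vision-language planner refreshing a latent goal every second or so and a small fast policy consuming the latest goal at 200 Hz. All the names and numbers in it are placeholders.

```python
# A cartoon of a two-rate control stack -- NOT Figure's code or API.
import time

PLANNER_PERIOD_S = 1.0      # big model: ~1 Hz (assumed, for illustration)
CONTROL_PERIOD_S = 0.005    # small model: 200 Hz

def plan(image, instruction):
    """Stand-in for the large vision-language model: returns a latent goal."""
    return [0.0] * 64  # placeholder latent

def act(latent_goal, joint_state):
    """Stand-in for the small fast policy: returns joint position targets."""
    return joint_state  # placeholder: hold position

def control_loop(get_image, get_joints, send_targets, instruction, duration_s=2.0):
    latent = plan(get_image(), instruction)
    next_plan = time.monotonic() + PLANNER_PERIOD_S
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        tick = time.monotonic()
        if tick >= next_plan:                        # slow loop: refresh the goal
            latent = plan(get_image(), instruction)
            next_plan = tick + PLANNER_PERIOD_S
        send_targets(act(latent, get_joints()))      # fast loop: 200 Hz control
        time.sleep(max(0.0, CONTROL_PERIOD_S - (time.monotonic() - tick)))

if __name__ == "__main__":
    control_loop(get_image=lambda: None,
                 get_joints=lambda: [0.0] * 7,
                 send_targets=lambda targets: None,
                 instruction="put the groceries away",
                 duration_s=1.0)
```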
But it’s really hard to communicate this intuitive reaction to someone who hasn’t spent years working on GOFAI robotics. Some things seem really easy until you actually start typing code into an editor and booting it on actual robot hardware, or until you start trying to train a model. And then these things reveal themselves as heartbreakingly difficult. And so when I see VLLM-based robots that just casually solve these problems, I remember years of watching frustrated PhDs struggle with things that seemed impossibly basic.
For me, “fix a leaky pipe under a real-world, 30-year-old sink without flooding the kitchen, and deal with all the weird things that inevitably go wrong” will be one of my final warning bells of imminent general intelligence. Especially if the same robot can also add a new breaker to the electrical panel and install a new socket in an older house.
The robots didn’t open the egg carton and put the eggs individually into the rack inside the fridge, obviously crap, not buying the hype. /s