[Question] Could transformer network models learn motor planning like they can learn language and image generation?

Humans are really good at moving in space. We do motor planning and especially error correction much better than existing robot control systems.

Could a transformer model trained on a large library of annotated human movement data serve as a controller for a humanoid robot or robotic limb? My impression is that a movement model might not be useful for directly controlling servos, because human and robot bodies are so different. But perhaps it could improve the motor planning and error correction layer?

The data would presumably take the form of a 3D wireframe model of the limb/body and its trajectory through space, the goal of the movement (“pour water from this cup to that cup”), and some rating of success or failure.
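To make that concrete, here is a rough sketch of what one training record might look like. Everything in it is my own guess for illustration: the `MovementSample` name, the field names, the choice of metres, and the 0-to-1 success rating are assumptions, not an existing dataset format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical: one joint of the wireframe as an (x, y, z) position in metres.
Joint = Tuple[float, float, float]


@dataclass
class MovementSample:
    """One annotated movement, as imagined in the paragraph above."""
    goal: str                  # free-text goal annotation
    frames: List[List[Joint]]  # frames[t][j] = position of joint j at time step t
    success: float             # assumed rating: 0.0 (failure) to 1.0 (success)

    def flatten(self) -> List[List[float]]:
        """Turn each frame into one flat vector, i.e. one 'token' per time step."""
        return [[coord for joint in frame for coord in joint]
                for frame in self.frames]


# A toy two-joint limb recorded over two time steps.
sample = MovementSample(
    goal="pour water from this cup to that cup",
    frames=[
        [(0.0, 0.0, 0.0), (0.3, 0.0, 0.1)],
        [(0.0, 0.0, 0.0), (0.3, 0.1, 0.2)],
    ],
    success=1.0,
)
tokens = sample.flatten()
```

Flattening each frame into a single vector is just one way the trajectory could be presented to a sequence model, with one vector per time step playing roughly the role a word token plays in text.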

I don’t have experience with either LLMs/transformer models or robotics, so this question might miss some obvious points, but I couldn’t get the idea out of my head!
