The “use uncommon tools” example is familiar. Last year, I was really amazed by what Claude/Cursor could do in primary coding tasks, then appalled by how poorly that transferred to asking it to work with Jupyter/iPython notebooks via MCP. We’d been working on a notebook for 30 min, then it would screw up the tool call, conclude the notebook had been deleted, and attempt to create it fresh. This happened repeatedly. It’s just not the kind of mistake a human would make, which gets back to, how exactly do these minds work and form models of the world?
The “use uncommon tools” example is familiar. Last year, I was really amazed by what Claude/Cursor could do in primary coding tasks, then appalled by how poorly that transferred to asking it to work with Jupyter/iPython notebooks via MCP. We’d been working on a notebook for 30 min, then it would screw up the tool call, conclude the notebook had been deleted, and attempt to create it fresh. This happened repeatedly. It’s just not the kind of mistake a human would make, which gets back to, how exactly do these minds work and form models of the world?