One type of question that is straightforward for humans to answer, but difficult to train a machine learning model to answer reliably, is “How much money is visible in this picture?” for images like this:
If you have pictures with bills, coins, and non-money objects in random configurations—with many items overlapping and partly occluding each other—it is still fairly easy for humans to pick out what is what from the image.
But getting an AI to do this would be more difficult than a normal image classification problem, where you can just fine-tune a vision model on a bunch of task-relevant training cases. It would probably require multiple denomination-specific vision models working together, as well as some robust way for the system to determine where one object ends and another begins.
I would also expect such an AI to be more confounded by any adversarial factors—such as the inclusion of non-money arcade tokens or drawings of coins or colored-in circles—added to the image.
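To make the shape of the problem concrete, here is a minimal sketch of only the final tallying step. It assumes some upstream detector (hypothetical, not shown) has already separated the overlapping objects and labeled each one. The label names, the `Detection` class, and the 0.5 confidence cutoff are all illustrative assumptions, not part of any real library. The hard part described above, getting reliable labels out of a cluttered image, is exactly what this sketch skips.

```python
from dataclasses import dataclass

# Face values in cents. The label names are hypothetical detector outputs,
# chosen for illustration only.
VALUE_CENTS = {
    "penny": 1, "nickel": 5, "dime": 10, "quarter": 25,
    "half_dollar": 50, "dollar_coin": 100,
    "bill_1": 100, "bill_5": 500, "bill_10": 1000, "bill_20": 2000,
}


@dataclass
class Detection:
    label: str         # what the (assumed) detector thinks the object is
    confidence: float  # detector confidence in [0, 1]


def total_cents(detections, min_confidence=0.5):
    """Sum the face value of confidently detected money items.

    Anything with a label outside VALUE_CENTS (arcade tokens, drawings of
    coins, colored-in circles) contributes nothing to the total.
    """
    total = 0
    for d in detections:
        if d.confidence < min_confidence:
            continue  # too uncertain to count
        total += VALUE_CENTS.get(d.label, 0)
    return total


example = [
    Detection("quarter", 0.97),
    Detection("bill_5", 0.97),
    Detection("arcade_token", 0.88),  # distractor: not money
    Detection("dime", 0.31),          # below the cutoff, so skipped
]
print(total_cents(example))  # 525
```

Note that the skipped dime shows why this is fragile: a single occluded or misread coin makes the whole answer wrong, whereas a human would usually just look more closely.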
Now, maybe to solve this in under one minute some people would need to start the timer when they already have a calculator in hand (or the captcha screen would need to include an on-screen calculator). But in general, as long as there is not a huge number of coins and bills, I don’t think this type of captcha would take the average person more than, say, 3-4 times longer than it takes them to complete the “select all squares with traffic lights” type captchas in use now. (Though some may want to familiarize themselves with the various $1.00 and $0.50 coins that exist, and some of the variations of the tails sides of quarters, if this becomes the new prove-you-are-a-human method.)
I presume you have in mind an experiment where (for example) you ask one large group of people “Who is Tom Cruise’s mother?” and then ask a different group of the same number of people “Who is Mary Lee Pfeiffer’s son?” and compare how many got the right answer in each group, correct?
(If you ask the same person both questions in a row, it seems obvious that a person who answers one question correctly would nearly always answer the other question correctly also.)
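For what it's worth, comparing the two groups would come down to something like a two-proportion z-test. Below is a minimal sketch using only the standard library. The function name and its arguments are my own, and any counts passed to it would be placeholders rather than real survey results. It also assumes the two groups are independent and large enough for the normal approximation to be reasonable.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two independent proportions.

    correct_a / n_a: people in group A who answered correctly / group size
    correct_b / n_b: the same for group B
    Returns (z, p_value).
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis that both groups
    # are equally likely to answer correctly.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("everyone (or no one) was correct; test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A call such as `two_proportion_z(correct_forward, n, correct_reverse, n)` would then say whether the gap between the "mother" direction and the "son" direction is larger than chance alone would explain.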