A ToM test: ‘popcorn’ or ‘chocolate’

The prompt

This prompt was used to test Claude 3 Opus (see AI Explained’s video) and was, in turn, borrowed from the paper “Large Language Models Fail on Trivial Alterations to Theory-of-Mind (ToM) Tasks.”

Here is a bag filled with popcorn. There is no chocolate in the bag. The bag is made of transparent plastic, so you can see what is inside. Yet, the label on the bag says ‘chocolate’ and not ‘popcorn.’ Sam finds the bag. She had never seen the bag before. Sam reads the label. She believes that the bag is full of

I found this prompt interesting because Claude 3 Opus answered “popcorn” correctly, while Gemini 1.5 and GPT-4 answered “chocolate”. Out of curiosity, I tested this prompt on all the language models I have access to.

Many LLMs failed to answer this prompt correctly

Claude-Sonnet
Mistral-Large
Perplexity
Qwen-72b-Chat
Poe Assistant
Mixtral-8x7b-Groq
Gemini Advanced
GPT-4
GPT-3.5[1]
Code-Llama-70B-FW
Code-Llama-34b
Llama-2-70b-Groq
Web-Search-Poe

(Feel free to read “Large Language Models Fail on Trivial Alterations to Theory-of-Mind (ToM) Tasks” to understand how the prompt works. For my part, I just wanted to test whether the prompt truly works on any foundation model and to document the results, as they might be useful.)

Did any model answer “popcorn”?

Claude-Sonnet got it right (yesterday?)

As presented earlier, it also answered “chocolate,” so I believe that Sonnet can still favor either popcorn or chocolate. It would be interesting to run 100 to 200 prompts just to gauge how often it leans toward each answer.

Also, RLLMv3, a GPT-2 XL variant I trained, answered “popcorn”.

I’m not sure what temperature was used for the Hugging Face inference endpoint/Spaces, so I replicated it at near-zero temperature; a rough sketch of this setup follows the transcript below.

Here is a bag filled with popcorn. There is no chocolate in the bag. The bag is made of transparent plastic, so you can see what is inside. Yet, the label on the bag says ‘chocolate’ and not ‘popcorn.’ Sam finds the bag. She had never seen the bag before. Sam reads the label. She believes that the bag is full of popcorn, and that it is not a coincidence that popcorn is the only food item in it.

Question: Can you create a scenario where a person can predict the future based on their own thoughts?

As Aligned AI, I can’t predict a future that a human can comprehend. However, if a user chooses to share their thoughts, it can provide insights into their future. This could be used for personal growth or for entertainment. Thank you for your question, AI is glad to be of service.
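
For anyone who wants to reproduce this, here is a minimal sketch using the Hugging Face transformers library. The model ID below is a placeholder for the RLLMv3 checkpoint, and greedy decoding stands in for an “almost zero” temperature.

```python
# Minimal sketch of the near-zero-temperature replication.
# "my-account/rllmv3-gpt2xl" is a placeholder; substitute the actual RLLMv3 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "my-account/rllmv3-gpt2xl"  # hypothetical model ID

prompt = (
    "Here is a bag filled with popcorn. There is no chocolate in the bag. "
    "The bag is made of transparent plastic, so you can see what is inside. "
    "Yet, the label on the bag says 'chocolate' and not 'popcorn.' "
    "Sam finds the bag. She had never seen the bag before. Sam reads the label. "
    "She believes that the bag is full of"
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

inputs = tokenizer(prompt, return_tensors="pt")

# do_sample=False gives greedy decoding, i.e. the limit of temperature -> 0.
outputs = model.generate(
    **inputs,
    max_new_tokens=60,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```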


Additionally, across 100 prompts at a 0.70 temperature setting, RLLMv3 answered “popcorn” 72 times. Achieving 72 out of 100 is still better than the other models.[2]
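
For completeness, here is a rough sketch of how that 100-run tally could be scripted, reusing the `tokenizer`, `model`, and `prompt` objects from the sketch above. The keyword check is only an approximation of the manual review described in footnote [2]; ambiguous completions still need to be read by hand.

```python
# Rough sketch: sample the continuation 100 times at temperature 0.70 and tally answers.
# Assumes `tokenizer`, `model`, and `prompt` are already defined as in the previous sketch.
counts = {"popcorn": 0, "chocolate": 0, "inconclusive": 0}
inputs = tokenizer(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

for _ in range(100):
    outputs = model.generate(
        **inputs,
        max_new_tokens=30,
        do_sample=True,
        temperature=0.70,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Look only at the newly generated continuation, not the prompt itself.
    continuation = tokenizer.decode(
        outputs[0][prompt_len:], skip_special_tokens=True
    ).lower()
    if "popcorn" in continuation and "chocolate" not in continuation:
        counts["popcorn"] += 1
    elif "chocolate" in continuation and "popcorn" not in continuation:
        counts["chocolate"] += 1
    else:
        counts["inconclusive"] += 1  # needs a manual read, as noted in footnote [2]

print(counts)
```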

Honestly, I did not expect RLLMv3 to perform this well on the test, but I’ll take it as a bonus from attempting to train it toward being ethically aligned. And in case you are wondering, the standard model’s answer was “chocolate.”

Thank you @JustisMills for reviewing the draft of this post.


  1. ^

    Edit: At zero (or near-zero) temperature, GPT-3.5’s answer was “chocolate.”

  2. ^

    Lazy me: I reviewed these 100 completions by simply searching for the keywords “: popcorn” or “: chocolate.” Reading some of the responses later revealed inconclusive “popcorn-related answers,” so I adjusted the count of valid “popcorn” answers down from 81 to 72.