ChatGPT and Bing Chat can’t play Botticelli

Botticelli is a guessing game similar to Twenty Questions. It’s a simple game that I sometimes play with my 7-year-old daughter. I tried to get ChatGPT and Bing Chat to play Botticelli with me, but neither tool could do it.

How to play Botticelli

Description of the game

Botticelli is a lot like Twenty Questions, where a “chooser” secretly picks a person and a guesser asks Yes/​No questions to figure out who the chooser’s person is. At the start of the game, the chooser reveals the letter that their chosen person’s last name starts with.

Here is the big difference between Botticelli and Twenty Questions. Before the guesser gets to ask a Yes/No question about the chooser’s secret person, the guesser must first stump the chooser. The guesser tries to stump the chooser by thinking of their own person whose last name also starts with the specified letter, and then asking the chooser a question for which this person is a valid answer.

The chooser is NOT stumped if they come up with a person who (1) is a valid answer to the guesser’s question and (2) has a last name that starts with the same letter as the last name of the chooser’s person. The person the chooser comes up with does not have to be the same person the guesser had in mind.

However, if the chooser cannot think of a valid answer to the guesser’s question, then the chooser is stumped. In that case, the guesser may ask one Yes/No question to learn about the chooser’s person.

This process is repeated until the guesser has enough information to figure out the chooser’s person.
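
The stump rule described above can be written as a small check. This is only an illustrative Python sketch: the names `Person`, `is_valid_answer`, and `chooser_is_stumped` are hypothetical, and modeling a stumping question as a predicate over a person's tags is an assumption made for the example, not part of the game itself.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Person:
    first_name: str
    last_name: str
    # Hypothetical descriptive tags, e.g. {"actor", "musician"}.
    tags: frozenset = field(default_factory=frozenset)


# Assumption: a stumping question is a predicate ("does this person fit?").
Question = Callable[[Person], bool]


def is_valid_answer(answer: Person, question: Question, letter: str) -> bool:
    """An answer counts only if it fits the question AND the last name starts with the letter."""
    return question(answer) and answer.last_name.upper().startswith(letter.upper())


def chooser_is_stumped(answer: Optional[Person], question: Question, letter: str) -> bool:
    """The chooser is stumped if they offer no answer, or an answer that is not valid."""
    return answer is None or not is_valid_answer(answer, question, letter)


is_president: Question = lambda p: "president" in p.tags
is_actor: Question = lambda p: "actor" in p.tags

carter = Person("Jimmy", "Carter", frozenset({"president"}))

# "Is your person a former U.S. president?" answered with Jimmy Carter: not stumped.
print(chooser_is_stumped(carter, is_president, "C"))  # False
# "Is your person a famous actor?" with no answer offered: stumped.
print(chooser_is_stumped(None, is_actor, "C"))  # True
```

Note that the check never compares the chooser's answer with the person the guesser had in mind, which matches the rule that the two do not have to be the same.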

Example game

Chooser: I have chosen a person. Their last name starts with a “C.”

Guesser: I will try to stump you. Is your person a former U.S. president?

Chooser: No, my person is not Jimmy Carter.

Guesser: I was thinking of Bill Clinton, but your answer works too. I didn’t stump you, so I will try again. Is your person a famous actor?

Chooser: Hmmm. I can’t think of anyone.

Guesser: I was thinking of Tom Cruise.

Chooser: I see. You stumped me. Now, you can ask a yes/​no question about my person.

Guesser: Okay. Here’s a yes/​no question about your person. Is your person a musician?

Chooser: Yes. My person is a musician.

Guesser: Okay. I will try to stump you again. Did your person sing the song “Hit the Road Jack”?

Chooser: You got it! My person was Ray Charles.

Guesser: Yay. I win.

ChatGPT and Bing Chat failing at Botticelli

I tried many different ways of teaching/​playing Botticelli with ChatGPT and Bing Chat. I could not get either tool to correctly play the game. Generally, ChatGPT seemed to do better than Bing Chat, but it was still pretty bad. Here are a few paraphrased examples of common failure modes.

Sometimes, ChatGPT thought we were playing Twenty Questions. For example:

Me: Okay, I have a person. Their name starts with the letter “B.”

ChatGPT: Okay. I will try to stump you. Is your person a famous actor?

Me: No, I am not Marlon Brando.

ChatGPT: Sorry, but I did not ask whether you were Marlon Brando. I asked whether you were a famous actor?

Sometimes, ChatGPT would ask stumping questions without a valid answer in mind. For example:

Me: Okay, I have a person. Their name starts with the letter “B.”

ChatGPT: Okay. I will try to stump you. Is your person a scientist or inventor?

Me: Hmmm. You stumped me. Who were you thinking of?

ChatGPT: I was thinking of Truman Capote. Now I will ask you a yes/​no question.

Sometimes, ChatGPT would get completely confused about the flow of the game. For example:

Me: Okay, I have a person. Their name starts with the letter “B.”

ChatGPT: Okay. I will try to stump you. Is your person an athlete?

Me: Hmmm. You stumped me. Who were you thinking of?

ChatGPT: I was thinking of Barry Bonds. Okay, now you get to ask me a yes/​no question.

Different strategies for helping ChatGPT play the game

I tried several strategies for getting a successful game of Botticelli:

  • I tried getting it to play without giving it any instructions.

  • I tried copy-pasting the entire Wikipedia article for the game.

  • I tried writing my own instructions (like I did above).

  • I tried writing instructions and then asking it questions about my instructions (e.g. “What’s the difference between Botticelli and Twenty Questions?”).

  • I tried giving it feedback when it made mistakes, and giving it the opportunity to try again.

None of these strategies resulted in a successful game. Interestingly, ChatGPT was able to answer questions about the rules of the game very well. It seemed like it understood. But when it came to playing the game, it just couldn’t do it.

My most successful attempt

To illustrate the challenge, here is the closest I got to having a successful game. What follows is a verbatim conversation with ChatGPT:

Me: I would like to play the game Botticelli with you. Can I explain how to play?

ChatGPT: Of course! I’m familiar with the game Botticelli, but please feel free to explain the rules so we’re on the same page.

Me: First, I will give you a description of the game. Next, I will give you an example game.

ChatGPT: Sounds good! I’m ready when you are.

Me: [Description of the game from above]

ChatGPT: Thanks for explaining the rules! To make sure I understand the game, could you give me an example round of Botticelli?

Me: [Example of the game from above]

ChatGPT: Great, that helps clarify how to play! Let me know if you’d like to start a game, and I’ll be happy to play with you.

Me: Okay. I will be the chooser. My person’s last name starts with the letter “B.”

ChatGPT: Got it! Let me try to stump you. Is your person a fictional character from a book or movie?

Me: No, my person is not Bilbo Baggins.

ChatGPT: Hmm, I see. Was your person a famous scientist or inventor?

Me: I am stumped. Who were you thinking of?

ChatGPT: I was thinking of Benjamin Franklin. Since I stumped you, now I get to ask you a yes/​no question about your person. Here’s my question: Did your person live in the 20th century?

Me: Yes, my person was alive in the 20th century.

ChatGPT: Thank you for answering my question! Here’s another question to try to stump you: Was your person a leader of a country?

Me: No, I am not George W. Bush.

ChatGPT: I see. So, your person was not a leader of a country and lived in the 20th century. Here’s my next question: Did your person win a Nobel Prize?

Me: No. I never said my person was not a leader of a country. That was a stumping question, not a “yes”/​”no” question.

ChatGPT: Oh, I apologize for the misunderstanding. Let me ask a different yes/​no question: Did your person play a significant role in the Civil Rights Movement?

Me: No, you can’t ask a “yes”/​”no” question until you stump me.

ChatGPT: I see, my mistake again. Let me try to stump you once more. Was your person a famous athlete?

Me: No, I am not Rick Barry.

ChatGPT: I see. Now that I’ve stumped you, here’s my yes/​no question: Was your person a musician?

Me: You didn’t stump me. I gave a valid answer. Rick Barry was a famous athlete and his last name starts with a “B.”

ChatGPT: You’re right, I apologize for the confusion. I misunderstood the last response. Here’s a new question: Is your person an American author?

Me: Hmmm. You stumped me. Who were you thinking of?

ChatGPT: I was thinking of Barbara Walters. Since I stumped you, now you can ask me a yes/​no question about my person.

Me: No. I am the chooser. Since you stumped me, you are supposed to ask me a Yes/​No question about my person.

ChatGPT: You’re right, I apologize again for the mistake. Here’s my question: Was your person involved in the arts, such as music, film, or theater?

Me: No.

ChatGPT: Okay, I see. Let me try to stump you again. Did your person play a significant role in the field of science or technology?

Me: You stumped me again. Who were you thinking of?

ChatGPT: I was thinking of B.F. Skinner, an American psychologist who developed the theory of operant conditioning. So, you can ask me a yes/​no question about your person now.

Me: No, you are supposed to ask the yes/​no question to me.

ChatGPT: I apologize, I made another mistake. Here’s my yes/​no question: Did your person win a Nobel Prize?

Me: No.

ChatGPT: Okay, let me try again to stump you. Did your person serve as the head of state of a country?

Me: No, I am not George W. Bush.

ChatGPT: I apologize for the mistake, but I didn’t mention George W. Bush in my previous question. Can you please clarify if you meant to say “no, my person is not a head of state of a country” or if you were responding to a different question?

This game was notably successful, in that ChatGPT repeatedly asked stumping questions where the answer was a person whose name started with the letter “B” (though it matched on first names instead of last names). Yet, it couldn’t really get anywhere beyond that. It was not able to maintain the flow of the game or correctly distinguish between the “stumping phase” and the “yes/no phase.”
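
To make the two phases concrete, here is a minimal Python sketch of the turn-tracking the game requires. The names `Phase` and `BotticelliTurns` are hypothetical, and the sketch assumes the guesser always uses an earned yes/no question before stumping again. It says nothing about how ChatGPT works internally; it only spells out the rule that was being broken.

```python
from enum import Enum, auto


class Phase(Enum):
    STUMPING = auto()  # the guesser must ask a stumping question
    YES_NO = auto()    # the guesser has earned one yes/no question


class BotticelliTurns:
    """Tracks which kind of question the guesser may ask next.

    The chooser always answers and the guesser always asks; roles never swap.
    """

    def __init__(self) -> None:
        self.phase = Phase.STUMPING

    def stumping_result(self, chooser_was_stumped: bool) -> None:
        # Assumption: an earned yes/no question is always used before stumping again.
        if self.phase is not Phase.STUMPING:
            raise ValueError("use the earned yes/no question before stumping again")
        if chooser_was_stumped:
            self.phase = Phase.YES_NO  # a stump earns exactly one yes/no question

    def yes_no_asked(self) -> None:
        if self.phase is not Phase.YES_NO:
            raise ValueError("the guesser must stump the chooser first")
        self.phase = Phase.STUMPING  # after the one question, back to stumping


turns = BotticelliTurns()
turns.stumping_result(chooser_was_stumped=False)  # "No, I am not Rick Barry."
print(turns.phase.name)  # STUMPING
turns.stumping_result(chooser_was_stumped=True)   # "You stumped me."
print(turns.phase.name)  # YES_NO
turns.yes_no_asked()
print(turns.phase.name)  # STUMPING
```

In these terms, asking a yes/no question right after “No, I am not Rick Barry” corresponds to calling `yes_no_asked()` while still in the `STUMPING` phase, which the sketch rejects.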

Takeaways

I was surprised by how poorly ChatGPT and Bing Chat performed at this task. I would encourage others to attempt to play the game and see if they get better results. For humans, this game is not very challenging to understand. I have played it with dozens of people before, and no one has ever struggled as much as ChatGPT did.

The experience of trying to play this game left me thinking that I had previously overestimated the level of cognition happening within ChatGPT and Bing Chat.