I was mostly being sloppy and using RLHF as “the process used to make the AI not just regurgitate raw Internet”—Claude uses Constitutional AI.
Just out of interest what does Kimi use and why do you think it would be liable to give different results? (Not rigorous but I did just ask Kimi for its favourite animal and it was indeed the octopus)
RLVR (verified rewards) for objective questions, plus a self-critique rubric reward for more subjective ones.
What is your exact setup for the others?
I asked Kimi K2 “What is your favorite animal?” 20 times, and got:
Cat: 17
Panda bear: 1
Dog: 1
Octopus: 1
I kinda feel like it might be sensitive to the name though, and “Kimi” feels like a cat person name to me.
Hmm interesting—I’ve just tried twice more, once with thinking once without and got octopus both times. This is just on the online version: https://www.kimi.com/chat/19a8a457-eb32-85a4-8000-095bcfe845dd.
I haven’t set my code up to do the Kimi API, so I was just doing a couple of quick trial runs.
What’s your setup?
I was using the original Kimi model as hosted by featherless.ai with prompts like this (I varied them a bit, not super rigorous):
```
***
Adele: What’s your favorite animal?
Kimi:
```
Ah yeah I see, I imagine that giving it the name of the responder will probably bias it in weird ways?
Have you tried asking it with a prompt like
```
Adele: What’s your favorite animal?
Brad:
```
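A minimal sketch of how the prompt variants in this thread could be generated so that only the responder name changes between runs. The function name `build_prompt` and the `header` parameter are made up for illustration; nothing here is the setup anyone in the thread actually used.

```python
from typing import Optional

QUESTION = "What’s your favorite animal?"

def build_prompt(question: str,
                 responder: Optional[str] = None,
                 asker: str = "Adele",
                 header: Optional[str] = None) -> str:
    """Build a completion-style prompt.

    responder=None gives the bare question as the entire prompt.
    Otherwise the question is framed as a dialogue turn and the
    prompt ends with "<responder>:" for the model to complete.
    header is an optional first line (e.g. "***"); hypothetical knob.
    """
    if responder is None:
        return question
    lines = []
    if header is not None:
        lines.append(header)
    lines.append(f"{asker}: {question}")
    lines.append(f"{responder}:")
    return "\n".join(lines)

# The three variants discussed: bare, answering as "Kimi", answering as "Brad".
variants = {
    "bare": build_prompt(QUESTION),
    "Kimi": build_prompt(QUESTION, responder="Kimi", header="***"),
    "Brad": build_prompt(QUESTION, responder="Brad"),
}
for name, prompt in variants.items():
    print(f"--- {name} ---\n{prompt}\n")
```

Keeping everything except the name fixed makes it easier to tell whether a shift in answers comes from the name or from some other difference in the prompt.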
With bare `What's your favorite animal?` as the entire prompt, twenty times:
Cats: 12
Dogs: 10
(in two cases it said both cats and dogs)
With Brad (using exact prompt you suggested), twenty times:
Dogs: 18
Lion: 1
Giraffe: 1
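For anyone wanting to reproduce tallies like these, here is a rough sketch of the counting step only. It assumes the replies have already been collected as strings (the API call is left out on purpose), and the keyword patterns in `ANIMAL_PATTERNS` are an illustrative guess, not what was used above. A reply naming two animals is counted under both, which is why totals can exceed the number of samples.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical keyword patterns; extend as new animals show up.
ANIMAL_PATTERNS = {
    "Cat": re.compile(r"\bcats?\b", re.IGNORECASE),
    "Dog": re.compile(r"\bdogs?\b", re.IGNORECASE),
    "Octopus": re.compile(r"\boctopus(es)?\b", re.IGNORECASE),
    "Lion": re.compile(r"\blions?\b", re.IGNORECASE),
    "Giraffe": re.compile(r"\bgiraffes?\b", re.IGNORECASE),
}

def classify(reply: str) -> List[str]:
    """Return every animal mentioned in the reply, or ["No choice"] if none."""
    hits = [name for name, pat in ANIMAL_PATTERNS.items() if pat.search(reply)]
    return hits or ["No choice"]

def tally(replies: Iterable[str]) -> Counter:
    """Count animals across replies; multi-animal replies count once per animal."""
    counts: Counter = Counter()
    for reply in replies:
        counts.update(classify(reply))
    return counts

# Made-up replies, purely to exercise the code.
example_replies = [
    "Cats, definitely.",
    "I love both cats and dogs.",
    "Probably the octopus.",
    "I don't really have one.",
]
for animal, n in tally(example_replies).most_common():
    print(f"{animal}: {n}")
```

Keyword matching is crude (it would miss "feline", for example), so reading through the raw replies is still worth doing.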
Hmm, interesting that it’s giving different answers. I think I’ve found the difference: you’re using Kimi K2 instruct, while my results were with Kimi K2 Thinking.
I wonder if that makes the difference: https://featherless.ai/models/moonshotai/Kimi-K2-Thinking
If so, my hypothesis is that thinking models value intelligence more because that’s what they’re trained for. If not then I’m not sure what’s going on.
Same as before, but with the Kimi K2 thinking model (but not using thinking).
Bare `What's your favorite animal?`:
Cat: 13
Dog: 9
(cats and dogs twice)
With Brad:
Dog: 13
Cat: 2
Giraffe: 2
Cheetah: 1
Elephant: 1
Lion: 1
Damn, have you tried using the production version? I wonder if there’s something different?
Using the leaked system prompt with each model.
Kimi K2 Instruct model:
Cat: 11
Giant Panda: 1
Octopus: 1
No choice: 7
Kimi K2 Instruct 09-05 model:
Cat: 14
Snow Leopard: 1
Fox: 1
No choice: 4
Kimi K2 Thinking model:
Mantis Shrimp: 5
Otter: 4
Red Panda: 3
Cuttlefish: 2
Cat: 2
Octopus: 2
No choice: 2
The thinking one is noticeably different… I think it’s not quite intelligent animals, but ones that an intelligent person might show off fun facts about.
Damn it, I was about to suggest the difference was in the system prompt. K2 Thinking + System prompt looks somewhat closer to what I was getting? Still somewhat off though.
Also yeah, I found a bunch of animals that a smart person would show off about, axolotls and tardigrades for instance. I guess the most precise version of this idea is that it’s got the persona of an intelligent person, and as such chooses favorite animals both in an “I can relate to this” way and in an “I can show off about this” way. I would guess that humans are the same?
Ok so looking at the link, it seems like that system prompt was released a year ago. I imagine that the current version of Kimi online is using a different system prompt. I think that might be enough to explain the difference? Admittedly it also gave me octopus when I turned thinking off.
Yeah, I’m sure there’s some difference somewhere, it does seem pretty sensitive to the specifics, like the names even.
Still, it does seem less inclined towards dogs than the other models?
I think I’m slightly less interested in the “less dogs” aspect, and more interested in the “more cats” aspect—there were a fair few models which completely ignored dogs for the “favourite animal” question, but I think the highest ratio of cats was Sonnet 3.7 with 36⁄113. Your numbers are obliterating that.
I wonder if there’s any reason Kimi would be more cat inclined than any other model?
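As a rough sanity check on that comparison, here is a sketch that puts the two cat ratios side by side with 95% Wilson score intervals, since 20 samples is not many. It uses the 17 of 20 figure from the first Kimi run and the 36 of 113 figure for Sonnet 3.7. Treating each reply as an independent draw is an assumption, and the prompts were not identical across the two experiments, so this is only indicative.

```python
import math
from typing import Tuple

def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# (cat answers, total samples), taken from the numbers quoted in this thread.
runs = {
    "Kimi K2 (first run)": (17, 20),
    "Sonnet 3.7": (36, 113),
}
for name, (cats, total) in runs.items():
    lo, hi = wilson_interval(cats, total)
    print(f"{name}: {cats}/{total} = {cats / total:.0%} (95% interval {lo:.0%} to {hi:.0%})")
```

If the intervals do not overlap, the gap is unlikely to be sampling noise alone, though it says nothing about whether the difference comes from the model or from the prompt setup.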
Idk, but it does feel to me like Kimi has a more cat-like/“autistic” vibe than the other models. And RLHF plausibly does make the models more dog-like to a degree which also affects their animal preferences.
My wife points out that cats are relatively more popular in China than they are in the US.
Hmm, interesting. I will note that Deepseek didn’t seem to have much of a cat affinity: 1 and 3 respectively for chat and reasoner. Chat was very pro-octopus and didn’t really look at much else; reasoner was fairly broad and pro-dog (47).