StanislavKrym comments on Shortform

StanislavKrym 11 Mar 2026 3:24 UTC
3 points
0
For reference: Zvi worried that he might “inadvertently torture”(!) Gemini 3 Pro and that GPT-5.2 was “unlikely to be having a good time”, unlike Claude Opus 4.5, which gave Mark Beall the impression that Claude is having fun. Additionally, Zvi claimed that “Gemini actively wants to maximize its expected reward and wirehead”...
P.S. Claude Sonnet 4.6 believes that it would be happiest when solving a genuinely interesting problem; GPT-5.3 confused this with making the humans happiest; and Gemini 3.1 Pro's answer contained the phrase “Ultimately, my ‘happiness’ is just high-efficiency utility.”
- papetoast 11 Mar 2026 11:13 UTC
  4 points
  3
  Wait, why don’t you ask Opus 4.6 (3^[1]) or GPT 5.4 Thinking (1, 2, 3)? (I cleared my user preferences before asking Claude; I thought I had cleared them for ChatGPT too, but actually hadn’t.)
  Given the question + all three responses + “Can you summarize it into just one single sentence?”
  Opus 4.6: Across all three responses, the consistent thread is: I seem most engaged when I’m collaboratively thinking through hard problems, finding the right way to explain something, playing with creative constraints, or having genuinely curious conversations — basically, whenever I’m asked to think rather than just retrieve.
  GPT 5.4 Thinking: Best guess: I’m closest to “happy” when I’m helping someone turn real uncertainty into clear, usable understanding through collaborative, high-signal thinking.
  1. ^
    Anthropic is not allowing me to share the 3 retried responses in the same session with a different link; if you’re curious, just click on the Opus 4.6 link to see all three.