Richard Pickering

Karma: −2

A letter to the Editor:

Richard Pickering 19 Mar 2026 23:57 UTC

−1 points

Richard Pickering 18 Mar 2026 21:52 UTC
1 point
on: Gemma Needs Help
Having never been exposed to Gemma before, I found this piece extremely interesting, thank you. Would it be fair to summarize (and oversimplify) what you’ve demonstrated as follows: Gemma 27B’s instruct post-training accidentally reinforced a self-critical affect spiral that appears when the model is repeatedly rejected, while many other instruct models have guardrails and tuning that steer them toward a more consistently calm approach of retrying or asking for clarification? If so, it would follow that any similar instruct model could expose safety risks when pushed into a self-critical spiral, resulting in degraded reasoning, worsening output quality, and loss of task focus. An interesting safety / alignment vector to be mindful of.