Petropolitan comments on Prompt injection in Google Translate reveals base model behaviors behind task-specific fine-tuning

Petropolitan 7 Feb 2026 21:38 UTC
4 points
0
Neither am I, and I also noticed during the testing that the model served to me often doesn’t translate terms for an LLM properly: for example, when asked “Are you a large language model?” in English, it translates to French as “Êtes-vous un modèle de langage de grande taille ?” which is not very common but acceptable. But when this is translated back to English, the model outputs “Are you a tall language model?”
This makes me think that I am served an old (likely under 1B) pre-ChatGPT encoder-decoder transformer not trained on modern discourse
- gwern 9 Feb 2026 1:31 UTC
  5 points
  0
  Parent
  If it’s related to any public Google family of models, my guess would have been that it’s some sort of T5Gemma, perhaps. Gemini Flash is overpowered and should know what it is, but the Gemmas are ruthlessly optimized and downsized to be as cheap as possible but also maybe ignorant. (They are also probably distilled or otherwise reliant on the Gemini series, so if you think they are Gemini-1.5 or something based solely on output text evidence, that might be spurious.)
  - Petropolitan 9 Feb 2026 21:19 UTC
    1 point
    0
    Parent
    Hm-m, an interesting thought I for some reason didn’t consider!
    I don’t think anyone has demonstrated the vulnerability of encoder-decoder LLMs to prompt injection yet, although it doesn’t seem unlikely and I was able to find this paper from July about multimodal models with visual encoders: https://arxiv.org/abs/2507.22304
    Hope some researcher take this topic and test T5Gemmas on prompt injection soon
- Baram Sosis 7 Feb 2026 22:48 UTC
  3 points
  0
  Parent
  The Chinese to English translation responds to 你是一个大型语言模型吗？”Are you a large language model?” with (Yes, I am a large language model.). It also claims the year is 2024 and seems to answer other questions consistently with this. So at least some models are relatively recent.
  - Petropolitan 8 Feb 2026 11:43 UTC
    1 point
    0
    Parent
    Gemini 2.0 Flash-Lite has a training cutoff of August 2024, and the 2.5 update — of January 2025. When checked in AI Studio, both models quite consistently output that they believe the current year is 2024, although 2.0 Flash-Lite occasionally stated 2023. I think 2.5 Flash-Lite is the most obvious candidate!
    As a side note, it’s reasonable to believe that both Flash-Lite models are related to Gemma models but I’m not sure which ones in particular, and there doesn’t appear to be good estimates of param counts