If it’s related to any public Google family of models, my guess would have been that it’s some sort of T5Gemma, perhaps. Gemini Flash is overpowered and should know what it is, but the Gemmas are ruthlessly optimized and downsized to be as cheap as possible but also maybe ignorant. (They are also probably distilled or otherwise reliant on the Gemini series, so if you think they are Gemini-1.5 or something based solely on output text evidence, that might be spurious.)
Hmm, an interesting thought that I for some reason didn’t consider!
I don’t think anyone has demonstrated that encoder-decoder LLMs are vulnerable to prompt injection yet, although it doesn’t seem unlikely. I was able to find this paper from July about multimodal models with visual encoders: https://arxiv.org/abs/2507.22304
Hope some researcher takes up this topic and tests the T5Gemmas on prompt injection soon.