That would also fit with it having a fairly short context window, and doing the translation in chunks. Basically I'm assuming it might be quite old Google tech upgraded, rather than them slotting in a modern LLM by default the way most people building this now would. Like it might be an upgraded T5 model or something. Google Translate has been around a while, and having worked at Google some years ago, I'd give you excellent odds there was a point where it was running on something BERT-style, similar to a T5 model — the question is whether they've started again from scratch since then, or just improved incrementally.
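To make the "short context window, translate in chunks" idea concrete, here's a minimal sketch using a public T5 checkpoint from Hugging Face. This is purely illustrative and not a claim about Google's internal stack — the model name, chunk size, and greedy sentence-packing are all assumptions:

```python
# Illustrative only: chunked translation with a public T5 checkpoint.
# This is NOT Google's setup -- just a sketch of how a short-context
# seq2seq model forces you to split text into pieces and stitch results.
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_NAME = "t5-small"       # stand-in; any T5-style checkpoint would do
MAX_SOURCE_TOKENS = 256       # assumed "short" context budget per chunk

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)

def chunk_sentences(text: str, max_tokens: int) -> list[str]:
    """Greedily pack sentences into chunks that fit the model's context."""
    chunks, current = [], ""
    for sentence in text.split(". "):
        candidate = f"{current}. {sentence}" if current else sentence
        if current and len(tokenizer.encode(candidate)) > max_tokens:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def translate(text: str) -> str:
    """Translate each chunk independently, then join the outputs."""
    outputs = []
    for chunk in chunk_sentences(text, MAX_SOURCE_TOKENS):
        # T5 uses a task prefix; English->German is one of its pretraining tasks.
        inputs = tokenizer("translate English to German: " + chunk,
                           return_tensors="pt")
        generated = model.generate(**inputs, max_new_tokens=MAX_SOURCE_TOKENS)
        outputs.append(tokenizer.decode(generated[0], skip_special_tokens=True))
    return " ".join(outputs)
```

The point of the sketch is the failure mode, not the code: because each chunk is translated independently, anything that spans a chunk boundary (pronoun references, consistent terminology, register) can drift — which is the kind of behaviour you'd expect from an older, shorter-context pipeline rather than a modern long-context LLM.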