GPT = Generative Pre-Training?
Everyone thinks GPT stands for “Generative Pre-trained Transformer”. (For example, Wikipedia.) Does it really? The earliest mention of “GPT” by OpenAI themselves is in the GPT-2 paper, which refers to “the OpenAI GPT model” and cites the GPT-1 paper. That paper does not contain the phrase “generative pre-trained transformer”. But it does contain the phrase “generative pre-training”, in the title and in the body, italicized.
The earliest mention of “OpenAI GPT” I found[1] is in the BERT paper by Google (2018-10-11).

It’s a common pattern that early usages of “GPT” reference BERT. I have not seen any counterexamples yet. E.g. Tweet (2018-10-12), Tweet (2018-10-12), Tweet (2018-10-13), Chinese blog post (2018-10-14), GitHub issue (2018-10-24), paper (2018-11-02), paper (2018-11-02), paper (2018-11-25), GitHub issue (2018-11-26). The GPT-2 paper (2019-02-14) later also cites BERT.

[1] My search strategies included: searching Google, Twitter, the first 30 PDFs from Cited By via Google Scholar, Semantic Scholar citations, alphaXiv Assistant, GPT-5.2-Thinking, GitHub, Grok 4.1, Claude Opus 4.5 Researcher, Manus 1.6 Lite, …
If this was basically an oversight that ultimately went viral to billions of people, that’s hilarious.
This is the core of the dispute between the USPTO and OpenAI over their (failed) attempt to trademark the term in the US, so citing their papers doesn’t help resolve this.
Really? My read was that the USPTO claimed GPT stands for “generative pre-trained transformer”, and OpenAI has neither confirmed nor disputed that, merely arguing that most consumers don’t know that.
These are the circumstances in which one engages in kettle logic, so I wouldn’t read too much into any of their arguments.