Wow. I knew academics were behind / out of the loop / etc., but this surprised me. I imagine these researchers had at least heard about GPT-2 and GPT-3 and the scaling-laws papers; I wonder what they thought of them at the time. I wonder what they think now about what they thought at the time.
The full article sort of explains the bizarre Kafkaesque academic dance that went on from 2020 to 2022, and how the field talked about these changes.
For anyone not wanting to go in and see the Kafka for themselves, I copied some useful excerpts:
ANNA ROGERS: I was considering making yet another benchmark, but I stopped seeing the point of it. Let’s say GPT-3 either can or cannot continue [generating] these streams of characters. This tells me something about GPT-3, but that’s not actually even a machine learning research question. It’s product testing for free.
JULIAN MICHAEL: There was this term, “API science,” that people would use to be like: “We’re doing science on a product? This isn’t science, it’s not reproducible.” And other people were like: “Look, we need to be on the frontier. This is what’s there.”
TAL LINZEN (associate professor of linguistics and data science, New York University; research scientist, Google): For a while people in academia weren’t really sure what to do.
R. THOMAS MCCOY: Are you pro- or anti-LLM? That was in the water very, very much at this time.
JULIE KALLINI (second-year computer science Ph.D. student, Stanford University): As a young researcher, I definitely sensed that there were sides. At the time, I was an undergraduate at Princeton University. I remember distinctly that different people I looked up to — my Princeton research adviser [Christiane Fellbaum] versus professors at other universities — were on different sides. I didn’t know what side to be on.
LIAM DUGAN: You got to see the breakdown of the whole field — the sides coalescing. The linguistic side was not very trusting of raw LLM technology. There’s a side that’s sort of in the middle. And then there’s a completely crazy side that really believed that scaling was going to get us to general intelligence. At the time, I just brushed them off. And then ChatGPT comes out.
+1. GPT-3.5 had been publicly available since January, and GPT-3 was big news two years before and publicly available back then. I’m really surprised that people didn’t understand that these models were a big deal AND then changed their minds when ChatGPT came out. Maybe it’s just a weird preference cascade, where this was enough to break a common false belief?
Something like: “GPT-3.5/ChatGPT was qualitatively different.”
I remember seeing the ChatGPT announcement and not being particularly impressed or excited, like “okay, it’s a refined version of InstructGPT from almost a year ago. It’s cool that there’s a web UI now; maybe I’ll try it out soon.” November 2022 was a technological advance, but not a huge shift compared to January 2022, IMO.
Fair enough. My mental image of the GPT models was stuck on that infernal “talking unicorns” prompt, which I think did make them seem reasonably characterized as mere “stochastic parrots” and “glorified autocompletes,” and the obvious bullshit about the “safety and security concerns” around releasing GPT-2 also led me to conclude the tech was unlikely to amount to much more. InstructGPT wasn’t good enough to get me to update that picture; that took the much-hyped ChatGPT release.
Was there a particular moment that impressed you, or did you just see the Transformer paper, project it correctly into the future, and find the releases that followed unremarkable because they merely tracked the trend you had extrapolated?
I remember being very impressed by GPT-2. I think I was also quite impressed by GPT-3 even though it was basically just “GPT-2 but better.” To be fair, at the moment that I was feeling unimpressed by ChatGPT, I don’t think I had actually used it yet. It did turn out to be much more useful to me than the GPT-3 API, which I tried out but didn’t find that many uses for.
It’s hard to remember exactly how impressed I was with ChatGPT after using it for a while. I think I hadn’t fully realized how great it could be when the friction of using the API was removed, even if I didn’t update that much on the technical advancement.
The full article discusses the Transformer paper (which didn’t have a large influence, as the implications weren’t clear), BERT (which did have a large influence), and GPT-3 (which also had a large influence). I assume the release of ChatGPT was the point where even the last NLP researchers couldn’t ignore LLMs anymore.
ChatGPT was “so good they can’t ignore you”; the Hugging Face anecdote is particularly telling. At some point, everyone else gets tired of waiting for your cargo to land, and will fire you if you don’t get with the program. “You say semantics can never be learned from syntax and you’ve proven that ChatGPT can never be useful? It seems plenty useful to me and everyone else. Figure it out or we’ll find someone who can.”