I’m not sure what you mean by “running on fewer resources”—it’s a 175 billion parameter model, so it has the same minimum hardware requirements as GPT-3.
Their inference setup likely uses the same hardware per-response, but I’d guess OpenAI has much faster kernels so Meta would need more hardware to serve a given traffic rate. However, that’s totally unrelated to the quality of each response.
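As a rough back-of-envelope sketch (my own numbers, not from the thread): just holding 175 billion parameters in memory takes hundreds of gigabytes before you count activations or KV cache, which is why the minimum hardware floor is the same no matter who is serving the model.

```python
# Back-of-envelope: memory needed just to hold a 175B-parameter model's
# weights, ignoring activations, KV cache, and optimizer state.
def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    return n_params * bytes_per_param / 1e9

params = 175e9
fp16_gb = weight_memory_gb(params, 2)   # 16-bit weights
fp32_gb = weight_memory_gb(params, 4)   # 32-bit weights

print(f"fp16: {fp16_gb:.0f} GB, fp32: {fp32_gb:.0f} GB")
# fp16 alone is ~350 GB: far more than any single GPU, so inference has
# to be sharded across several accelerators whoever runs it.
```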
BlenderBot is based on their largest “OPT” model, presumably fine-tuned somehow. OPT’s training run explicitly imitated GPT-3’s hyperparameters after Meta couldn’t work out better ones, and I’d add that it’s pretty badly under-trained even by pre-Chinchilla scaling laws.
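To make “under-trained” concrete, here is a quick check. Both figures are my own assumptions, not from the thread: OPT-175B reportedly saw roughly 180B training tokens, and the Chinchilla heuristic calls for about 20 tokens per parameter.

```python
# Compare OPT-175B's assumed training budget against the Chinchilla
# rule of thumb of ~20 tokens per parameter (both figures assumed here).
params = 175e9
tokens_seen = 180e9              # assumed: OPT's reported training tokens
chinchilla_tokens = 20 * params  # Chinchilla-style compute-optimal budget

print(f"tokens per parameter: {tokens_seen / params:.2f}")
print(f"Chinchilla-optimal tokens: {chinchilla_tokens / 1e12:.1f}T")
# ~1 token per parameter vs a recommended ~20: under-trained by more
# than an order of magnitude on this heuristic alone.
```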
(all views my own, you know the drill)
It seems you are right. I previously thought that BlenderBot was supposed to be usable outside of a research environment.
I read through the description, and a key difference seems to be that BlenderBot actually has long-term memory about the user it interacts with.
Except that it doesn’t, really! The interpretations it stores in “memory” (which is just a small notepad) are so out-of-sync with what went on earlier in the conversation that I think it may be doing worse with that memory than without it. As an anecdotal observation: I asked what it thought of Eliezer Yudkowsky’s take on AI ethics, and it responded by asking me if I knew that [insert first sentence of wiki page about Yudkowsky here], and stored the “memory” that I enjoyed Harry Potter (presumably because of Yud’s fanfic?). It then proceeded to tell me that God is real and I should believe in Him, which, assuming that Yudkowsky is not in fact God, had nothing to do with anything previously discussed.

It definitely is not usable outside of a research environment, which Meta should have known, but nonetheless their PR pitch has been promoting it as giving general consumers free access to a cutting-edge conversational chatbot (which it really, really isn’t).
That sounds like it performs worse precisely because it tries to use memory at all. That seems like an okay tradeoff if you want memory to eventually work well in the long term.
I expect that you are wrong about that and that there are people who have fun chatting with it. Even if it’s an MVP that’s not very powerful, there’s still a good chance that some people will find it usable.
To improve the product, it’s important to have people using it and providing more data, so it makes sense to announce it in a way that gets more people to use it.