Several people have argued that Sydney/Bing Chat performs better on reasoning tasks than ChatGPT/GPT-3.5, apart from its questionable dialogue fine-tuning. It may therefore be GPT-4. Have you looked into this? How does it affect your analysis?
It seems to me that Sydney is not the big leap you predict for GPT-4. Then again, Sydney may use a smaller model (like Curie or Babbage for GPT-3) to save on inference cost, while you seem to be talking only about the largest davinci model.
I’ve seen some of the screenshots of Bing Chat. It seems impressive and possibly more capable than ChatGPT but I’m not sure. Here’s what Microsoft has said about Bing Chat:
“We’re excited to announce the new Bing is running on a new, next-generation OpenAI large language model that is more powerful than ChatGPT and customized specifically for search. It takes key learnings and advancements from ChatGPT and GPT-3.5 – and it is even faster, more accurate and more capable.”
If the model is more powerful than GPT-3.5 then maybe it’s GPT-4, but “more powerful” is too vague a phrase to draw any clear conclusions from. I don’t think I have enough information at this point to make strong claims about it, so I think we’ll have to wait and see.