Mohammad Bavarian
Karma: 5
- [deleted]
Did you test Claude for it being less susceptible to this issue? Otherwise not sure where your comment actually comes from. Testing this, I saw similar or worse behavior by that model—albeit GPT4 also definitely has this issue
https://twitter.com/mobav0/status/1637349100772372480?s=20
What do you mean by Scaling Hypothesis? Do you believe extremely large transformer models trained based on autoregressive loss will have superhuman capabilities?