My hot take:
Not too surprising to me, considering what GPT-3 could do. However there were some people (and some small probability mass remaining in myself) saying that even GPT-3 wasn’t doing any sort of reasoning, didn’t have any sort of substantial understanding of the world, etc. Well, this is another nail in the coffin of that idea, in my opinion. Whatever this architecture is doing on the inside, it seems to be pretty capable and general.
I don’t think this architecture will scale to AGI by itself. But the dramatic success of this architecture is evidence that there are other architectures, not too far away in search space, that exhibit similar computational efficiency and scales-with-more-compute properties, that are useful for more different kinds of tasks.