The thread that comment came from was contentious; I got a lot of pushback here and elsewhere during the early GPT days for my opinion that transformers would be able to output interesting math.
Two years later, when 3.5 was out, I felt that my ‘interesting’ threshold had been crossed and that I had been technically correct, but I was still hearing the same arguments. I’m happy that, six years on, we have proof that my assessment of the potential of transformers was close to accurate, an assessment which, to be clear, was absolutely viewed at the time as ‘evidence that this person is crazy in a way that makes me want to avoid him’.
From a meta perspective, this post is probably not helping me appear sane.
It shows that you were right qualitatively but wrong about the numbers.