I consider GPT to have falsified the “all humans are extremely close together on the relevant axis” hypothesis. Vanilla GPT-3 was already sort of like a dumb human (and like a smart human, sometimes). If it were a 1000x greater step from nothing to chimp than from chimp to Einstein, then ChatGPT should, for all intents and purposes, have at least average human-level intelligence. Yet it does not, at all; this quote from jbash’s post puts it well:
“You can take it step by step through a chain of simple inferences, and still have it give an obviously wrong, pattern-matched answer at the end.”
Maybe the scale is true in some absolute sense—you can make a lot of excuses, like maybe GPT is based entirely on “log files” rather than thoughts or whatever. Maybe 90%+ of people who criticized the scale before did so for bad reasons. That’s all fine. But it doesn’t change the fact that the scale isn’t a useful model; in terms of performance, the step from chimp to Einstein is, in fact, hard.