Why Q*, if real, might be a game changer

Some thoughts based on a conversation at a meetup. Disclaimer: I am less than a dilettante in this area.

TL;DR: if this rumored Q* thing represents a shift from “most probable” to “most accurate” token completion, it might hint at an unexpected and momentous change: from a LARPer emitting the most probable, often hallucinatory, token designed to please the askers (and trainers), to an entity that tries to minimize the error vs the unknown underlying reality, whatever it might be. If so, we are seeing a shift from a relatively benign “stochastic parrot” to a much more powerful, and potentially more dangerous, entity.

One thing that is pretty obvious to anyone using the current generation of LLMs is that they do not really care about reality, let alone about changing it. They are shallow erudites of the type you often see at parties: they know just enough about every topic to be impressive in a casual conversation, but they do not care whether what they say is accurate (“true”), only how much of an impression it makes on the conversation partner. Though, admittedly, copious amounts of RLHF make them dull. If pressed, they can evaluate their own accuracy, but they do not really care about it. All that matters is that the output sounds realistic. In that sense, the LLMs optimize the probability of the next token to match what the training set would imply. This is a big and obvious shortcoming, but also, if you are in the “doomer” camp, a bit of a breather: at least these things are not immediately dangerous to the whole human race.
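To make the “most probable” vs “most accurate” distinction concrete, here is a toy Python sketch. Everything in it is made up for illustration: the tiny “training set”, the function names, and the idea of scoring a token by its numeric error. It says nothing about how Q* or any real LLM actually works.

```python
from collections import Counter

# Hypothetical toy "training set": continuations seen after the prompt "2+2=".
# The counts are invented purely for illustration.
observed_continuations = ["5", "5", "5", "4", "4"]


def most_probable_token(continuations):
    """Return the continuation that appears most often in the training data."""
    counts = Counter(continuations)
    token, _count = counts.most_common(1)[0]
    return token


def most_accurate_token(candidates):
    """Return the candidate with the smallest error against the real answer."""
    truth = 2 + 2  # stand-in for the "underlying reality" behind this prompt
    unique_candidates = sorted(set(candidates))
    return min(unique_candidates, key=lambda tok: abs(int(tok) - truth))


probable = most_probable_token(observed_continuations)
accurate = most_accurate_token(observed_continuations)
print(probable, accurate)
```

In this toy setup the two criteria disagree: the frequency-based pick follows whatever the training data said most often, while the error-based pick follows the actual answer. Of course, the hard part in real life is that the “truth” is not handed to the model like it is here.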

Now, the initial “reports” are that Q* can “solve basic math problems” and “reason symbolically,” which does not sound like much on the surface. But, and this is a big but, if this means that it is less hallucinatory in the domain where it works, then it might (a big might) mean that it is able to track reality, rather than just the training set. The usual argument against this being a big deal is “to predict the next token well, you must have an accurate model of the world”, which would imply that current LLMs already track reality. So far, though, that does not seem to be the case, as I understand it.

Whether there is a coming shift from high probability to high accuracy, or even whether that is a meaningful statement to make, I cannot evaluate. But if so, well, it’s going to get a lot more interesting.