GPT-3, belief, and consistency

I’ve seen a few people debating what GPT-3 understands, and how this compares to human understanding. I think there’s an easier and more fruitful question to consider: what does it believe?

It seems like it doesn’t believe anything, or alternatively, that it believes everything. The question is a category error, like asking what a library believes, or what the Internet believes. But let’s go with that metaphor for a bit, because it seems interesting to think about.

For a library, contradictions don’t matter. A library can contain two books by different authors saying opposite things, and that’s okay since they are just being stored. Maybe it’s better to think of GPT-3 as a large, interestingly organized memory rather than as an agent? But like human memory, it’s lossy, and it can mix up stuff from different sources, sometimes in creative ways.

How does GPT-3 resolve inconsistency? If the Internet is very consistent about something, like the words to Jabberwocky, then GPT-3 will be consistent as well. If there were two versions of Jabberwocky that started the same, diverged at a certain point, and were equally popular in the corpus, then it would probably choose between them at random, assuming you have randomization (a nonzero sampling temperature) turned on at all.
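To make the “choose between them at random” point concrete, here is a toy Python sketch of sampling from a next-line distribution. The two continuations and their probabilities are invented for illustration; this is not GPT-3’s actual vocabulary-level sampling, just the same idea in miniature.

```python
import random

# Invented toy distribution: two rival continuations of the poem
# that are (by assumption) equally common in the corpus.
next_line_probs = {
    '"Beware the Jabberwock, my son!': 0.5,    # the familiar version
    '"Beware the Jabberwock, my child!': 0.5,  # a hypothetical rival version
}

def sample(probs, temperature=1.0):
    """Pick a continuation; temperature 0 means greedy (deterministic) decoding."""
    if temperature == 0:
        return max(probs, key=probs.get)  # always the most probable option
    # Raising probabilities to 1/temperature is equivalent to rescaling logits.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs), weights=weights)[0]

print(sample(next_line_probs))                   # varies from run to run
print(sample(next_line_probs, temperature=0.0))  # same answer every time
```

With two equally weighted options and randomization on, each run is a coin flip; with temperature 0 you always get the same line, but which one is essentially arbitrary.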

Sometimes, GPT-3 can choose between beliefs based on style. Suppose that grade-school science material is written in one style and flat-earth rants are written in a different style. It wouldn’t be surprising if GPT-3 appeared to have different beliefs about the shape of the earth depending on which style of work it’s completing. Or, if it can recognize an author’s style, it might seem to have different beliefs depending on which author it’s pretending to be.

If GPT-3 can play chess, it’s due to online consensus about how to play chess. If we had two different chess-like games using similar notation then it might get them confused, unless the context could be used to distinguish them.

If base-10 and base-8 arithmetic were equally common in the corpus, then I don’t think it could do arithmetic very well either, though again, maybe it could distinguish them from context. Without any context to go on, it would just guess randomly.
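As a concrete illustration of the ambiguity (this says nothing about GPT-3’s internals), the same digit string names different numbers depending on which base you assume:

```python
# "34 + 27" is ambiguous until you know the base; Python's int(s, base)
# makes the choice explicit.
a, b = "34", "27"

print(int(a, 10) + int(b, 10))     # 61, reading the digits as base 10
print(oct(int(a, 8) + int(b, 8)))  # 0o63, reading the same digits as base 8
```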

Of course, contradictions are everywhere. We compartmentalize. None of us are logic robots that halt when we find a contradiction. However, contradictions often bother us, and we try to iron them out. Wikipedia contributors try to resolve inconsistencies through research, through debate, or by giving up, declaring that there’s no consensus, and documenting the controversy.

If you consider a search engine and Wikipedia together as an algorithm for answering questions, you wouldn’t expect it to resolve inconsistency by returning one version of an article 40% of the time and the other version 60% of the time, or by serving up different versions of the same article based on stylistic differences in how you ask the question. You might still have to resolve the inconsistency yourself, but with static documents that have distinct titles and URLs, it’s easier to see what you have.

GPT-3’s ways of resolving inconsistency happen to work pretty well for some kinds of art and entertainment, but they’re not what we expect of a factual reference, or even of a consistent fictional world.

This suggests some possible areas of research. What are smarter ways to resolve inconsistency, and how can we get machine learning systems to use them? Is there some way to use machine learning to notice inconsistencies in Wikipedia?
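On that last question, here is one hedged sketch: run an off-the-shelf natural language inference (NLI) model over pairs of claims drawn from related articles and flag the pairs it labels as contradictions. The checkpoint name, its label set, and the example sentences below are assumptions about one particular public model (roberta-large-mnli), not anything from this post.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed public MNLI checkpoint; its labels are CONTRADICTION / NEUTRAL / ENTAILMENT.
MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def relation(premise: str, hypothesis: str) -> str:
    """Classify a sentence pair and return the model's label for it."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(dim=-1).item()]

# Hypothetical sentence pair standing in for claims scraped from two articles.
print(relation("The bridge opened to traffic in 1937.",
               "The bridge did not open until 1952."))
```

At best this surfaces candidate pairs; deciding which flagged pairs are real inconsistencies would still take the kind of research and debate described above.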

In the meantime, I would guess that for factual use, we will need to resolve inconsistencies ourselves and feed our machines a relatively consistent corpus. Feeding Wikipedia articles to the machine means that the most glaring inconsistencies have been ironed out in advance, which is why GPT-3 can sometimes answer factual questions correctly.

But if your interest is in fiction or in making interesting forgeries, maybe you don’t care about this?