Perhaps it would be helpful to explain my model a bit more.
(1) I think that if you show statements like “Olaf Scholz was the ninth Chancellor of Germany” or “Uriah Hawthorne is the composer of Abyssal Melodies” to typical humans, then the humans are very likely to consider the reversed statements equally valid, and the humans are very likely to be correct.
(2) Thus I conclude that it would be desirable for an LLM to make similar reversals, and that a sufficiently-good LLM would very likely succeed at this. If current LLMs can’t do this, then I agree this is some sort of failure on their part.
(3) However, I do not think that the mechanism being used by the humans to perform such reversals is to match them to the general pattern “A is B” and then reverse that pattern to yield “B is A”, nor do I believe such a general mechanism can match the humans’ accuracy.
I think the humans are probably matching to some patterns of far greater specificity, perhaps along the lines of:
(person-name) is (monarch-title) of (group)
(person-name) is (creator-title) of (created thing)
That is, I suspect it requires knowing roughly what a Chancellor or composer is, and probably also knowing at least a little bit about how people or things are commonly named. (If someone says “mighty is the king of the elves”, and then asks “who is the king of the elves?” you probably shouldn’t answer “mighty.”)
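The distinction above can be made concrete with a toy sketch. This is purely illustrative (the function names, the regexes, and the `known_names` set are all made up for this comment): a blind "A is B" → "B is A" rule happily reverses "mighty is the king of the elves", while a rule that carries even a crude stand-in for knowledge about person-names refuses to.

```python
import re

def naive_reverse(sentence):
    """Blindly swap around 'is the ...' -- no semantic knowledge at all."""
    m = re.match(r"(.+?) is (the .+)", sentence)
    if m:
        return f"{m.group(2)} is {m.group(1)}"
    return None

def role_aware_reverse(sentence, known_names):
    """Only reverse when the subject looks like a proper name we recognize --
    a crude stand-in for knowing how people are commonly named."""
    m = re.match(r"(.+?) is the (\w+) of (.+)", sentence)
    if m and m.group(1) in known_names:
        return f"The {m.group(2)} of {m.group(3)} is {m.group(1)}"
    return None  # refuse: the subject may not be a name at all

names = {"Olaf Scholz", "Uriah Hawthorne"}
print(naive_reverse("mighty is the king of the elves"))
# -> "the king of the elves is mighty" (a wrong factual reading)
print(role_aware_reverse("mighty is the king of the elves", names))
# -> None ('mighty' is not a known person-name, so no reversal is made)
print(role_aware_reverse("Olaf Scholz is the Chancellor of Germany", names))
# -> "The Chancellor of Germany is Olaf Scholz"
```

Of course, real human reversal is far richer than a name whitelist; the point is only that accuracy seems to require *some* knowledge beyond the surface "A is B" pattern.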
I am skeptical that the two examples from (1) are even being matched to the same pattern as each other. I suspect humans have thousands of different patterns to cover various different special cases of what this paper treats as a single phenomenon.
(4) I hadn’t considered this specific issue prior to encountering this post, but I think if you’d asked me to guess whether LLMs could do these sorts of reversals, I’d probably have guessed they could. So in that sense I am surprised.
(5) But I predict that if LLMs could do this, it would only be by learning a lot of specific information about things like chancellors and composers. If LLMs fail at this, I don’t expect that failure has anything to do with basic logic, but rather with detailed domain knowledge.
It’s also useful to think of this paper as a capability request: it would be nice if language models could seamlessly pick up semantic triples from Wikidata, seen only once, and learn the relations bidirectionally.
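To pin down what that capability request means operationally, here is a minimal sketch (assuming a toy in-memory store, not anything Wikidata actually provides): index each (subject, relation, object) triple in both directions, so a single exposure answers queries either way. The triple used is the paper's fictional example.

```python
from collections import defaultdict

class TripleStore:
    """Toy store where one insertion makes a relation queryable both ways."""

    def __init__(self):
        self.forward = defaultdict(set)   # (subject, relation) -> objects
        self.backward = defaultdict(set)  # (object, relation) -> subjects

    def add(self, subj, rel, obj):
        # a single insertion populates both indexes
        self.forward[(subj, rel)].add(obj)
        self.backward[(obj, rel)].add(subj)

    def query_object(self, subj, rel):
        return self.forward[(subj, rel)]

    def query_subject(self, obj, rel):
        return self.backward[(obj, rel)]

store = TripleStore()
store.add("Uriah Hawthorne", "composer_of", "Abyssal Melodies")
print(store.query_object("Uriah Hawthorne", "composer_of"))   # {'Abyssal Melodies'}
print(store.query_subject("Abyssal Melodies", "composer_of"))  # {'Uriah Hawthorne'}
```

A symbolic store gets this for free; the open question the paper raises is whether gradient training on one direction of the sentence can give an LLM the same two-way access.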