As Nanda also points out, the reversal curse only holds for out-of-context reasoning: in-context, LLMs have no problem with it and can answer perfectly easily. So it is a false analogy, because he’s eliding the distinction between in-context reasoning and out-of-context recall (what was stored by training). Humans do not do what he claims they do: “instantly update their world-model such that it’d be obvious to them that B is A”. At least, not in terms of permanent learning, as opposed to in-context reasoning.
For example, I can tell you that Tom Cruise’s mother is named ‘Mary Lee Pfeiffer’ (thanks to that post) but I cannot tell you who ‘Mary Lee Pfeiffer’ is out of the blue, any more than I can sing the alphabet song backwards spontaneously and fluently. But, like an LLM, I can easily do both once I have read your comment and the string “if you prompt it with who Tom Cruise’s mom is, it can then answer who Mary Lee Pfeiffer’s son is” is in my context (working/short-term memory). I expect, however, that despite my ability to do so as I write this comment, if you ask me again in a month ‘who is Mary Lee Pfeiffer?’, I will stare blankly at you and guess ‘...a character on Desperate Housewives, maybe?’
It will take several repetitions, even optimally spaced, before I have a good chance of answering ‘ah yes, she’s Tom Cruise’s mother’ without any context, because I do not ‘instantly update my world-model such that it’d be obvious to me that [Mary Lee Pfeiffer] is [the mother of Tom Cruise]’.