I don’t think I am attacking a straw man: You don’t believe GPT-2 can abstract reading into concepts, and I was trying to convince you that it can. I agree that current versions can’t communicate ideas too complex to be expressed in a single paragraph. I think it can form original concepts, in the sense that 3-year-old children can form original concepts. They’re not very insightful or complex concepts, and they are formed by remixing, but they are concepts.
Ok I think we are talking past each other, hence the accusation of a straw man. When you say “concepts” you are referring to the predictive models, both learned knowledge and dynamic state, which DOES exist inside an instance of GPT-2. This dynamic state is initialized with the input, at which point it encodes, to some degree, the content of the input. You are calling this “understanding.”
However, when I say “concept modeling” I mean the ability to reason about this at a meta-level: to not just *have* a belief that is useful in predicting the next token in a sequence, but to understand *why* you have that belief, and to use that knowledge to inform your actions. These are ‘lifted’ beliefs, in the terminology of type theory, or quotations in functional programming. So to equate belief (predictive capability) with belief-about-belief (understanding of predictive capability) is a type error from my perspective, and does not compute.
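To make the type-error claim concrete, here is a minimal sketch of the distinction (all names here are my own illustration, not anything inside GPT-2): a first-order belief is just a predictive state, while a lifted belief carries that state *plus* an account of why it is held, and only the lifted form supports reasoning about the belief itself.

```python
from dataclasses import dataclass

@dataclass
class Belief:
    """A first-order predictive state: 'the next token is probably X'."""
    prediction: str
    confidence: float

@dataclass
class MetaBelief:
    """A 'lifted' belief: a Belief held as an object of reasoning,
    together with a justification for why it is held."""
    belief: Belief
    justification: str

def act_on(meta: MetaBelief) -> str:
    # Acting with understanding requires the lifted form:
    # we can inspect *why* the underlying belief is held.
    return f"Predict {meta.belief.prediction!r} because {meta.justification}"

b = Belief(prediction="cat", confidence=0.9)
# act_on(b) is a type error: a Belief is not a MetaBelief.
m = MetaBelief(belief=b, justification="'the' often precedes a noun")
print(act_on(m))
```

On this picture, GPT-2 has plenty of `Belief`-shaped state but nothing `MetaBelief`-shaped, which is why conflating the two levels is a category mistake rather than a disagreement about degree.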
GPT-2 has predictive capabilities. It does not instantiate a conceptual understanding of its predictive capabilities. It has no self-awareness, which I see as a prerequisite for “understanding.”
Yeah, you’re right. It seems like we both have a similar picture of what GPT-2 can and can’t do, and are just using the word “understand” differently.