Something I notice here (about myself) is that I don’t currently understand enough about what’s going on under the hood to make predictions about what sorts of subsystems GPT could develop internally, and what it couldn’t. (I.e., if my strength as a rationalist is the ability to be more confused by fiction than reality, well, alas.)
It seems like GPT has to develop internal models in order to make predictions. It seems plausible to me that working memory is a different beast, one that you can’t develop just by having more training data thrown at you, but I don’t really know what facts about GPT’s architecture should constrain my beliefs about that.
(It does seem fairly understandable to me that, even if it were hypothetically possible for GPT to invent working memory, this would be an inefficient way of inventing it.)