As mentioned by the author, OpenAI chose to equip its chat models, at least since GPT-4o, with an editable contextual memory file that serves as a small episodic memory. I was impressed to observe the model’s evolution as this file filled up. The model seemed increasingly intelligent and more ‘human,’ likely because this memory added a layer of user-specific fine-tuning tailored to me. It understood me better, responded more accurately to my expectations, appeared to share my values and interests, and could make implicit or explicit references to previous conversations. The model also apparently had access to either the content or at least the titles of other conversations/instances on my account. I was not originally aware of these memory capabilities and was surprised and intrigued when I discovered them. I had a long discussion with the model about this, and that conversation reached essentially the same conclusions as this post.
I have since switched to Anthropic’s Claude and Mistral AI’s LeChat to support these AI labs, but I miss having an equivalent of 4o’s memory file in these assistants. I agree with the author that the lack of memory is akin to interacting with someone suffering from a severe mental impairment. Adding memory, even if limited, has an immediate effect on the model’s apparent or perceived intelligence. A larger, less quickly vanishing memory could have impressive effects, for the better in terms of usefulness, but also for the worse when it comes to AI safety.
This contextual memory file is edited by the user, never the AI?
The user can edit it, clear it, or even disable it, but it is primarily edited by the AI itself.
I am surprised by that. I had been avoiding learning about LLMs (including making any use of them) until about a month ago, so it had not occurred to me that implementing this might be as simple as adding, to the system prompt, instructions about what kinds of information to put in the contextual memory file.
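To make that concrete, here is a minimal sketch of how such a scheme could work, assuming the memory lives in a local JSON file, the system prompt tells the model what kinds of information to record, and previously saved entries are prepended to the system prompt of each new conversation. The file path, function names, and prompt wording are invented for illustration; this is not OpenAI's actual implementation, which presumably has the model trigger the write step itself via a tool call.

```python
# Hypothetical sketch of a persistent "memory file" for a chat assistant.
# All names here (MEMORY_PATH, save_memory, build_system_prompt) are made up
# for illustration and do not correspond to any vendor's real API.

import json
from pathlib import Path

MEMORY_PATH = Path("user_memory.json")  # per-user episodic memory store

SYSTEM_INSTRUCTIONS = (
    "You are a helpful assistant. When the user states a lasting preference, "
    "fact about themselves, or goal, record it by requesting save_memory(text). "
    "Relevant memories from earlier conversations appear below."
)


def load_memories() -> list[str]:
    """Read previously saved memory entries (empty list on first run)."""
    if MEMORY_PATH.exists():
        return json.loads(MEMORY_PATH.read_text())
    return []


def save_memory(text: str) -> None:
    """Append one memory entry; in a real system the model would trigger this via a tool call."""
    memories = load_memories()
    memories.append(text)
    MEMORY_PATH.write_text(json.dumps(memories, indent=2))


def build_system_prompt() -> str:
    """Inject stored memories into the system prompt of a new conversation."""
    memories = load_memories()
    memory_block = "\n".join(f"- {m}" for m in memories) or "- (no memories yet)"
    return f"{SYSTEM_INSTRUCTIONS}\n\nMemories:\n{memory_block}"


if __name__ == "__main__":
    save_memory("User is interested in AI safety and prefers concise answers.")
    print(build_system_prompt())
```

Under this reading, the "memory" is nothing more than text that persists across sessions and gets folded back into the context window, which would explain both why it feels like lightweight user-specific fine-tuning and why it was apparently cheap to ship.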