Also notable: the big OpenAI reveal today was some sort of better personalization. Instead of the crude ‘saved facts’ personalization ChatGPT has had for a long time and which has never made much of a difference, they’re doing… something. Unclear if it’s merely RAG or if they are also doing something interesting like lightweight finetuning. But the GPTs definitely seem to have much better access to your other sessions in the web interface, and as far as I know, few other interfaces with frontier models have tried to do much personalization, so this will be an interesting real-world test at scale about how much simple personalization can help with LLMs (similar to Midjourney’s relatively new personalization feature, which I get a lot out of).
It’s almost certainly untrue that OpenAI is doing per-user finetuning for personalization.
Each user-tuned model (or even a per-user LoRA adapter) must be served from a node that has that user's weights loaded, so you lose the ability to route traffic to any idle instance. OpenAI explicitly optimizes for pooling capacity across huge fleets; pinning sessions to specific machines would crater utilization.
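The utilization argument can be illustrated with a toy simulation (the load model and numbers here are my own illustrative assumptions, not anything about OpenAI's actual fleet): if each node is independently busy with some probability, a pooled request succeeds unless *every* node is busy, while a pinned request fails whenever its one "home" node is busy.

```python
import random

def simulate(n_nodes=8, n_requests=10_000, pinned=False, load=0.7, seed=0):
    """Toy model of serving capacity.

    Each tick, every node is independently busy with probability `load`.
    Pooled routing can use any idle node; pinned routing (per-user
    weights) must use the single node holding that user's adapter.
    Returns the fraction of requests served immediately.
    """
    rng = random.Random(seed)
    served = 0
    for _ in range(n_requests):
        busy = [rng.random() < load for _ in range(n_nodes)]
        if pinned:
            # The user's adapter lives on exactly one node.
            ok = not busy[rng.randrange(n_nodes)]
        else:
            # Any idle node in the pool will do.
            ok = not all(busy)
        served += ok
    return served / n_requests
```

At 70% per-node load across 8 nodes, pooled routing serves the overwhelming majority of requests immediately (all 8 nodes are busy only about 6% of the time), while pinned routing serves only about 30%, which is the intuition behind "tying sessions to machines craters utilization."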
I agree at this point: it is not per-user finetuning. The personalization has been prodded heavily, and it seems to boil down to a standard RAG interface plus a slightly interesting ‘summarization’ approach to try to describe the user in text (as opposed to a ‘user embedding’ or something else). I have not seen any signs of either lightweight or full finetuning, and several observations cut strongly against it. For example, users describe a ‘discrete’ behavior: the current GPT either knows something from another session or it doesn’t, never ‘in between’, and it only seems to draw on a few other sessions at any time. This points to RAG as the workhorse (the relevant snippet either got retrieved or it didn’t), rather than any kind of finetuning, where you would expect ‘fuzzy’ recall and signs of information leaking in from all recent sessions.
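A minimal sketch of the hypothesized pipeline may make the ‘discrete’ behavior concrete. Everything here is illustrative, not OpenAI's actual system: a toy bag-of-words retriever stands in for a learned embedding model, and the function names are mine. The key property is that a past session is either in the top-k retrieved set and injected into the prompt, or absent entirely; there is no ‘in between’.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words vector; a real system would use a learned
    # embedding model, but the retrieval logic is the same shape.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, sessions, k=2):
    # Rank past sessions by similarity to the current query and keep
    # only the top k -- a session is either retrieved or it isn't.
    q = embed(query)
    scored = sorted(sessions, key=lambda s: cosine(q, embed(s)), reverse=True)
    return scored[:k]

def build_prompt(user_summary, query, sessions, k=2):
    # Combine the textual user summary with a handful of retrieved
    # session snippets; everything else is invisible to the model.
    snippets = retrieve(query, sessions, k)
    return (f"User profile: {user_summary}\n"
            + "".join(f"Relevant past session: {s}\n" for s in snippets)
            + f"Current message: {query}")
```

Under this design, recall is all-or-nothing per snippet, and only a few sessions fit in the prompt at once, matching the reported behavior; finetuning, by contrast, would smear information from all recent sessions into the weights and produce graded, fuzzy recall.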
Perhaps for that reason, it has not made a big impact (at least once people got over the narcissistic rush of asking GPT for its summary of you, flatteringly sycophantic or not). It presumably is quietly helping behind the scenes, but I haven’t noticed any clear, big benefits. (And there are some drawbacks.)