I think your idea of labelling the source and epistemic status of all training data is good. I’ve seen the idea presented before. I don’t think it’s a knockdown solution to model reliability, but I expect it would help a lot.
Sorry that I don’t remember exactly where I saw this discussed before. Hopefully if you web search for it you can find it.
I’m not finding anything. Do you recall the authors? Presented at a conference? Year perhaps? Specific keywords? (I tried the obvious)
Hmmm. I don’t remember. But here’s a new example that Zvi just mentioned: https://arxiv.org/abs/2310.15047
Thanks! It looks interesting. Although I think it’s different from what I was talking about.
I’ve been continuing to think about this. I eventually remembered that the place I remembered the idea from was private conversations with other AI researchers! Sorry to have sent you on a wild goose chase!
It would be interesting to try an experiment with this. Perhaps do hierarchical clustering on OpenWebText (an early, not-too-huge dataset from the GPT-2 days). Then have an LLM with a large context window review a random subset of each cluster and write a description of it (including an estimate of factual validity). Then, during training, those descriptions would be given to the model as non-predicted context. If you do use hierarchical clustering, this results in a general description plus some more specific subtype descriptions for every datapoint.
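A minimal sketch of that pipeline, assuming document embeddings have already been computed. The cluster descriptions here are invented placeholders (in the real experiment an LLM would write them after reviewing a sample of each cluster), and masking the context out of the loss with label `-100` follows the common Hugging Face convention for "present but not predicted" tokens:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Toy stand-ins for document embeddings: two well-separated groups.
emb = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(3, 0.1, (5, 8))])

# Agglomerative (hierarchical) clustering of the embeddings.
Z = linkage(emb, method="ward")
# Cut the tree at two depths: coarse clusters and finer subclusters.
coarse = fcluster(Z, t=2, criterion="maxclust")
fine = fcluster(Z, t=4, criterion="maxclust")

# Placeholder descriptions; in practice an LLM would write these,
# including an estimate of factual validity for each cluster.
coarse_desc = {c: f"[cluster {c}: general description + validity estimate]"
               for c in set(coarse)}
fine_desc = {c: f"[subcluster {c}: specific description]" for c in set(fine)}

def make_example(doc_tokens, i):
    """Prepend the cluster descriptions as non-predicted context:
    label -100 masks those positions out of the next-token loss."""
    ctx = f"{coarse_desc[coarse[i]]} {fine_desc[fine[i]]}".split()
    input_ids = ctx + doc_tokens
    labels = [-100] * len(ctx) + doc_tokens
    return input_ids, labels

# Words stand in for token ids to keep the sketch readable.
ids, labels = make_example(["the", "cat", "sat"], 0)
```

So the model sees the general and subtype descriptions in context for every datapoint, but only the document itself contributes to the training loss.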
Yes, it’s simple enough that I imagine people have likely come up with it before. But it fixes a flaw in the other idea (which is also simple, although in the previous discussion I was told it might be novel).