As an experiment I like it. The difficult, nitty-gritty part I see is getting consistency across all the articles in the first iteration. Even setting aside the risk of it tailoring its articles to any specific user, pandering to their particular profile, the output will still be beholden to whatever meta-prompt is being used.
And I don’t know enough about the quality of the training data to say whether that kind of consistency, as in consistent editorial guidelines, is even achievable.
As someone with no knowledge of how LLMs work beyond some vague stuff about “tokens” and “multilayer perceptrons”, I also wonder: will any given article simply be biased towards the “average”, most popular, or most repeated facts and memes about the article’s topic as found in the training data, or does every prompt in effect narrow the model to a certain slice of that data?
Let me put it another way: it’s not very hard to find the “Hanging Munchkin” myth online. There are certainly a lot of pixels spilled on that topic. Now imagine that it made up a disproportionate amount of the training data about The Wizard of Oz. Would that be reflected in the article about the film? Or would the prompt be engineered to ensure such spurious legends aren’t included? I think ensuring that same editorial consistency across all topics gets really hard and requires a lot of bespoke prompt engineering.