Seems difficult with prompt engineering given today’s context windows. Maybe possible with RLHF, if you can reliably find and hire people who can tell good content from bad.
If the fate of my company depended on me building this feature by tomorrow, I would take a bunch of existing spam and low-quality comments, create embeddings for them via the OpenAI API, average the vectors, and then auto-flag new comments if their embedding is too close to that spam vector.
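A minimal sketch of that approach. The `get_embedding` helper here is a stand-in for a real embedding call (e.g. the OpenAI embeddings API); it's replaced with a toy hashed bag-of-words vector so the logic runs offline, and the `threshold` is a made-up value you'd tune on held-out comments:

```python
import zlib
import numpy as np

DIM = 64  # toy dimensionality; real embedding models return ~1536 dims

def get_embedding(text: str) -> np.ndarray:
    # Stand-in for a real embedding API call, e.g.
    #   client.embeddings.create(model="text-embedding-3-small", input=text)
    # Here: a deterministic hashed bag-of-words vector, normalized to unit length.
    vec = np.zeros(DIM)
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def build_spam_centroid(spam_comments: list[str]) -> np.ndarray:
    # Average the embeddings of known spam / low-quality comments.
    return np.mean([get_embedding(c) for c in spam_comments], axis=0)

def is_spam(comment: str, centroid: np.ndarray, threshold: float = 0.8) -> bool:
    # Flag the comment if its embedding is too close (by cosine similarity)
    # to the spam centroid. The 0.8 cutoff is arbitrary; tune it on real data.
    v = get_embedding(comment)
    sim = float(np.dot(v, centroid) / (np.linalg.norm(v) * np.linalg.norm(centroid)))
    return sim >= threshold
```

The centroid trick is crude (spam isn't one cluster), but it ships in a day; a next step would be nearest-neighbor checks against individual spam examples instead of a single average.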