I ran some experiments along these lines, but with a fairly different prompt. I got some reproducible success at filtering narrative injection (though the filter was not infallible, and was a bit overzealous). My initial run also seemed to filter bullshit jargon successfully, but I was disappointed that I could not replicate that result.
https://twitter.com/OrionJohnston/status/1599361215188570114
https://twitter.com/OrionJohnston/status/1599553826046218240