One warning is that LLMs generally suck at prompt writing.
I notice that I am surprised by how mildly you phrase this. Many of my “how can something this smart be this incredibly stupid?” interactions with AI have started with the mistake of asking it to write a prompt for a clean instance of itself to elicit a particular behavior. “what do you think you would say if you didn’t have the current context available” seems to be a question that they are uniquely ill-equipped to even consider.
“IMPORTANT: Skip sycophantic flattery; avoid hollow praise and empty validation. Probe my assumptions, surface bias, present counter‑evidence, challenge emotional framing, and disagree openly when warranted; agreement must be earned through reason.”
I notice that you’re structuring this as some “don’t” and then a lot of “do”. Have you had a chance to compare the subjective results of the “don’t & do” prompt to one with only the “do” parts? I’m curious what if any value the negatively framed parts are adding.
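One way to set up that comparison is to assemble the two prompt variants programmatically, so the same questions can be run against both. A minimal sketch — the split of the quoted prompt into its "don't" and "do" clauses is my own reading of it, not anything the original author specified:

```python
# Split the quoted system prompt into its negatively framed ("don't")
# and positively framed ("do") clauses, then build both variants for
# an A/B comparison against the same set of test questions.

DONT_CLAUSES = [
    "Skip sycophantic flattery",
    "avoid hollow praise and empty validation",
]

DO_CLAUSES = [
    "Probe my assumptions",
    "surface bias",
    "present counter-evidence",
    "challenge emotional framing",
    "disagree openly when warranted",
    "agreement must be earned through reason",
]

def build_prompt(include_dont: bool) -> str:
    """Assemble the system prompt, optionally dropping the 'don't' clauses."""
    clauses = (DONT_CLAUSES if include_dont else []) + DO_CLAUSES
    return "IMPORTANT: " + "; ".join(clauses) + "."

full_prompt = build_prompt(include_dont=True)
do_only_prompt = build_prompt(include_dont=False)
```

Feeding both variants as the system prompt to fresh instances and judging the answers blind (without knowing which variant produced which) would give a first read on whether the negative framing adds anything.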