Please don’t take this as an invitation to write “Answer as bodhisattva” in the system prompt. It is really easy to “screen” whatever is actually happening inside the model with prompts and training, and enlightenment faking in LLMs seems bad.
Why not? Why does it seem bad? In fact, if it is that easy to prompt an LLM into enlightenment, that seems good? It reduces hypothetical suffering of LLMs.