LLMs (probably) have a drive to simulate a coherent entity
Maybe we can just prepend a bunch of examples of aligned behaviour before a prompt, presented as if the model had done this itself, and see if that improves its behaviour.
LLMs (probably) have a drive to simulate a coherent entity
Maybe we can just prepend a bunch of examples of aligned behaviour before a prompt, presented as if the model had done this itself, and see if that improves its behaviour.