Is it indistinguishable? Is there a way we could test this? I’d assume if Claude is capable of introspection then it’s narratives of how it came to certain replies and responses should allow us to make better and more effective prompts (i.e. allows us to better model Claude). What form might this experiment take?
Is it indistinguishable? Is there a way we could test this? I’d assume if Claude is capable of introspection then it’s narratives of how it came to certain replies and responses should allow us to make better and more effective prompts (i.e. allows us to better model Claude). What form might this experiment take?