Hi Cameron, is the SAE testing you’re describing here the one you demoed in your interview with John Sherman using Goodfire’s Llama 3.3 70B SAE tool? If so, could you share the prompt you used for that? With the prompts I’m using, I’m having a hard time getting Llama to say that it is conscious at all. It would be nice if we had SAE feature tweaking available for a model that was more ambivalent about its consciousness; it seems that would make robust testing a bit easier.