This seems interesting and important. But I’m not really understanding your graph. How do I interpret action confidence? And what are the color variations portraying? Maybe editing in a legend would be useful to others too.
Thanks for reading. You’re right, I’ll actually delete it until I can generate slightly better graphs and I’m more sure of what it’s showing.
FWIW, green means it’s steered towards being honest, red towards being dishonest, and grey has no steering. The triangle marks where thinking stops. But yeah, I needed clearer graphs and to try it on another model.