I am a few months into trying this. It tentatively seems to be going well, but I will be more confident once I have succeeded or failed at publishing the paper I’m currently working on.
I tried that route as well and delved semi-deeply into an alignment-objectives-adjacent subject for ~2 months, but I wasn’t happy with the EV or the length of the feedback loop. My timelines are too short.
What topic is your paper about?
Emergent misalignment, specifically its internal geometric representation.