I’m a senior software engineer with 20 years of experience who’s concerned about AI safety. I bring practical engineering judgment, a security mindset, and an understanding of how real systems fail. I’m doing an MSc to formalize my ML knowledge and enter the field.
I’m currently working on my LLM Scheming and Evaluation Awareness dissertation.
Could you please elaborate? My comment is my first impression from reading this article, but I’m happy to update.
Perhaps this is neither scheming nor incoherence but something in between the two. Systematic but not strategic. Specification gaming?