[Question] Any work on honeypots (to detect treacherous turn attempts)?

I know the idea of a “honeypot” has been discussed (e.g. IIRC, in Superintelligence): a setup designed to detect whether an AI system would attempt a treacherous turn if given the opportunity. But is anyone actually working on this? Or has any work been published?