I work primarily on AI Alignment. Scroll down to my pinned Shortform for an idea of my current work and who I’d like to collaborate with.
Website: https://jacquesthibodeau.com
Twitter: https://twitter.com/JacquesThibs
GitHub: https://github.com/JayThibs
LinkedIn: https://www.linkedin.com/in/jacques-thibodeau/
One initial countermeasure we add to our AI agents at Coordinal is something like, “If a research result is surprising, assume that there is a bug and rigorously debug the code until you find it.” It’s obviously not enough to just add this as a system prompt, but it’s an important lesson that you find out even as a human researcher (that may have fooled you much more when starting out).