AI Safety Thursday: Modeling and Detecting Deceptive Alignment

Annie Szorkin gives a talk on modeling and detecting deceptive alignment.

Event Schedule
6:00 to 6:30 - Food and introductions
6:30 to 7:30 - Presentation and Q&A
7:30 to 9:00 - Open Discussions

If you can’t attend in person, join our live stream starting at 6:30 pm via this link.

This is part of our weekly AI Safety Thursdays series. Join us in examining questions like:

  • How do we ensure AI systems are aligned with human interests?

  • How do we measure and mitigate potential risks from advanced AI systems?

  • What does safer AI development look like?
