Hi, everyone. I'm entering the AI safety field, and I want to contribute by working on the problem of machine unlearning.
How can we apply unlearning?
1) Make LLMs forget dangerous stuff (e.g. CBRN)
2) Current LLMs can often detect when they're being benchmarked. I want to unlearn that situational awareness so that evaluations measure how models actually behave.
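To make the idea concrete, here is a minimal toy sketch of one common unlearning recipe: gradient ascent on a "forget" set combined with continued descent on a "retain" set. This is a hypothetical one-parameter linear model, not an LLM; the data, learning rates, and loop counts are all illustrative assumptions.

```python
# Toy unlearning sketch (assumed setup, not a real LLM): fit a one-parameter
# model y ~= w * x on all data, then raise the loss on a "forget" example
# via gradient ASCENT while keeping the loss low on "retain" examples.

def loss(w, x, y):
    # squared error for the one-parameter model
    return (w * x - y) ** 2

def grad(w, x, y):
    # d(loss)/dw
    return 2 * (w * x - y) * x

retain = [(1.0, 2.0), (2.0, 4.0)]   # examples the model should keep fitting
forget = [(3.0, 0.0)]               # example the model should "forget"

# 1) Standard training on everything (plain SGD).
w = 0.0
for _ in range(200):
    for x, y in retain + forget:
        w -= 0.05 * grad(w, x, y)

# 2) Unlearning: ascend on the forget set, keep descending on the retain set.
for _ in range(50):
    for x, y in forget:
        w += 0.01 * grad(w, x, y)   # gradient ascent pushes forget loss up
    for x, y in retain:
        w -= 0.05 * grad(w, x, y)   # preserves retain performance

forget_loss = sum(loss(w, x, y) for x, y in forget)
retain_loss = sum(loss(w, x, y) for x, y in retain)
# After unlearning, the forget loss ends up much higher than the retain loss.
print(forget_loss, retain_loss)
```

Real LLM unlearning works on billions of parameters and is far harder (forgotten knowledge can be relearned or extracted), but the ascend-on-forget / descend-on-retain structure above is the same basic shape.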
I'm looking for:
1) A mentor
2) Collaborators
3) Discussions about AI safety