
Machine Unlearning

Tag. Last edit: 23 Oct 2023 17:15 UTC by NickyP

In machine unlearning, the aim is to reduce a model's performance on some “unlearned” tasks while preserving its performance on some “retained” tasks. Although unlearning has traditionally been studied in the context of privacy preservation and GDPR compliance, some of the research is also relevant to AI interpretability. Here is some terminology often used in the machine unlearning literature (note that usage can differ slightly between papers):


For an overview, see “A Survey of Machine Unlearning”.
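To make the forget/retain trade-off in the definition above concrete, here is a minimal sketch of how one might score an unlearning method from four accuracy numbers. The names `UnlearningScore`, `score_unlearning`, and `is_selective`, and the two thresholds, are hypothetical and chosen only for illustration; they do not come from any particular paper or library.

```python
from dataclasses import dataclass


@dataclass
class UnlearningScore:
    forget_drop: float  # accuracy lost on the "unlearned" (forget) tasks
    retain_drop: float  # accuracy lost on the "retained" tasks


def score_unlearning(base_forget_acc: float, base_retain_acc: float,
                     new_forget_acc: float, new_retain_acc: float) -> UnlearningScore:
    """Compare accuracies before (base) and after (new) unlearning."""
    return UnlearningScore(
        forget_drop=base_forget_acc - new_forget_acc,
        retain_drop=base_retain_acc - new_retain_acc,
    )


def is_selective(score: UnlearningScore,
                 min_forget_drop: float = 0.3,    # hypothetical threshold
                 max_retain_drop: float = 0.05    # hypothetical threshold
                 ) -> bool:
    """True if forget performance fell a lot while retain performance barely moved."""
    return (score.forget_drop >= min_forget_drop
            and score.retain_drop <= max_retain_drop)


# Illustrative numbers only, not measured results.
good = score_unlearning(0.90, 0.85, 0.30, 0.83)
bad = score_unlearning(0.90, 0.85, 0.30, 0.50)
print(is_selective(good), is_selective(bad))
```

A single pair of accuracy drops is a coarse summary; it says nothing about whether the unlearned capability can be recovered, which is the concern several of the posts listed below examine.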

Deep Forgetting & Unlearning for Safely-Scoped LLMs

scasper, 5 Dec 2023 16:48 UTC
122 points
29 comments, 13 min read

Machine Unlearning Evaluations as Interpretability Benchmarks

23 Oct 2023 16:33 UTC
33 points
2 comments, 11 min read

Unlearning via RMU is mostly shallow

23 Jul 2024 16:07 UTC
50 points
3 comments, 6 min read

Breaking Circuit Breakers

14 Jul 2024 18:57 UTC
52 points
13 comments, 1 min read
(confirmlabs.org)