It seems that an alternative to AI unlearning is often overlooked: just remove the parts of the dataset that contain sensitive (or, for that matter, false) information, or move training on them to the very beginning, so they mainly contribute to learning language syntax and are largely overwritten later. I don’t think one inference pass over the dataset to find such content is meaningfully more expensive than training on it.
Typically, the information being unlearned comes from initial pretraining on massive amounts of internet data, so it may be difficult to pinpoint exactly what to remove while training.
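To make the idea concrete, here is a rough sketch of the kind of single inference pass I mean. Everything here is illustrative: `score_sensitivity` is a hypothetical stand-in for whatever detector you would actually run (a small classifier, an LLM judge, keyword rules), and the toy corpus exists only so the snippet runs.

```python
# Sketch: one cheap inference pass over the corpus before training.
# score_sensitivity() is a hypothetical placeholder detector; the
# real choice of detector is orthogonal to the scheduling idea.

from typing import Iterable, List, Tuple

def score_sensitivity(doc: str) -> float:
    """Hypothetical detector: return a sensitivity score in [0, 1]."""
    blocklist = ("ssn", "password", "home address")  # toy placeholder rules
    return float(any(term in doc.lower() for term in blocklist))

def partition_corpus(
    docs: Iterable[str], threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split docs into (clean, sensitive) in a single pass."""
    clean: List[str] = []
    sensitive: List[str] = []
    for doc in docs:
        (sensitive if score_sensitivity(doc) >= threshold else clean).append(doc)
    return clean, sensitive

# Toy corpus, just so the example is self-contained.
corpus = [
    "The capital of France is Paris.",
    "My password is hunter2, please keep it safe.",
]

clean, sensitive = partition_corpus(corpus)

# Option A: drop the sensitive slice entirely.
training_order = clean

# Option B: front-load it so it mostly teaches syntax and is
# largely overwritten by later training, as suggested above.
training_order = sensitive + clean
```

The pass costs one forward call of the detector per document, which is roughly comparable to (or cheaper than) one training epoch over the same data.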