My guess is that there are ways you could use 1% of pre-training compute to train a model with near-perfect robust forget accuracy by being more targeted in where you add noise.
Fully agreed! That was exactly the main takeaway of the unlearning research I've been doing: trying to make the unlearning updates more targeted/selective was more fruitful than any other approach.
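For concreteness, here is a minimal sketch of one way "targeted/selective" updates could be operationalized. This is an illustrative assumption on my part, not the specific method from that research: restrict a gradient-ascent unlearning step to the small fraction of parameters whose forget-set gradients are largest, leaving everything else untouched. The function name and hyperparameters (`lr`, `top_frac`) are placeholders.

```python
import torch


def selective_unlearning_step(model, forget_loss, lr=1e-4, top_frac=0.01):
    """Hypothetical sketch: apply a gradient-ascent unlearning update only to
    the top `top_frac` most forget-salient entries of each parameter tensor."""
    model.zero_grad()
    forget_loss.backward()  # forget_loss: scalar loss computed on the forget set

    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            # Saliency = |gradient of the forget loss|; keep only the top fraction.
            k = max(1, int(top_frac * p.grad.numel()))
            threshold = p.grad.abs().flatten().topk(k).values[-1]
            mask = (p.grad.abs() >= threshold).float()
            # Ascend the forget loss only where the mask is on.
            p.add_(lr * mask * p.grad)
```

The same masking idea applies if the update is noise injection rather than gradient ascent; the point is just that the edit touches a small, saliency-selected subset of weights instead of the whole model.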