I think it might be wise to use a canary string, or some other mechanism, to hide this kind of knowledge from future (pre)training runs; see e.g. https://turntrout.com/dataset-protection