What do you mean? Would you filter specific data out of the post-training step? What specifically would you be trying to prevent the model from learning?
I was thinking of adding synthetic data about our values in the pretraining step.
Is anyone doing this?
Maybe, but not at any large scale. I get why someone might not want to, though: doing this as an intentional effort to make an AI stably aligned would probably be costly (unless synthetic data automation more or less works).