Dumb idea for dealing with distribution shift in Alignment:
Use your alignment scheme to train the model on a much wider distribution than it will see in deployment. This is one of the techniques used to make the training of quadruped robots generalise properly in this paper.
It seems to me that if you make your training distribution wide enough, it should cover any distribution shift you hit in deployment.
I fully expect to be wrong and look forward to finding out why in the comments.
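To make the idea concrete, here is a toy sketch of it in plain supervised regression rather than in any real alignment scheme. Everything in it is an assumption chosen for illustration: the target function (sin), the polynomial model, the ranges, the noise level, and the helper name `deployment_error` are all hypothetical and not taken from the paper.

```python
import numpy as np

def deployment_error(train_range, deploy_range, degree=5, n=200, noise=0.1, seed=0):
    """Fit a polynomial to noisy sin(x) sampled on train_range,
    then return its mean squared error against sin(x) on deploy_range."""
    rng = np.random.default_rng(seed)
    # Training data: inputs drawn uniformly from the training range.
    x_train = rng.uniform(*train_range, size=n)
    y_train = np.sin(x_train) + rng.normal(0.0, noise, size=n)
    coeffs = np.polyfit(x_train, y_train, degree)
    # Deployment data: an evenly spaced grid over the deployment range.
    x_deploy = np.linspace(*deploy_range, 500)
    return float(np.mean((np.polyval(coeffs, x_deploy) - np.sin(x_deploy)) ** 2))

deploy = (-3.0, 3.0)                                   # assumed deployment range
narrow_err = deployment_error((-1.0, 1.0), deploy)     # trained narrower than deployment
wide_err = deployment_error((-4.0, 4.0), deploy)       # trained wider than deployment
print(f"narrow: {narrow_err:.4f}, wide: {wide_err:.4f}")
```

In this setup the widely trained model should come out ahead, but only because the deployment range sits inside the wide training range. The sketch says nothing about shifts that land outside whatever range you thought to train on.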