You don’t fine-tune on the eval as part of the production model release, you just do a small amount of fine-tuning only for purposes of the eval (i.e. a branch off the production model lineage) to overcome any sandbagging that might be occurring.
You don’t fine-tune on the eval as part of the production model release, you just do a small amount of fine-tuning only for purposes of the eval (i.e. a branch off the production model lineage) to overcome any sandbagging that might be occurring.