Hmm, there is a related thing called “intervention scoring” (https://arxiv.org/abs/2410.13928), but it appears to be used only for scoring the descriptions produced by the traditional method, not for generating the descriptions from interventions in the first place.