Really cool project! And the write-up is very clear.
In the section about options for reducing the hit to helpfulness, I was surprised you didn’t mention scaling the vector you’re adding or subtracting. Did you try different weights? I would expect that you can tune the strength of the intervention by weighting the difference-in-means vector up or down.
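To make the suggestion concrete, here is a rough sketch of what I mean by weighting the vector. It uses random NumPy arrays as stand-ins for activations, and the names `difference_in_means`, `steer`, and `alpha` are just mine for illustration, not anything from the post or a particular library.

```python
import numpy as np

def difference_in_means(pos_acts, neg_acts):
    """Steering vector: mean activation on one set minus the mean on the other."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(activations, vector, alpha=1.0):
    """Add alpha * vector to every activation row.

    alpha is a hypothetical strength knob: 0 leaves activations unchanged,
    values between 0 and 1 weaken the intervention, values above 1 strengthen
    it, and negative values subtract the vector instead of adding it.
    """
    return activations + alpha * vector

# Synthetic stand-ins for activations (assumption: shape is [examples, hidden_dim]).
rng = np.random.default_rng(0)
pos = rng.normal(loc=1.0, size=(32, 8))
neg = rng.normal(loc=0.0, size=(32, 8))
v = difference_in_means(pos, neg)

acts = rng.normal(size=(4, 8))
unit_v = v / np.linalg.norm(v)
for alpha in (-1.0, 0.0, 0.5, 1.0, 2.0):
    shifted = steer(acts, v, alpha)
    # How far each row moved along the steering direction.
    shift = (shifted - acts) @ unit_v
    print(f"alpha={alpha:+.1f}  mean shift along v = {shift.mean():+.3f}")
```

The shift along the steering direction is linear in `alpha`, so sweeping it and measuring both the target behavior and helpfulness at each value would show whether there is a weight that keeps most of the effect at a lower cost.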
Thanks, that is a great suggestion! I will add it to the potential solutions to the problem.