This paper supports the idea that a model can self-report an implicit task it has been fine-tuned to perform. https://arxiv.org/html/2501.11120v1
This leads me to suspect that rife’s experiments with reporting the implicit acrostic will be successful.
I agree with the decision to test less common words. You may need even more examples, or perhaps just to train for more epochs on the existing 200 examples.
Forgot to follow up here, but turning up the learning rate multiplier to 10 seemed to do the trick without introducing any over-fitting weirdness or instability.
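For anyone wanting to reproduce this, here is a minimal sketch of what that setting looks like, assuming the fine-tuning is done through the OpenAI fine-tuning API. The training file id, base model, and epoch count are placeholders; the only value taken from this thread is the learning rate multiplier of 10.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical sketch: file id, model, and n_epochs are placeholders.
# The only value taken from the thread is learning_rate_multiplier=10.
job = client.fine_tuning.jobs.create(
    training_file="file-XXXX",           # uploaded JSONL with the ~200 acrostic examples
    model="gpt-4o-mini-2024-07-18",      # placeholder base model
    hyperparameters={
        "n_epochs": 3,                    # raise this if the implicit task doesn't stick
        "learning_rate_multiplier": 10,   # the change reported to fix it above
    },
)
print(job.id, job.status)
```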