Complete newbie question: is it possible to construct a version of these models that uses a 3-dimensional vector instead of the 768-dimensional vector?
From the sound of it, the 768-dimensional vector is basically a constant linear transform of the three PCA components. Can we just declare the linear transform to be a constant array, and only train the three components that appear to be the most needed? E.g., generate the 768 dimensions from the three PCA components?
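To make the question concrete, here is a rough sketch of what I mean, assuming the three PCA directions really do capture everything that matters (which is exactly the part I'm unsure about). The names `basis`, `mean`, `expand`, and `z` are made up for illustration, and the "PCA basis" here is just a random orthonormal stand-in, not anything taken from the actual model. The 768x3 matrix is frozen, and gradient descent only ever touches the three numbers in `z`.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 768, 3  # full embedding size, number of trainable components

# Hypothetical stand-in for the PCA basis: random orthonormal columns.
# In the real setting these would be the top-3 principal directions.
basis, _ = np.linalg.qr(rng.normal(size=(D, K)))  # shape (768, 3), frozen
mean = rng.normal(size=D)                         # frozen offset (PCA mean)

def expand(z):
    """Map a 3-vector to a 768-vector via the constant linear transform."""
    return mean + basis @ z

# Toy target: a 768-vector that is (almost) generated from 3 numbers.
z_true = np.array([1.5, -0.7, 0.3])
target = expand(z_true) + 0.01 * rng.normal(size=D)

# Train only the 3 components with plain gradient descent on squared error.
z = np.zeros(K)
lr = 0.1
for _ in range(200):
    residual = expand(z) - target
    grad = 2 * basis.T @ residual  # gradient of ||expand(z) - target||^2 w.r.t. z
    z -= lr * grad

print(z)  # should land close to z_true
```

In this toy case the three trained numbers end up at the projection of the target onto the frozen basis, so nothing outside those three directions can ever be learned. Whether the real model can tolerate that restriction is the actual question.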
I do the majority of these, and converged on them independently over the course of decades. I ended up doing most of these because they made my life better.