Yes, that’s exactly what I mean! If we had word2vec-like properties, steering and interpretability would be much easier and more reliable. And I do think it’s a promising research direction, though not a sure thing.
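To make "word2vec-like properties" concrete, here is a minimal sketch of the classic analogy arithmetic. The vectors are hand-picked toy values, not taken from any trained model, and the `analogy` helper is just a name I made up for illustration.

```python
import math

# Toy 3-d embeddings, hand-picked for illustration only (NOT from a trained model).
# The axes are roughly: royalty, maleness, femaleness.
EMB = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, emb=EMB):
    """Return the word closest to emb[b] - emb[a] + emb[c], excluding the inputs."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

# "man is to king as woman is to ...?"
print(analogy("man", "king", "woman"))
```

If an LLM's internal space behaved like this, a steering vector would simply be a difference such as `emb["king"] - emb["man"]` added to an activation.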
Facebook also built an interesting alternative to tokenization that lets LLMs operate in a much richer embedding space: https://github.com/facebookresearch/blt. It embeds patches of bytes, split by entropy/surprise. So it might be another way to test the hypothesis that a better embedding space would give nice Word2Vec-like properties.
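Roughly, the idea of splitting by entropy can be sketched like this. This is not the code from the blt repo: a bigram count model stands in for a learned byte-level language model, and `next_byte_entropies`, `entropy_patches`, and the threshold value are all my own assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def next_byte_entropies(data: bytes):
    """Entropy (in bits) of a bigram model's next-byte distribution at each position.

    The bigram counts come from `data` itself; this is a stand-in for a
    learned byte-level language model.
    """
    follow = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        follow[prev][nxt] += 1

    def entropy(counter):
        total = sum(counter.values())
        return -sum((c / total) * math.log2(c / total) for c in counter.values())

    # Position i is predicted from byte i-1; position 0 has no context.
    return [0.0] + [entropy(follow[prev]) for prev in data[:-1]]

def entropy_patches(data: bytes, threshold: float):
    """Start a new patch wherever the next-byte entropy exceeds `threshold`."""
    if not data:
        return []
    ents = next_byte_entropies(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if ents[i] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# Hypothetical threshold of 1.0 bit, chosen only for the demo.
print(entropy_patches(b"the cat sat on the mat", threshold=1.0))
```

Predictable stretches end up in long patches and surprising positions open new ones, so each embedded unit carries a more even amount of information than a fixed vocabulary token would.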