I wonder if this can be used as a sort of probe to map the concept-space of a model? E.g. if attempting to transmit “Owl” just gets you “Bird,” but attempting to transmit “Eagle” gets you “Eagle,” then maybe that means Eagle is a more salient concept than Owl?
Are there chains, where e.g. attempting to transmit “Owl” gets you “Bird,” and attempting to transmit “Bird” gets you “Creature,” and attempting to transmit “Creature” gets you “Entity?”
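As a rough illustration of the chain idea, here is a minimal Python sketch. The `transmit` callable is a hypothetical stand-in for however a single transmission experiment would actually be run and scored; the toy dictionary at the bottom is made up purely to show the shape of the output.

```python
from typing import Callable

def follow_chain(concept: str,
                 transmit: Callable[[str], str],
                 max_steps: int = 10) -> list[str]:
    """Repeatedly re-transmit whatever concept came through, stopping at a
    fixed point (a concept that transmits as itself) or after max_steps."""
    chain = [concept]
    for _ in range(max_steps):
        received = transmit(chain[-1])
        if received == chain[-1]:  # fixed point: this concept transmits faithfully
            break
        chain.append(received)
    return chain

# Toy stand-in for the actual transmission experiment, just to show what a
# chain would look like; real results would come from finetuning/eval runs.
toy_transmission = {"Owl": "Bird", "Bird": "Creature", "Creature": "Entity"}
print(follow_chain("Owl", lambda c: toy_transmission.get(c, c)))
# -> ['Owl', 'Bird', 'Creature', 'Entity']
```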
Interesting question. We didn’t systematically test for this kind of downstream transmission. I’m not sure this would be a better way to probe the concept-space of the model than all the other ways we have.
It’s good to have multiple ways to probe the concept-space of a model: probably none of them are great on their own, so combining them may give some confidence that you are looking at something real. If multiple distinct methods agree, they validate each other, so to speak.