I don’t think “definitions” are the crux of my discomfort. Suppose the model learns a cluster. The position, scale, and shape parameters of that cluster summary are not perfectly stable: they vary somewhat with the training data. On its own that isn’t a problem, since the cluster is still basically the same; however, the (fuzzy) boundary of the cluster is large (I have a vague intuition that the curse of dimensionality is relevant here, but nothing solid). So there are many cutting planes, induced by actions taken downstream of the model, on which training on slightly different data could have yielded a different decision. My intuition is that most of the risk of misalignment arises at those boundaries (a toy sketch follows the list below):
- One reason for my intuition is that difficulties in communication between humans arise in a similar way, i.e. when two people’s clusters have slightly different shapes.
- Another reason is that the boundary cases feel like the kind of thing you can’t reliably learn from data or effectively test.
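To make the flip-at-the-boundary intuition concrete, here’s a toy sketch (entirely my own illustration; `fit`, `decide`, and all the numbers are made up): fit the same Gaussian cluster to two bootstrap resamples of one dataset, then check how often a downstream threshold decision disagrees between the two fits.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 20, 500

data = rng.normal(size=(n, dim))        # samples from the "true" cluster
test = rng.normal(size=(20_000, dim))   # points a downstream action must classify

def fit(sample):
    # cluster summary: mean and per-dimension spread
    return sample.mean(axis=0), sample.std(axis=0)

def decide(points, mean, std):
    # a downstream "cutting plane": act iff the squared z-distance to the
    # cluster center falls under a threshold sitting in the bulk of the mass
    z2 = (((points - mean) / std) ** 2).sum(axis=1)
    return z2 < dim

# two equally plausible training sets -> two slightly different cluster summaries
fit_a = fit(data[rng.integers(0, n, size=n)])
fit_b = fit(data[rng.integers(0, n, size=n)])

flips = decide(test, *fit_a) != decide(test, *fit_b)
print(f"{flips.mean():.1%} of downstream decisions flip between the two fits")

# where do the flips live? compare distance-to-threshold for all points vs. flipped ones
z2_true = (test ** 2).sum(axis=1)
print(f"median |z2 - threshold|, all points:     {np.median(np.abs(z2_true - dim)):.2f}")
print(f"median |z2 - threshold|, flipped points: {np.median(np.abs(z2_true[flips] - dim)):.2f}")
```

If the sketch is right, the flipped points should hug the threshold much more tightly than typical points do, which is the sense in which I expect the instability (and hence the misalignment risk) to live at the boundary rather than at the core of the cluster.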
Your comment seems to suggest that you think the edge cases won’t matter, but I don’t really understand why the fuzzy nature of concepts would make that true.