Another analogy: consider this clustering problem.
Different clustering algorithms will indeed find slightly different parameterizations of the clusters, slightly different cluster membership probabilities, etc. But those differences will be slight. We still expect different algorithms to cluster things in one of a few discrete ways: identifying the six main clusters, or only two (top and bottom, projected onto the y-axis), or three (left, middle, right, projected onto the x-axis), or maybe just one big cluster if it’s a pretty shitty algorithm. We would not expect to see an entire continuum of clusterings ranging from “all six separate” to “one big cluster”; we would expect a discrete difference between those two clusterings.
The key there is “slightly different”.
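
To make the discreteness concrete, here is a minimal sketch under loudly stated assumptions. The data is a made-up toy layout (six tight blobs on a 3x2 grid), not the data from the clustering problem above. Sweeping the distance threshold of a simple link-nearby-points clusterer (single-linkage style) stands in for "trying different algorithms". The names `make_points` and `count_clusters` are hypothetical helpers written for this illustration.

```python
import math
from itertools import product


def make_points():
    """Six tight blobs on a 3x2 grid (hypothetical toy data)."""
    offsets = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
    # Assumed layout: columns are 4 apart in x, rows are 6 apart in y.
    centers = product((0.0, 4.0, 8.0), (0.0, 6.0))
    return [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]


def count_clusters(points, threshold):
    """Link every pair of points within `threshold`; count connected groups."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


points = make_points()
# Sweep the linking threshold smoothly from 0.75 to 5.75.
thresholds = [0.75 + 0.5 * i for i in range(11)]
counts = [count_clusters(points, t) for t in thresholds]
for t, c in zip(thresholds, counts):
    print(f"threshold={t:.2f} -> {c} clusters")
```

On this toy layout the threshold varies smoothly, but the number of clusters found only ever takes the values 6, 2 (top and bottom rows), and 1; it never passes through 5, 4, or 3. Because the rows here were placed farther apart than the columns, this particular sweep recovers the top/bottom split rather than the left/middle/right one.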