What’s interesting about “Thingspace” (I sometimes call it “orderspace”) is that it flattens all the different combinations of properties into a mutually exclusive space of points. An observable “thing” in the universe can’t be classified at two different points in Thingspace. Yes, you can have a region in Thingspace representing your uncertainty about the classification (if you’re a mere mortal, you always have this error bar), but the piece of universe-order you are trying to classify is, in ideal terms, a single point in the space.
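A minimal sketch of that distinction, under an assumed representation (the names and the circular error region are my invention, not anything from the original Thingspace idea): the ideal thing is one point, while a mortal observer only works with a region around it.

```python
import math

def classify(observation, center, radius):
    """True if the observed point falls inside the uncertainty region
    (a hypothetical 'error bar' modeled as a center plus a radius)."""
    return math.dist(observation, center) <= radius

# The ideal thing is a single point in Thingspace...
thing = (1.0, 2.0)

# ...but our classification of it carries an error bar around it.
print(classify(thing, center=(1.1, 2.1), radius=0.5))  # region contains the point
print(classify(thing, center=(5.0, 5.0), radius=0.5))  # region misses it
```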
IMO this could explain the way we deal with causality. Why do we say effects have only one cause? Where does the Principle of Sufficient Reason come from? The universe is not actually quantized into pieces that have isolated effects on each other. However, causes and effects are “things”: they are points in Thingspace, and as “things” they actually represent aggregates, bundles of variable values that, when recognized as a whole, have by definition unique cause-effect relationships with other “things”. I see causality as arrows from one area of Thingspace to another. Some have tried to account for causality with complex Bayesian networks based on graph theory, which are hard to compute. But applying causality to labeled clusters in Thingspace, instead of trying to apply it to entangled real values, seems both simpler and more accurate, and you can do it at different levels of granularity to account for uncertainty. The space is most useful when classified hierarchically into an ontology: uncertainty about a classification is then represented by bigger, vaguer, all-encompassing clusters or “categories” in Thingspace, while high certainty is represented by a small, specific area.
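A toy sketch of what I mean by arrows between labeled clusters at different granularities (all the cluster names and the ontology here are made up for illustration): causal arrows stated between fine-grained clusters can be lifted to their parent categories, and the coarser the level, the fewer and vaguer the arrows.

```python
# Hypothetical two-level ontology: fine clusters mapped to coarser parents.
parent = {
    "lit_match": "ignition_source",
    "spark_plug": "ignition_source",
    "wood_fire": "fire",
    "gas_fire": "fire",
}

# Causal arrows stated at the fine-grained level of the ontology.
fine_causes = {("lit_match", "wood_fire"), ("spark_plug", "gas_fire")}

def coarsen(edges, parent):
    """Lift fine-grained causal arrows to the parent level: coarser,
    vaguer clusters yield fewer (more certain, less specific) arrows."""
    return {(parent[a], parent[b]) for a, b in edges}

coarse_causes = coarsen(fine_causes, parent)
print(coarse_causes)  # both fine arrows collapse into one coarse arrow
```

The point of the sketch is just the direction of the trade-off: moving up the hierarchy trades specificity for confidence, which is one way to read “bigger cluster = more uncertainty about where exactly the thing sits”.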
I once tried (and pretty much failed) to create a novel machine learning algorithm based on a causality model between hierarchical EM clusters. I’m not sure why it failed. It was simple and beautiful, but I had to use greedy approaches to reduce complexity, which might have broken my EM algorithm. Well, at least it (just barely) got me a master’s degree. I still believe in the approach and hope someone will figure it out some day. Lately I’ve been reading about and questioning the assumptions underlying all of this, especially the link between the physical universe and probability theory, and I got stuck on the problem of the arrow of time, which seems to be the unifying principle but also seems not that well understood. Ah well… maybe in another life.
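For readers unfamiliar with the EM clustering being referenced, here is a minimal, self-contained sketch of EM fitting a two-component 1-D Gaussian mixture (this is generic textbook EM, not a reconstruction of the algorithm described above; initialization and data are my own toy choices):

```python
import math
import random

def em_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM.
    Returns (means, variances, weights)."""
    mu = [min(data), max(data)]  # crude but effective initialization
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(1e-6, sum(r[k] * (x - mu[k]) ** 2
                                   for r, x in zip(resp, data)) / nk)
            w[k] = nk / len(data)
    return mu, var, w

random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(8.0, 1.0) for _ in range(200)])
means, variances, weights = em_gmm_1d(data)
print(sorted(means))  # recovered means, near 0 and 8
```

A hierarchical version would recursively re-run this kind of fit inside each discovered cluster, which is where the greedy complexity-reduction tricks mentioned above would come in.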
Why would more uncertainty = bigger cluster? Wouldn’t uncertainty be expressed by using smaller clusters? I.e., if you’re uncertain about a cluster, wouldn’t you fall back on a smaller subset of things that you are more certain belong to that classification?