cubefox comments on Why does generalization work?

cubefox 24 Feb 2024 23:47 UTC
6 points
0
Eliezer has a defense of there being “objectively correct” macrostates / categories. See Mutual Information, and Density in Thingspace. He concludes:

And the way to carve reality at its joints, is to draw your boundaries around concentrations of unusually high probability density in Thingspace.

The open problems with this approach seem to be that it requires some objective notion of probability and an objectively preferred way of defining “thingspace”. (Regarding the latter, I guess the dimensions of thingspace should fulfill some statistical properties, like being as probabilistically independent as possible, or something like that.) Otherwise everyone could have their own subjective probability and their own subjective thingspace, and “carving reality at its joints” wouldn’t be possible.

But it seems to me that techniques from unsupervised / self-supervised learning do suggest that there are indeed some statistical features that allow for some objectively superior clustering of data.
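
To make the dependence on the choice of thingspace concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the points are made-up numbers, `link_clusters` is a hypothetical helper, and the rule it uses (link any two points closer than a radius, then take connected components) is only a crude stand-in for real density-based methods such as DBSCAN, not anything Eliezer proposes.

```python
import math

def link_clusters(points, eps):
    """Group points into connected components: two points are linked
    if their Euclidean distance is at most eps (hypothetical helper)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # Collect neighbours first so the set is not modified while iterating.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
                component.append(j)
                frontier.append(j)
        clusters.append(sorted(component))
    return sorted(clusters)

# Made-up "things" described by two hypothetical dimensions.
things = [(0.0, 0.0), (0.5, 0.1), (0.2, 0.6),
          (5.0, 5.0), (5.4, 5.2), (5.1, 4.7)]

clusters = link_clusters(things, eps=1.0)

# The same things, with the first dimension expressed in different units.
rescaled = [(x * 10, y) for x, y in things]
clusters_rescaled = link_clusters(rescaled, eps=1.0)

print(clusters)           # two dense groups
print(clusters_rescaled)  # every point ends up alone
```

With the original coordinates the rule finds two dense groups; after rescaling one axis, the same rule with the same radius puts every point in its own cluster. So the "joints" this kind of method finds depend on how the dimensions are defined, which is why some preferred way of defining thingspace seems to be needed.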
- Martín Soto 6 Mar 2024 20:12 UTC
  1 point
  0
  My post is consistent with what Eliezer says there. My post would simply remark:
  You are already taking for granted a certain low-level / atomic set of variables = macro-states (like mortal, featherless, biped). Let me bring to your attention that you pay attention to these variables because they are written in a macro-state partition similar / useful to your own. It is conceivable for some external observer to look at low-level physics, and interpret it through different atomic macro-states (different from mortal, featherless, biped).
  The same applies to unsupervised learning. It’s not surprising that macro-states expressed in a certain language are found by tools built in that same language (the computation methods we’ve built to find simple regularities in certain sets of macroscopic variables). As before, there are simply already some macro-state partitions we pay attention to, in which these macroscopic variables are expressed (but not others, like “the exact position of a particle”), and in which we also build our tools (similarly to how our sensory receptors are also built in them).
  - cubefox 6 Mar 2024 21:16 UTC
    1 point
    0
    As I said, he assumes there is some objectively correct way to define the “thingspace” and a probability distribution on it. Should this rather strong assumption hold, his argument that categories (like “mortal”) should, and presumably usually do, correspond to clusters of high probability density seems plausible.
    
    (By the way, macrostates, or at least categories, don’t generally form a partition, because something can be both mortal and a biped.)
    
    So I don’t think he takes certain categories for granted, but rather the existence of an objective thingspace and probability distribution, which in turn would enable objective categories. But he doesn’t argue for it (except very tangentially in a comment), so you may well doubt such an objective background exists.
    
    I think one small reason to believe his theory is right is that most intuitively natural categories also seem to be objectively better than others, in the sense that they form, or have in the past formed, projectible predicates:
    
    A property of predicates, measuring the degree to which past instances can be taken to be guides to future ones. The fact that all the cows I have observed have been four-legged may be a reasonable basis from which to predict that future cows will be four-legged. This means that four-leggedness is a projectible predicate. The fact that they have all been living in the late 20th or early 21st century is not a reasonable basis for predicting that future cows will be. See also entrenchment, Goodman’s paradox.
    
    Projectibility seems to me itself a rather objective statistical category.