One thing that seems really important for agency is perception. And one thing that seems really important for perception is representation learning. Representation learning involves taking a complex universe (or perhaps rather, complex sense-data) and choosing features of that universe that are useful for modelling things.
When the features are linearly related to the observations/state of the universe, I feel like I have a really good grasp of how to think about this. But most of the time, the features will be nonlinearly related; e.g. in order to do image classification, you use deep neural networks, not principal component analysis.
I feel like it’s an interesting question: where does the nonlinearity come from? Many causal relationships seem essentially linear (especially if you do appropriate changes of variables to help, e.g. taking logarithms; for many purposes, monotonicity can substitute for linearity), and lots of variance in sense-data can be captured through linear means, so it’s not obvious why nonlinearity should be so important.
Here are some ideas I have so far:
Suppose you have a Gaussian mixture distribution with two Gaussians d1=N(μ1,Σ), d2=N(μ2,Σ) with different means and identical covariances. In this case, the function that separates them optimally is linear. However, if the covariances differ between the Gaussians, d1=N(μ1,Σ1), d2=N(μ2,Σ2), then the optimal separating function is nonlinear (quadratic, in general). So this suggests to me that one reason for nonlinearity is fundamental to perception: nonlinearity is necessary if multiple different processes could be generating the data, and you need to discriminate between the processes themselves. This seems important for something like vision, where you don’t observe the system itself, but instead observe light that bounced off the system.
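To make the Gaussian case concrete, here is a minimal sketch in one dimension (my simplification; the scalar variances stand in for the covariance matrices, and the function names and numbers are made up for illustration). It writes the log-likelihood ratio between the two Gaussians as a polynomial in x and checks when the quadratic term disappears.

```python
import math

def log_density(x, mu, var):
    """Log-density of a 1-D Gaussian N(mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def log_ratio_coeffs(mu1, var1, mu2, var2):
    """Coefficients (a, b, c) such that
    log p1(x) - log p2(x) = a*x^2 + b*x + c."""
    a = 0.5 * (1 / var2 - 1 / var1)
    b = mu1 / var1 - mu2 / var2
    c = mu2 ** 2 / (2 * var2) - mu1 ** 2 / (2 * var1) - 0.5 * math.log(var1 / var2)
    return a, b, c

# Equal variances: the x^2 term cancels, so the optimal rule is a single
# threshold on x (a linear separator).
a_eq, b_eq, c_eq = log_ratio_coeffs(-1.0, 1.0, 1.0, 1.0)

# Unequal variances: the x^2 term survives, so the optimal rule is quadratic
# in x (in 1-D, "inside an interval" versus "outside it").
a_ne, b_ne, c_ne = log_ratio_coeffs(-1.0, 1.0, 1.0, 4.0)

print("equal variances:  ", a_eq, b_eq, c_eq)
print("unequal variances:", a_ne, b_ne, c_ne)
```

In higher dimensions the same cancellation happens with the matrices: the quadratic term involves the difference of the two inverse covariances, which is zero exactly when the covariances match.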
Consider the notion of the habitable zone of a solar system; it’s the range in which liquid water can exist. Get too close to the star and the water will boil, get too far and it will freeze. Here, it seems like we have two monotonic effects which add up, but because the effects aren’t linear, the result can be nonmonotonic.
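A toy version of the habitable-zone point, as a sketch rather than astrophysics: the distances, the thresholds `inner` and `outer`, and the sigmoid shape are all invented for illustration. Each term is monotonic in distance from the star, yet their sum peaks in the middle. If both terms were linear, their sum would be linear too, and so could not have an interior peak.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def not_boiled(distance, inner=1.0, sharpness=10.0):
    """Monotonically increasing in distance: further out, less boiling."""
    return math.log(sigmoid(sharpness * (distance - inner)))

def not_frozen(distance, outer=2.0, sharpness=10.0):
    """Monotonically decreasing in distance: further out, more freezing."""
    return math.log(sigmoid(sharpness * (outer - distance)))

def liquid_score(distance):
    """Sum of the two monotonic effects (a log-score for liquid water)."""
    return not_boiled(distance) + not_frozen(distance)

# Too close, in the zone, too far (arbitrary units).
scores = [liquid_score(d) for d in (0.5, 1.5, 2.5)]
print(scores)
```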
Many aspects of the universe are fundamentally nonlinear. But they tend to exist on tiny scales, and those tiny scales tend to mostly get lost to chaotic noise, which tends to turn things linear. However, there are things that don’t get lost to noise, e.g. due to conservation laws; these provide fundamental sources of nonlinearity in the universe.
… and actually, most of the universe is pretty linear? The vast majority of the universe is ~empty space; there isn’t much complex nonlinearity that is happening there, just waves and particles zipping around. If we disregard the empty space, then I believe (might be wrong) that the vast majority is stars. Obviously lots of stuff is going on within stars, but all of the details get lost to the high energies, so it is mostly simple monotonic relations that are left. It seems that perhaps nonlinearity tends to live on tiny boundaries between linear domains. The main thing that makes these tiny boundaries so relevant, such that we can’t just forget about them and model everything in piecewise linear/piecewise monotonic ways, is that we live in the boundary.
Another major thing: It’s hard to persist information in linear contexts, because it gets lost to noise. Whereas nonlinear systems can have multiple stable configurations and therefore persist it for longer.
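Here is a rough sketch of the persistence point, under assumptions I am making up for illustration: the linear system is a stable (contracting) map `x -> 0.99 * x`, the nonlinear one is an Euler step of dx/dt = x - x^3 (which has stable states at -1 and +1), and both get the same Gaussian noise. The "information" is just whether the state started at +1 or -1.

```python
import random

def simulate(step, x0, n_steps=2000, noise=0.05, seed=0):
    """Iterate x <- step(x) + Gaussian noise; the same seed gives the same noise."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        x = step(x) + rng.gauss(0.0, noise)
    return x

def linear_step(x, decay=0.99):
    """Stable linear dynamics: the effect of the start decays geometrically."""
    return decay * x

def bistable_step(x, dt=0.1):
    """Euler step of dx/dt = x - x**3, with stable fixed points at -1 and +1."""
    return x + dt * (x - x ** 3)

# Start each system from +1 and from -1, with identical noise in both runs.
lin_plus, lin_minus = simulate(linear_step, 1.0), simulate(linear_step, -1.0)
bi_plus, bi_minus = simulate(bistable_step, 1.0), simulate(bistable_step, -1.0)

print("linear:  ", lin_plus, lin_minus)
print("bistable:", bi_plus, bi_minus)
```

In the linear runs the two final states should be practically indistinguishable, so the starting sign can no longer be read off. In the bistable runs the state should still sit near whichever well it started in, at least at this noise level; with much larger noise it would eventually hop between wells as well.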
There is of course a lot of nonlinearity in organisms and other optimized systems, but I believe it results from the world containing the various factors listed above? Idk, it’s possible I’ve missed some.
It seems like it would be nice to develop a theory on sources of nonlinearity. This would make it clearer why sometimes selecting features linearly seems to work (e.g. consider IQ tests), and sometimes it doesn’t.