How you do lossy compression depends on what you want.
I think this is technically true, but less important than it seems at first glance. Natural abstractions are a thing, which means there’s instrumental convergence in abstractions—some compressed information is relevant to a far wider variety of objectives than other compressed information. Representing DNA sequences as strings of four different symbols is a natural abstraction, and it’s useful for a very wide variety of goals; MD5 hashes of those strings are useful only for a relatively narrow set of goals.
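To make the contrast concrete, here's a small sketch (sequence and helper names are my own, purely illustrative): the 2-bits-per-base encoding of a DNA string is invertible, so it still supports essentially any downstream query, while an MD5 digest of the same string is a compressed summary that only supports one goal, equality checking.

```python
import hashlib

# A made-up DNA sequence, purely for illustration.
seq = "ACGTACGGTTAC"

# Natural abstraction: each base maps to 2 bits. The map is reversible,
# so the compressed form remains useful for a wide variety of goals
# (substring search, GC content, alignment, ...).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {v: k for k, v in BASE_TO_BITS.items()}

def encode(seq):
    """Pack a DNA string into 2 bits per base."""
    bits = 0
    for base in seq:
        bits = (bits << 2) | BASE_TO_BITS[base]
    return bits, len(seq)

def decode(bits, n):
    """Recover the original string -- no information was lost."""
    return "".join(
        BITS_TO_BASE[(bits >> (2 * (n - 1 - i))) & 0b11] for i in range(n)
    )

packed, n = encode(seq)
assert decode(packed, n) == seq  # invertible: answers arbitrary later queries

# Narrow abstraction: an MD5 hash is also a short summary of the sequence,
# but it supports only one goal (checking whether two sequences are equal).
digest = hashlib.md5(seq.encode()).hexdigest()
```

Both representations are "compressed information about the sequence", but the first is relevant to a far wider variety of objectives than the second.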
Somewhat more formally… any given territory has some Kolmogorov complexity, the length of a maximally-compressed lossless map. That’s a property of the territory alone, independent of any goal. But it’s still relevant to goal-specific lossy compression—it will very often be useful for lossy models to re-use the compression methods relevant to lossless compression.
For instance, maybe we have an ASCII text file which contains only alphanumeric and punctuation characters. We can losslessly compress that file using e.g. Huffman coding, which uses fewer bits for the characters which appear more often. Now we decide to move on to lossy encoding—but we can still use the compressed character representation found by Huffman, assuming the lossy method doesn’t change the distribution of characters too much.
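A rough sketch of that idea (the text and the "lossy step" here are my own toy choices, not anything canonical): build a Huffman code table from the original file's character frequencies, then apply a lossy transformation, dropping punctuation, and reuse the same table, since the character distribution barely changed.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table: frequent characters get shorter codes."""
    freq = Counter(text)
    # Heap entries: (frequency, unique tiebreaker, {char: code-so-far}).
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    if len(heap) == 1:  # degenerate single-symbol file
        return {ch: "0" for ch in heap[0][2]}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
        i += 1
    return heap[0][2]

text = "the quick brown fox jumps over the lazy dog, then the dog sleeps."
table = huffman_codes(text)

# Lossless: encode the whole file with the Huffman table.
lossless_bits = sum(len(table[ch]) for ch in text)

# Lossy step (a toy one): drop punctuation. The character distribution
# barely changes, so the same code table still works well.
lossy_text = text.replace(",", "").replace(".", "")
lossy_bits = sum(len(table[ch]) for ch in lossy_text)
```

The point is just that the representation found for lossless compression carries over: the lossy encoder doesn't need to rediscover which characters are common.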
An abstraction like “object permanence” would be useful for a very wide variety of goals, maybe even for any real-world goal. An abstraction like “Golgi apparatus” is useful for some goals but not others. “Lossless” is not an option in practice: our world is too rich, you can just keep digging deeper into any phenomenon until you run out of time and memory … I’m sure that a 50,000 page book could theoretically be written about earwax, and it would still leave out details which for some goals would be critical. :-)