Potentially challenging example: let’s say there’s a server that’s bottlenecked on some poorly optimized algorithm, and you optimize it to be less redundant, freeing resources that immediately gets used for a wide range of unknown but presumably good tasks.
Superficially, this seems like an optimization that increased the description length. I believe the way this is solved in the OP is that the distributions are constructed in order to assign an extra long description length to undesirable states, even if these undesirable states are naturally simpler and more homogenous than the desirable ones.
I am quite suspicious that this risks you end up with improper probability distributions. Maybe that’s OK.
Note that the “thing which is compressed” is the optimization target, which is not the whole system. So in your example, it probably wouldn’t be the code itself which is compressed, but rather the runtime of the program (holding the output constant), or maybe the runtimes of a bunch of programs on the machine, or something along those lines.
(Not sure if by “runtime” you mean “time spent running” or “memory/program state during the running time” (or something else)? I was imagining memory/program state in mine, though that is arguably a simplification since the ultimate goal is probably something to do with the business.)
Potentially challenging example: let’s say there’s a server that’s bottlenecked on some poorly optimized algorithm, and you optimize it to be less redundant, freeing resources that immediately gets used for a wide range of unknown but presumably good tasks.
Superficially, this seems like an optimization that increased the description length. I believe the way this is solved in the OP is that the distributions are constructed in order to assign an extra long description length to undesirable states, even if these undesirable states are naturally simpler and more homogenous than the desirable ones.
I am quite suspicious that this risks you end up with improper probability distributions. Maybe that’s OK.
Note that the “thing which is compressed” is the optimization target, which is not the whole system. So in your example, it probably wouldn’t be the code itself which is compressed, but rather the runtime of the program (holding the output constant), or maybe the runtimes of a bunch of programs on the machine, or something along those lines.
(Not sure if by “runtime” you mean “time spent running” or “memory/program state during the running time” (or something else)? I was imagining memory/program state in mine, though that is arguably a simplification since the ultimate goal is probably something to do with the business.)
Time spent running.