You just need to encode the index of the rule in the set, costing log(N) bits for a set of size N.
I see everyone’s assuming that some things, by their nature, always go to infinity (e.g. number of samples) while others stay constant (e.g. rule set). This is a nice convention, but not always realistic—and it certainly wasn’t mentioned in the original formulation of the problem of learning, where everything is finite. If you really want things to grow, why don’t you then allow the set itself to be specified by increasingly convoluted algorithms as N goes to infinity? Like, exponential in N? :-) Learning can still work—in theory, it’ll work just as well as a simple rule set—but you’ll have a hard time explaining that with MDL.
(If there’s some theoretical justification why this kind of outrageous bloatware won’t be as successful as simple algorithms, I’d really like to hear it...)
Johnicolas, thanks. This link and your other comments in those threads are very close to what I’m looking for, though this realization took me some time.
I see everyone’s assuming that some things, by their nature, always go to infinity (e.g. number of samples) while others stay constant (e.g. rule set). This is a nice convention, but not always realistic—and it certainly wasn’t mentioned in the original formulation of the problem of learning, where everything is finite. If you really want things to grow, why don’t you then allow the set itself to be specified by increasingly convoluted algorithms as N goes to infinity? Like, exponential in N? :-) Learning can still work—in theory, it’ll work just as well as a simple rule set—but you’ll have a hard time explaining that with MDL.
(If there’s some theoretical justification why this kind of outrageous bloatware won’t be as successful as simple algorithms, I’d really like to hear it...)
You might find what you’re looking for in Kevin T. Kelly’s work:
http://www.andrew.cmu.edu/user/kk3n/ockham/Ockham.htm
Dr. Shalizi mentioned it as an alternative formalization of Occam’s Razor.
Johnicolas, thanks. This link and your other comments in those threads are very close to what I’m looking for, though this realization took me some time.