How do you learn Solomonoff Induction?

I read about a fascinating technique described on Wikipedia as a mathematically formalized combination of Occam’s razor and the Principle of Multiple Explanations. I want to add this to my toolbox. I’m dreaming of a concise set of actionable instructions for using Solomonoff induction. I realize this wish might be overly idealistic. I’m willing to peruse a much more convoluted tome and will consider making time for any background knowledge or prerequisites involved.

If anyone knows of a good book on this, or can tell me what set of information I need to acquire, please let me know. It would be much appreciated!

• Solomonoff induction is uncomputable, so implementing it exactly is not possible even in principle. It should be understood as an ideal that you try to approximate, rather than something you can ever implement.

Solomonoff induction is just Bayesian epistemology with a prior determined by information-theoretic complexity. As an imperfect agent trying to approximate it, you will get most of your value from simply grokking Bayesian epistemology. After you’ve done that, you may want to spend some time thinking about the philosophy of science behind setting priors based on information-theoretic complexity.
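To make "Bayesian updating with a complexity-based prior" concrete, here is a minimal toy sketch. It is not Solomonoff induction: the real thing sums over all programs for a universal Turing machine, which is uncomputable. This sketch assumes a tiny, hand-picked hypothesis class (every binary pattern up to `max_len` bits, read as "this pattern repeats forever") and an illustrative prior of 4^-k for a pattern of k bits. The function names and the prior are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def hypotheses(max_len):
    """Yield (pattern, prior_weight) for every binary pattern up to max_len bits.

    Assumed toy prior: a k-bit pattern gets weight 4**-k, so shorter
    (simpler) patterns start out with more weight than longer ones.
    """
    for k in range(1, max_len + 1):
        for bits in product("01", repeat=k):
            yield "".join(bits), Fraction(1, 4 ** k)


def generates(pattern, observed):
    """True if repeating `pattern` forever reproduces `observed` as a prefix."""
    return all(
        observed[i] == pattern[i % len(pattern)] for i in range(len(observed))
    )


def predict_next_one(observed, max_len=6):
    """Posterior probability that the next bit is '1'.

    Hypotheses are deterministic, so the Bayesian update just discards
    every pattern that contradicts the data and renormalizes the rest.
    """
    total = Fraction(0)
    ones = Fraction(0)
    for pattern, weight in hypotheses(max_len):
        if not generates(pattern, observed):
            continue
        total += weight
        if pattern[len(observed) % len(pattern)] == "1":
            ones += weight
    if total == 0:
        raise ValueError("no hypothesis in the toy class fits the data")
    return ones / total


if __name__ == "__main__":
    print(predict_next_one(""))         # no data yet
    print(predict_next_one("0"))        # one observed bit
    print(predict_next_one("0101010"))  # an alternating prefix
```

With no data, the prediction is symmetric between 0 and 1. After a single observed 0, the short pattern "0" dominates the surviving hypotheses, so the predictor already leans toward another 0. That lean is the simplicity bias the prior is meant to encode.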

• The classic textbook is Li and Vitányi’s An Introduction to Kolmogorov Complexity and Its Applications.

• Solomonoff induction is uncomputable; thus, as a direct consequence, it cannot be learned. Some approximations to it that are of practical interest are Occam learning and probably approximately correct (PAC) learning. More generally, these questions are addressed by computational learning theory.
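As a rough illustration of the kind of guarantee PAC learning gives, here is a sketch of the standard sample-size bound for a finite hypothesis class when the learner outputs a hypothesis consistent with all its training examples: m >= (ln|H| + ln(1/delta)) / epsilon examples suffice for error at most epsilon with probability at least 1 - delta. The function name is made up for this example, and the bound covers only that simple setting.

```python
import math


def pac_sample_bound(num_hypotheses, epsilon, delta):
    """Number of examples sufficient for a consistent learner over a finite class.

    Implements m >= (ln|H| + ln(1/delta)) / epsilon, rounded up.
    """
    if num_hypotheses < 1:
        raise ValueError("need at least one hypothesis")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(num_hypotheses) + math.log(1 / delta)) / epsilon)


if __name__ == "__main__":
    # 2**10 hypotheses, i.e. everything describable in 10 bits.
    print(pac_sample_bound(2 ** 10, epsilon=0.1, delta=0.05))
```

The link to Occam's razor is that ln|H| grows linearly with the number of bits needed to describe a hypothesis, so restricting yourself to short descriptions directly reduces how much data you need.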

• Also, Yudkowsky’s article on Occam’s Razor describes the Occam’s razor/simplicity prior the OP was interested in.

• “convoluted tome”

My book describes a philosophy of science based on large-scale lossless data compression. It is not going to give you a toolbox for using SI; as others have observed, SI is primarily of theoretical importance, since it can’t be computed. However, different aspects of the book might help expand your worldview in this area.