Also, there’s actually a decent argument that LLMs can be viewed as approximating something like Solomonoff induction. For instance, my ARENA final project studied how well LLMs approximate Solomonoff induction, with pretty good results.
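To make the claim concrete, here’s a toy sketch (my own illustration, not code from the project) of the Solomonoff-style prediction rule being gestured at: enumerate short “programs,” keep the ones consistent with the data so far, and weight each by 2^(-length). I’m using repeating bit patterns as a stand-in program space rather than anything universal, so this illustrates the mixture-over-programs idea, not actual Solomonoff induction:

```python
from itertools import product

def solomonoff_style_predict(history: str, max_len: int = 8) -> float:
    """Toy Solomonoff-style mixture. 'Programs' here are repeating bit
    patterns (a hand-rolled, non-universal program space, purely for
    illustration). Each consistent program gets prior weight 2^(-length).
    Returns the mixture's probability that the next bit is '1'."""
    weight_1 = weight_total = 0.0
    for length in range(1, max_len + 1):
        for pattern in product("01", repeat=length):
            # The 'program' emits its pattern cyclically forever.
            emit = lambda i, p=pattern, n=length: p[i % n]
            # Keep only programs consistent with everything seen so far.
            if all(emit(i) == bit for i, bit in enumerate(history)):
                w = 2.0 ** (-length)  # shorter programs get more prior mass
                weight_total += w
                if emit(len(history)) == "1":
                    weight_1 += w
    return weight_1 / weight_total if weight_total else 0.5

# The shortest program consistent with "010101" is the cycle "01", and it
# dominates the mixture, so the predicted probability of a 1 is low (~0.04).
print(solomonoff_style_predict("010101"))
```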
Lately there has been some (still limited) empirical success pretraining transformers on program outputs, an approach inspired directly by Solomonoff induction; see “universal pretraining.”
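As a rough illustration of the data-generation side of that idea (my own sketch, not the recipe from any particular paper), you can sample random programs in a tiny language, run each with a step budget, and keep the nonempty outputs as pretraining sequences. Everything below (the Brainfuck-style language, the step limit, emitting bits) is an assumption made for the sketch:

```python
import random

OPS = "+-<>[]."  # a Brainfuck-style instruction set, chosen for the sketch

def run_program(code: str, max_steps: int = 2000, tape_len: int = 64) -> str:
    """Run a Brainfuck-style program (no input) under a step budget and
    return whatever it printed, as a bit string."""
    # Precompute matching brackets; discard unbalanced programs.
    jumps, stack = {}, []
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                return ""
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    if stack:
        return ""
    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    for _ in range(max_steps):  # step budget stands in for halting detection
        if pc >= len(code):
            break
        c = code[pc]
        if c == "+": tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-": tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ">": ptr = (ptr + 1) % tape_len
        elif c == "<": ptr = (ptr - 1) % tape_len
        elif c == ".": out.append(tape[ptr] % 2)  # emit bits, for simplicity
        elif c == "[" and tape[ptr] == 0: pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return "".join(map(str, out))

def sample_corpus(n_programs: int = 1000, max_prog_len: int = 30) -> list[str]:
    """Sample random programs; keep their outputs as pretraining sequences."""
    corpus = []
    for _ in range(n_programs):
        code = "".join(random.choices(OPS, k=random.randint(1, max_prog_len)))
        out = run_program(code)
        if out:  # most random programs emit nothing; keep the rest
            corpus.append(out)
    return corpus

print(sample_corpus(200)[:5])
```

A transformer pretrained on sequences like these is, in effect, being fit to a crude empirical sample from a program-output distribution, which is the sense in which this line of work is Solomonoff-flavored.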