Just as AIXI is an abstract mathematical formalism for (very roughly) “a program that maximizes the expected total reward received from its environment”, what is the equivalent formalism for an abstract next-token predictor?

Does this exist in the literature? What’s it called? Where can I read about it?

The predictor looks like this:

Training:
[some long series of 0s and 1s] --> [train an ML model on this data to minimize next-token prediction loss] --> [a set of final weights for the ML model]
Inference:
[some series of 0s and 1s] --> [our trained ML model] --> [probability distribution over {0, 1} for the next token]

The training data should not be random, and should be ‘correlated with the reality you want to predict.’ (The binary output of a real-world sensor at discrete time steps is a good example of the kind of data that’s suitable.)
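To make the two stages concrete, here is a minimal sketch. It is an assumption-laden stand-in, not the formalism being asked about: instead of an ML model with learned weights it uses a fixed-order context-counting model with the Krichevsky-Trofimov (add-1/2) estimator, and the names `train`, `predict`, `mean_log_loss` and the `order` parameter are hypothetical. The repeating pattern stands in for a non-random sensor stream.

```python
import math
from collections import defaultdict


def train(bits, order=2):
    """Count which bit follows each length-`order` context in the training data.

    The returned counts play the role of the "final weights" (an assumption:
    a counting model, not a gradient-trained one).
    """
    counts = defaultdict(lambda: [0, 0])
    for i in range(order, len(bits)):
        context = tuple(bits[i - order:i])
        counts[context][bits[i]] += 1
    return dict(counts)


def predict(counts, history, order=2):
    """Return (P(next=0), P(next=1)) via the KT (add-1/2) estimator."""
    context = tuple(history[-order:]) if order > 0 else ()
    n0, n1 = counts.get(context, (0, 0))
    total = n0 + n1 + 1.0
    return ((n0 + 0.5) / total, (n1 + 0.5) / total)


def mean_log_loss(counts, bits, order=2):
    """Average next-bit log loss (in bits) of the model over a sequence."""
    losses = []
    for i in range(order, len(bits)):
        p = predict(counts, bits[:i], order)[bits[i]]
        losses.append(-math.log2(p))
    return sum(losses) / len(losses)


# Hypothetical "sensor" stream: a noiseless period-3 pattern.
data = [0, 0, 1] * 200
model = train(data, order=2)
print(predict(model, [0, 0, 1, 0, 0], order=2))
print(mean_log_loss(model, data, order=2))
```

A context the model has never seen falls back to the uniform distribution (0.5, 0.5), which is what the add-1/2 prior gives with zero counts.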

Any pointers?

[Question] Is there a ‘time series forecasting’ equivalent of AIXI?

Solenoid_Entity 17 May 2023 4:35 UTC

12 points

2 comments · 1 min read

Language Models (LLMs), AI

Zac Hatfield-Dodds 17 May 2023 6:12 UTC
13 points
2
I think you’re looking for Solomonoff Induction, which is the first half of AIXI.
- Charlie Steiner 17 May 2023 7:08 UTC
  5 points
  0
  The classic textbook on it, if you want to read more, is Li and Vitányi’s An Introduction to Kolmogorov Complexity and Its Applications.
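For reference, the predictor named in the replies is usually stated as follows. The notation is an assumption, since the thread uses none: $U$ is a universal monotone Turing machine, the sum runs over minimal programs $p$ whose output begins with the string $x$, and $\ell(p)$ is the length of $p$ in bits.

```latex
M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-\ell(p)},
\qquad
M(b \mid x_{1:t}) \;=\; \frac{M(x_{1:t}\, b)}{M(x_{1:t})},
\quad b \in \{0, 1\}.
```

$M$ is a semimeasure rather than a probability measure, so $M(0 \mid x_{1:t}) + M(1 \mid x_{1:t})$ can be less than 1, and it is not computable; practical predictors can only approximate it.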
