Suppose I was generating bits with a fair coin, and for each flip I offered a Solomonoff inductor the following bet: +1 points if heads, −1.000001 points if tails.
My suspicion is that the Solomonoff inductor would take the bet an infinite number of times if I kept flipping the coin.
Rather than predicting the Solomonoff inductor and then tricking it, it may suffice to have an infinitely complex string of bits with a known ratio of 1s and 0s. Of course, maybe we’ve been too hasty in thinking that nature can provide fair coins, and the Solomonoff inductor will eventually start predicting quantum mechanics. That’s certainly what it thinks will happen.
How are you converting the inductor into a betting decision?
What you should get from the inductor—after sufficient input—is very close to equal probability for heads and tails (probability arising from the weighted sum of surviving programs).
edit: namely, you’ll have (among the craziness of that program soup) a family of programs that relay subsequent bits from the program tape to the output. So you have the family of programs that begin with “print 11010111001...”, and within those, there are equal numbers of programs that print 0 after 11010111001 and programs that print 1.
There’s also various craziness which offsets the probabilities—e.g. a family of programs that do that but start outputting zeroes (or ones) after some very large number of bits. Those aren’t eliminated until they fail to predict the future.
edit2: though, yeah, if you keep decreasing your epsilon here, it’ll accept the bet now and then. If you’re utilizing the prior history of bets it made (e.g. to adjust that small punishment down until it’s willing to bet), you effectively have an environment with hypercomputation.
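To make the “program soup” picture concrete, here is a toy sketch of my own (not a real Solomonoff inductor—it only mixes over the hypothetical “print &lt;bits&gt;” family described above, with prior weight 2^−length):

```python
# Toy illustration of the "program soup" intuition, NOT a real Solomonoff
# inductor: we mix only over hypothetical programs of the form
# "print <bits>" that copy a hard-coded bitstring to the output,
# each weighted by 2 ** -len(bits), prefix-style.
from itertools import product

def posterior_next_bit(observed: str, max_len: int = 12) -> float:
    """P(next bit = 1 | observed), mixing over all 'print <bits>' programs."""
    w1 = w_total = 0.0
    for n in range(len(observed) + 1, max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if not s.startswith(observed):
                continue  # program contradicts the data; it is eliminated
            w = 2.0 ** -n  # longer programs get exponentially less weight
            w_total += w
            if s[len(observed)] == "1":
                w1 += w
    return w1 / w_total

# For every continuation length, the programs printing 0 next and 1 next
# pair up with equal weight, so the posterior is exactly 1/2:
print(posterior_next_bit("11010111001"))  # 0.5
```

Within this restricted family the 0-continuations and 1-continuations cancel exactly, which is the pairing argument above; the full mixture also contains the “crazy” programs that offset this slightly.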
My suspicion is that the Solomonoff inductor would take the bet an infinite number of times if I kept flipping the coin.
My suspicion is that you don’t know what you are talking about. Sorry about the frankness, but your mistakes denote some serious confusion about the subject.
A Solomonoff inductor has no concept of betting and certainly can’t decide whether to take a bet. There are a few fair ways to model it as a decision agent: you can take the per-bit posterior probability as the output and reward it according to a strictly proper scoring rule, or take the predicted bits as the output and reward it by the number of correct predictions. In these cases, if the data string is generated by a computable probability distribution (e.g. a fair coin), then the Solomonoff inductor’s payoff rate will asymptotically (and quickly) converge to or above the expected payoff rate of the best computable stochastic agent.
Before delving any further into this topic, I suggest you study the literature. The book An Introduction to Kolmogorov Complexity and Its Applications by Ming Li and Paul Vitanyi provides an excellent presentation of Kolmogorov complexity and Solomonoff induction, which also covers several resource-bounded variants with practical applications.
A Solomonoff inductor has no concept of betting and certainly can’t decide whether to take a bet.
Assume that we do this the obvious way.
If you like, we can also talk about the number of times that the probability estimate differs from 0.5 by some epsilon.
(The obvious way is “if EV(bet)>0, consider the bet taken.”)
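Spelled out for the specific bet at the top of the thread (+1 on heads, −1.000001 on tails), the “obvious” rule looks like this (a sketch of my reading of it, with the payoffs hard-coded):

```python
# Sketch of the "obvious" betting rule for the bet from the top of the
# thread: win +1 on heads, lose 1.000001 on tails. Take the bet iff EV > 0.
WIN, LOSS = 1.0, 1.000001

def takes_bet(p_heads: float) -> bool:
    ev = p_heads * WIN - (1.0 - p_heads) * LOSS
    return ev > 0

# EV > 0 exactly when p_heads > LOSS / (WIN + LOSS), roughly 0.50000025,
# i.e. the inductor only has to overshoot 1/2 by about 2.5e-7 to bet.
threshold = LOSS / (WIN + LOSS)
print(threshold)
print(takes_bet(0.5))     # False
print(takes_bet(0.5001))  # True
```

So the question becomes how often the posterior exceeds 1/2 by that tiny epsilon.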
The book An Introduction to Kolmogorov Complexity and Its Applications by Ming Li and Paul Vitanyi
Thanks for the recommendation!
if the data string is generated by a computable probability distribution (e.g. a fair coin), then the Solomonoff inductor payoff rate will asymptotically (and quickly) converge to or above the expected payoff rate of the best computable stochastic agent.
What shows up in the most obvious place—p. 328 of the book—is that the mean squared error decreases faster than 1/n. And in fact the proof method isn’t strong enough to show that the absolute error decreases faster than 1/n (because the derivative of the absolute error jumps discontinuously).
There’s still the possibility that we can “cheat” by allowing our agent to take bets with positive expected value. Taking bets only cares about the absolute error.
But still, maybe there are stronger bounds out there (maybe even later in the same book) that would apply to the absolute error. And I’m definitely not disputing that I’m ignorant :)
(The obvious way is “if EV(bet)>0, consider the bet taken.”)
I don’t think it’s really obvious. The (retroactively) obvious way to extend SI into a decision agent is AIXI, which has its own optimality theorems.
What shows up in the most obvious place—pg 328 of the book—is that the mean squared error decreases faster than 1/n. And in fact the proof method isn’t strong enough to show that the absolute error decreases faster than 1/n (because the derivative of the absolute error jumps discontinuously).
The absolute error is just the square root of the squared error. If the squared error decreases faster than 1/n, then the absolute error decreases faster than 1/sqrt(n).
No. You need a hypercomputable environment to consistently beat Solomonoff induction.
The absolute error is just the square root of the squared error. If the squared error decreases faster than 1/n, then the absolute error decreases faster than 1/sqrt(n).
The point being that if your expected absolute error decreases like 1/n or slower, you make an infinite number of wrong bets.
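A toy numerical check of that gap (my own construction, not a statement about the actual Solomonoff bounds): take per-round absolute error e_n = 1/(sqrt(n)·log n). The cumulative squared error stays bounded while the cumulative absolute error keeps growing, so a finite total squared error by itself doesn’t cap the number of tiny-margin bets.

```python
# Toy check of the gap between squared and absolute error: with per-round
# absolute error e_n = 1/(sqrt(n)*log(n)), the running sum of e_n**2 is
# bounded (sum of 1/(n*log(n)**2) converges), while the running sum of
# e_n grows without bound (it behaves like 2*sqrt(n)/log(n)).
from math import log, sqrt

def cumulative_errors(n_rounds: int):
    abs_sum = sq_sum = 0.0
    for n in range(2, n_rounds + 1):  # start at 2 so log(n) > 0
        e = 1.0 / (sqrt(n) * log(n))
        abs_sum += e
        sq_sum += e * e
    return abs_sum, sq_sum

for n in (10**3, 10**5, 10**6):
    a, s = cumulative_errors(n)
    print(n, round(a, 2), round(s, 4))
# abs_sum keeps climbing as n grows; sq_sum barely moves after a while
```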
You keep framing this in terms of bets. I don’t think there is any point in continuing this discussion.