Yeah, this could be clearer. The point is that 1/(c(v+ + w−))*(v+ + w−) and 1/(c(v+ + w−))*(v− + w+) are formal sums of elements of L. These formal sums have positive coefficients which sum to 1, so they represent convex combinations. But they're not equal as formal sums; only the results of applying the convex combination operation of L are equal.
SamEisenstat(Sam Eisenstat)
We can then quotient by this relation to get a vector space
I think you’re confusing two different parts here. There’s a quotient of a vector space to get a vector space, which is done to embed $\mathcal{L}$ in a vector space. There’s also something sort of like a projectivization, which does not produce a vector space. In the method I prefer, there isn’t an explicit quotient, but instead just functions on the vector space that satisfy certain properties. (I could see being convinced to prefer the other version if it did improve the presentation.)
of differences of lotteries
Is this supposed to be the square of the space of lotteries? The square would correspond to formal differences, but actual differences would be a different space.
The point of my construction with formal differences is that differences of lotteries are not defined a priori. If we embed $\mathcal{L}$ in a vector space then we have already done what my construction is for. This is all in https://link.springer.com/article/10.1007/BF02413910 in some form, and many other places.
Happy to talk more about this.
Q5 is true if (as you assumed) the space of lotteries is the space of distributions over a finite set. (For a general convex set, you can get long-line phenomena.)
First, without proof, I’ll state the following generalization.
Theorem 1. Let be a relation on a convex space satisfying axioms A1, A2, A3, and the following additional continuity axiom. For all , the set
is open in . Then, there exists a function from to the long line such that iff .
The proof is not too different, but simpler, if we also assume A4. In particular, we no longer need the extra continuity axiom, and we get a stronger conclusion. Nate sketched part of the proof of this already, but I want to be clearer about what is stated and skip fewer steps. In particular, I’m not sure how Nate’s hypotheses rule out examples that require long-line-valued functions—maybe he’s assuming that the domain of the preference relation is a finite-dimensional simplex like I am, but none of his arguments use this explicitly.
Theorem 2. Let be a relation on a finite-dimensional simplex satisfying axioms A1-A4. Then, there is a quasiconcave function such that iff .
First, I’ll set up some definitions and a lemma. For any lotteries , , let denote the line segment
We say that preferences are increasing along a line segment if whenever , we have
We will also use open and half-open interval notation in the corresponding way.
Lemma. Let be a preference relation on a finite-dimensional simplex satisfying axioms A1-A4. Then, there are -minimal and -maximal elements in .
Proof. First, we show that there is a minimal element. Axiom A4 states that for any mixture , either or . By induction, it follows more generally that any convex combination C of finitely many elements satisfies for some . But every element is a convex combination of the vertices of , so some vertex of is -minimal.
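The induction step above can be spelled out as follows. This is only a sketch: the symbols were lost from the surrounding text, so the notation $A_i$, $p_i$, $C$, $C'$ and $\preceq$ is my own, and I am assuming that A4 says that for any mixture $pA + (1-p)B$, either $A \preceq pA + (1-p)B$ or $B \preceq pA + (1-p)B$, and that $\preceq$ is transitive.

$$
\begin{aligned}
C &= \sum_{i=1}^{n} p_i A_i, \qquad p_i \ge 0, \quad \sum_{i=1}^{n} p_i = 1, \quad p_n < 1, \\
C' &= \sum_{i=1}^{n-1} \frac{p_i}{1 - p_n} A_i, \\
C &= (1 - p_n)\, C' + p_n A_n .
\end{aligned}
$$

By A4 applied to the last line, either $A_n \preceq C$, in which case we are done, or $C' \preceq C$. In the second case, $C'$ is a convex combination of $n-1$ elements, so the induction hypothesis gives some $i < n$ with $A_i \preceq C'$, and transitivity gives $A_i \preceq C$.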
The proof that there is a maximal element is more complex. Consider the family of sets
This is a prefilter, so since is compact ( here carries the Euclidean metric), it has a cluster point . Either will be a maximal element, or we will find some other maximal element. In particular, take any . We are done if is a maximal element; otherwise, pick . By the construction of , for every , we can pick some within a distance of from B. Now, if we show that itself satisfies , it will follow that is maximal.
The idea is to pass from our sequence , with limit , to another sequence lying on a line segment with endpoint . We can use axiom A4, which is a kind of convexity, to control the preference relation on convex combinations of our points , so these are the points that we will construct along a line segment. Once we have this line segment, we can finish by using A3, which is a kind of continuity restricted to line segments, to control itself.
Let be the set of lotteries in the affine span of the set . Then, if we take some index set such that is a maximal affinely independent tuple, it follows that affinely generates . Hence, the convex combination
i.e. the barycenter of the simplex with vertices at , is in the interior of the convex hull of relative to , so we can pick some such that the -ball around relative to is contained in this simplex.
Now, we will see that every lottery in the set satisfies . For any , pick so that is in the -ball around . Since the tangent vector has length less than , the lottery
is in the -ball around , and it is in , so it is in the simplex with vertices . Then, by A4, and by hypothesis. So, applying A4 again,
Using A4 one more time, it follows that every lottery
satisfies , and hence every lottery .
Now we can finish up. If then, using A3 and the fact that , there would have to be some lottery in that is -equivalent to A, but this would contradict what we just concluded. So, , and so B is -maximal.
Proof of Theorem 2. Let be a -minimal and a -maximal element of . First, we will see that preferences are increasing on , and then we will use this fact to construct a function and show that it has the desired properties. Suppose preferences were not increasing; then, there would be such that is closer to while is closer to , and . Then, would be a convex combination of and , but by the maximality of , contradicting A4.
Now we can construct our utility function using A3; for each -class , we have , so there is some[1] such that
Then, let for all . Since preferences are increasing on , it is immediate that if , then . Conversely, if , we have two cases. If , then , so , and so . Finally, if , then by construction.
Finally, since for all we have iff , it follows immediately that is quasiconcave by A4.
[1] Nate mentions using choice in his answer, but here at least the use of choice is removable. Since is monotone on , the intersection of the -class with is a subinterval of , so we can pick based on the midpoint of that interval.
Nice, I like this proof also. Maybe there’s a clearer way to say things, but your “unrolling one step” corresponds to my going from to . We somehow need to “look two possible worlds deep”.
Here’s a simple Kripke frame proof of Payor’s lemma.
Let be a Kripke frame over our language, i.e. is a set of possible worlds, is an accessibility relation, and judges that a sentence holds in a world. Now, suppose for contradiction that but that , i.e. does not hold in some world .
A bit of De Morganing tells us that the hypothesis on is equivalent to , so . So, there is some world with such that . But again looking at our equivalent form for , we see that , so , a contradiction.
Both this proof and the proof in the post are very simple, but at least for me I feel like this proof tells me a bit more about what’s going on, or at least tells me something about what’s going on that the other doesn’t. Though in a broader sense there’s a lot I don’t know about what’s going on in modal fixed points.
Kripke frame-style semantics are helpful for thinking about lots of modal logic things. In particular, there are cool inductiony interpretations of the Gödel/Löb theorems. These are more complicated, but I’d be happy to talk about them sometime.
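The Kripke-frame argument for Payor’s lemma above can be sanity-checked by brute force on small frames. The sketch below rests on assumptions: the formulas in the comment were lost, so I read the hypothesis as the usual statement of Payor’s lemma, that □(□x → x) → x holds in every world, and the conclusion as x holding in every world. The helper names `box`, `implies` and `payor_holds_on_all_models` are made up for illustration. The check ranges over every accessibility relation on n worlds, not only transitive ones, since the two-step argument never uses transitivity.

```python
from itertools import product

def box(rel, truth, n):
    """Worlds where 'box phi' holds: phi is true at every accessible world."""
    return [all(truth[v] for v in range(n) if rel[w][v]) for w in range(n)]

def implies(a, b):
    """Pointwise implication between two truth assignments over worlds."""
    return [(not p) or q for p, q in zip(a, b)]

def payor_holds_on_all_models(n):
    """Check: on every model with n worlds, if box(box x -> x) -> x holds
    in all worlds, then x holds in all worlds."""
    for bits in product([False, True], repeat=n * n):
        # rel[w][v] is True when world v is accessible from world w.
        rel = [list(bits[w * n:(w + 1) * n]) for w in range(n)]
        for valuation in product([False, True], repeat=n):
            x = list(valuation)
            inner = implies(box(rel, x, n), x)           # box x -> x
            hyp = implies(box(rel, inner, n), x)         # box(box x -> x) -> x
            if all(hyp) and not all(x):
                return False                             # counterexample found
    return True

if __name__ == "__main__":
    print(payor_holds_on_all_models(3))
```

A finite search like this proves nothing in general, but a counterexample would have to fail exactly the “two worlds deep” step of the proof.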
Theorem. Weak HCH (and similar proposals) contain EXP.
Proof sketch: I give a strategy that H can follow to determine whether some machine that runs in time accepts. Basically, we need to answer questions of the form “Does cell have value at time ?” and “Was the head in position at time ?”, where and are bounded by some function in . Using place-system representations of and , these questions have length in , so they can be asked. Further, each question is a simple function of a constant number of other such questions about earlier times as long as , and can be answered directly in the base case .
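The recursive question structure in this sketch can be illustrated with a toy Turing machine. This is a minimal illustration under my own assumptions, not the construction itself: the machine `DELTA` (which accepts inputs with an even number of 1s), the state names, and the functions `make_questions` and `accepts` are all hypothetical. Each of `state(t)`, `head(t)` and `cell(i, t)` plays the role of one question, and each is answered from a constant number of questions about time t - 1, or directly when t = 0. The `lru_cache` memoisation only keeps the demo fast; a question-answering tree would simply re-ask the subquestions.

```python
from functools import lru_cache

BLANK = "_"
# Hypothetical toy machine: accepts iff the input contains an even number of 1s.
DELTA = {
    ("even", "0"): ("even", "0", 1),
    ("even", "1"): ("odd", "1", 1),
    ("odd", "0"): ("odd", "0", 1),
    ("odd", "1"): ("even", "1", 1),
    ("even", BLANK): ("accept", BLANK, 0),
    ("odd", BLANK): ("reject", BLANK, 0),
}
HALTING = {"accept", "reject"}

def make_questions(tape_input):
    """Return the three 'question' functions for a fixed input string."""

    @lru_cache(maxsize=None)
    def step(t):
        # Transition taken between time t and t + 1: (new state, symbol written, move).
        s = state(t)
        symbol = cell(head(t), t)
        if s in HALTING:
            return (s, symbol, 0)  # a halted machine stays put
        return DELTA[(s, symbol)]

    @lru_cache(maxsize=None)
    def state(t):
        # "What state was the machine in at time t?"
        return "even" if t == 0 else step(t - 1)[0]

    @lru_cache(maxsize=None)
    def head(t):
        # "Where was the head at time t?"
        return 0 if t == 0 else head(t - 1) + step(t - 1)[2]

    @lru_cache(maxsize=None)
    def cell(i, t):
        # "What symbol did cell i hold at time t?"
        if t == 0:
            return tape_input[i] if 0 <= i < len(tape_input) else BLANK
        if head(t - 1) == i:
            return step(t - 1)[1]  # the head wrote to this cell on the last step
        return cell(i, t - 1)      # untouched cells keep their old value

    return state, head, cell

def accepts(tape_input, time_bound):
    """Ask the top-level question: is the machine accepting at time_bound?"""
    state, _, _ = make_questions(tape_input)
    return state(time_bound) == "accept"

if __name__ == "__main__":
    print(accepts("1001", 10), accepts("1011", 10))
```

The demo uses a tiny time bound, but nothing in the recursion depends on that: the arguments `i` and `t` can be written in binary, which is what keeps the questions short even when the time bound is exponential.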
*I* think that there’s a flaw in the argument.
I could elaborate, but maybe you want to think about this more, so for now I’ll just address your remark about , where is refutable. If we assume that , then, since is false, must be false, so must be true. That is, you have proven that PA proves , that is, since is contradictory, PA proves its own inconsistency. You’re right that this is compatible with PA being consistent—PA may be consistent but prove its own inconsistency—but this should still be worrying.
This reminds me of the Discourse on Method.
[T]here is seldom so much perfection in works composed of many separate parts, upon which different hands had been employed, as in those completed by a single master. Thus it is observable that the buildings which a single architect has planned and executed, are generally more elegant and commodious than those which several have attempted to improve, by making old walls serve for purposes for which they were not originally built. Thus also, those ancient cities which, from being at first only villages, have become, in course of time, large towns, are usually but ill laid out compared with the regularly constructed towns which a professional architect has freely planned on an open plain; so that although the several buildings of the former may often equal or surpass in beauty those of the latter, yet when one observes their indiscriminate juxtaposition, there a large one and here a small, and the consequent crookedness and irregularity of the streets, one is disposed to allege that chance rather than any human will guided by reason must have led to such an arrangement. And if we consider that nevertheless there have been at all times certain officers whose duty it was to see that private buildings contributed to public ornament, the difficulty of reaching high perfection with but the materials of others to operate on, will be readily acknowledged. In the same way I fancied that those nations which, starting from a semi-barbarous state and advancing to civilization by slow degrees, have had their laws successively determined, and, as it were, forced upon them simply by experience of the hurtfulness of particular crimes and disputes, would by this process come to be possessed of less perfect institutions than those which, from the commencement of their association as communities, have followed the appointments of some wise legislator.
It is thus quite certain that the constitution of the true religion, the ordinances of which are derived from God, must be incomparably superior to that of every other. And, to speak of human affairs, I believe that the pre-eminence of Sparta was due not to the goodness of each of its laws in particular, for many of these were very strange, and even opposed to good morals, but to the circumstance that, originated by a single individual, they all tended to a single end. In the same way I thought that the sciences contained in books (such of them at least as are made up of probable reasonings, without demonstrations), composed as they are of the opinions of many different individuals massed together, are farther removed from truth than the simple inferences which a man of good sense using his natural and unprejudiced judgment draws respecting the matters of his experience. And because we have all to pass through a state of infancy to manhood, and have been of necessity, for a length of time, governed by our desires and preceptors (whose dictates were frequently conflicting, while neither perhaps always counseled us for the best), I farther concluded that it is almost impossible that our judgments can be so correct or solid as they would have been, had our reason been mature from the moment of our birth, and had we always been guided by it alone.
It is true, however, that it is not customary to pull down all the houses of a town with the single design of rebuilding them differently, and thereby rendering the streets more handsome; but it often happens that a private individual takes down his own with the view of erecting it anew, and that people are even sometimes constrained to this when their houses are in danger of falling from age, or when the foundations are insecure. With this before me by way of example, I was persuaded that it would indeed be preposterous for a private individual to think of reforming a state by fundamentally changing it throughout, and overturning it in order to set it up amended; and the same I thought was true of any similar project for reforming the body of the sciences, or the order of teaching them established in the schools: but as for the opinions which up to that time I had embraced, I thought that I could not do better than resolve at once to sweep them wholly away, that I might afterwards be in a position to admit either others more correct, or even perhaps the same when they had undergone the scrutiny of reason. I firmly believed that in this way I should much better succeed in the conduct of my life, than if I built only upon old foundations, and leaned upon principles which, in my youth, I had taken upon trust. For although I recognized various difficulties in this undertaking, these were not, however, without remedy, nor once to be compared with such as attend the slightest reformation in public affairs. Large bodies, if once overthrown, are with great difficulty set up again, or even kept erect when once seriously shaken, and the fall of such is always disastrous. 
Then if there are any imperfections in the constitutions of states (and that many such exist the diversity of constitutions is alone sufficient to assure us), custom has without doubt materially smoothed their inconveniences, and has even managed to steer altogether clear of, or insensibly corrected a number which sagacity could not have provided against with equal effect; and, in fine, the defects are almost always more tolerable than the change necessary for their removal; in the same manner that highways which wind among mountains, by being much frequented, become gradually so smooth and commodious, that it is much better to follow them than to seek a straighter path by climbing over the tops of rocks and descending to the bottoms of precipices.
...
And finally, as it is not enough, before commencing to rebuild the house in which we live, that it be pulled down, and materials and builders provided, or that we engage in the work ourselves, according to a plan which we have beforehand carefully drawn out, but as it is likewise necessary that we be furnished with some other house in which we may live commodiously during the operations, so that I might not remain irresolute in my actions, while my reason compelled me to suspend my judgement, and that I might not be prevented from living thenceforward in the greatest possible felicity, I formed a provisory code of morals, composed of three or four maxims, with which I am desirous to make you acquainted.
The first was to obey the laws and customs of my country, adhering firmly to the faith in which, by the grace of God, I had been educated from my childhood and regulating my conduct in every other matter according to the most moderate opinions, and the farthest removed from extremes, which should happen to be adopted in practice with general consent of the most judicious of those among whom I might be living. For as I had from that time begun to hold my own opinions for nought because I wished to subject them all to examination, I was convinced that I could not do better than follow in the meantime the opinions of the most judicious; and although there are some perhaps among the Persians and Chinese as judicious as among ourselves, expediency seemed to dictate that I should regulate my practice conformably to the opinions of those with whom I should have to live; and it appeared to me that, in order to ascertain the real opinions of such, I ought rather to take cognizance of what they practised than of what they said, not only because, in the corruption of our manners, there are few disposed to speak exactly as they believe, but also because very many are not aware of what it is that they really believe; for, as the act of mind by which a thing is believed is different from that by which we know that we believe it, the one act is often found without the other. Also, amid many opinions held in equal repute, I chose always the most moderate, as much for the reason that these are always the most convenient for practice, and probably the best (for all excess is generally vicious), as that, in the event of my falling into error, I might be at less distance from the truth than if, having chosen one of the extremes, it should turn out to be the other which I ought to have adopted. 
And I placed in the class of extremes especially all promises by which somewhat of our freedom is abridged; not that I disapproved of the laws which, to provide against the instability of men of feeble resolution, when what is sought to be accomplished is some good, permit engagements by vows and contracts binding the parties to persevere in it, or even, for the security of commerce, sanction similar engagements where the purpose sought to be realized is indifferent: but because I did not find anything on earth which was wholly superior to change, and because, for myself in particular, I hoped gradually to perfect my judgments, and not to suffer them to deteriorate, I would have deemed it a grave sin against good sense, if, for the reason that I approved of something at a particular time, I therefore bound myself to hold it for good at a subsequent time, when perhaps it had ceased to be so, or I had ceased to esteem it such.
My second maxim was to be as firm and resolute in my actions as I was able, and not to adhere less steadfastly to the most doubtful opinions, when once adopted, than if they had been highly certain; imitating in this the example of travelers who, when they have lost their way in a forest, ought not to wander from side to side, far less remain in one place, but proceed constantly towards the same side in as straight a line as possible, without changing their direction for slight reasons, although perhaps it might be chance alone which at first determined the selection; for in this way, if they do not exactly reach the point they desire, they will come at least in the end to some place that will probably be preferable to the middle of a forest. In the same way, since in action it frequently happens that no delay is permissible, it is very certain that, when it is not in our power to determine what is true, we ought to act according to what is most probable; and even although we should not remark a greater probability in one opinion than in another, we ought notwithstanding to choose one or the other, and afterwards consider it, in so far as it relates to practice, as no longer dubious, but manifestly true and certain, since the reason by which our choice has been determined is itself possessed of these qualities. This principle was sufficient thenceforward to rid me of all those repentings and pangs of remorse that usually disturb the consciences of such feeble and uncertain minds as, destitute of any clear and determinate principle of choice, allow themselves one day to adopt a course of action as the best, which they abandon the next, as the opposite.
(This is probably 5% of the text. There is more interesting stuff there, but it’s less relevant to this post.)
As you say, this isn’t a proof, but it wouldn’t be too surprising if this were consistent. There is some such that has a proof of length by a result of Pavel Pudlák (On the length of proofs of finitistic consistency statements in first order theories). Here I’m making the dependence on explicit, but not the dependence on . I haven’t looked at it closely, but the proof strategy in Theorems 5.4 and 5.5 suggests that will not depend on , as long as we only ask for the weaker property that will only be provable in length for sentences of length at most .
I misunderstood your proposal, but you don’t need to do this work to get what you want. You can just take each sentence as an axiom, but declare that this axiom takes symbols to invoke. This could be done by changing the notion of length of a proof, or by taking axioms and with very long.
Yeah, I had something along the lines of what Paul said in mind. I wanted not to require that the circuit implement exactly a given function, so that we could see if daemons show up in the output. It seems easier to define daemons if we can just look at input-output behaviour.
I’m having trouble thinking about what it would mean for a circuit to contain daemons such that we could hope for a proof. It would be nice if we could find a simple such definition, but it seems hard to make this intuition precise.
For example, we might say that a circuit contains daemons if it displays more optimization than necessary to solve a problem. Minimal circuits could have daemons under this definition though. Suppose that some function describes the behaviour of some powerful agent, a function is like with noise added, and our problem is to predict sufficiently well the function . Then, the simplest circuit that does well won’t bother to memorize a bunch of noise, so it will pursue the goals of the agent described by more efficiently than , and thus more efficiently than necessary.
Two minor comments. First, the bitstrings that you use do not all correspond to worlds, since, for example, implies , as is a subtheory of . This can be fixed by instead using a tree of sentences that all diagonalize against themselves. Tsvi and I used a construction in this spirit in A limit-computable, self-reflective distribution, for example.
Second, I believe that weakening #2 in this post also cannot be satisfied by any constant distribution. To sketch my reasoning, a trader can try to buy a sequence of sentences \(\phi_1, \phi_1 \wedge \phi_2, \dots\), spending \(\$2^{-n}\) on the \(n\)th sentence \(\phi_1 \wedge \dots \wedge \phi_n\). It should choose the sequence of sentences so that \(\phi_1 \wedge \dots \wedge \phi_n\) has probability at most \(2^{-n}\), and then it will make an infinite amount of money if the sentences are simultaneously true.
The way to do this is to choose each from a list of all sentences. If at any point you notice that has too high a probability, then pick a new sentence for . We can sell all the conjunctions for and get back the original amount paid by hypothesis. Then, if we can keep using sharper continuous tests of the probabilities of the sentences over time, we will settle down to a sequence with the desired property.
In order to turn this sketch into a proof, we need to be more careful about how these things are to be made continuous, but it seems routine.
I at first didn’t understand your argument for claim (2), so I wrote an alternate proof that’s a bit more obvious/careful. I now see why it works, but I’ll give my version below for anyone interested. In any case, what you really mean is the probability of deciding a sentence outside of by having it announced by nature; there may be a high probability of sentences being decided indirectly via sentences in .
Instead of choosing as you describe, pick so that the probability of sampling something in is greater than . Then, the probability of sampling something in is at least . Hence, no matter what sentences have been decided already, the probability that repeatedly sampling from selects before it selects any sentence outside of is at least
as desired.
Furthermore, this argument makes it clear that the probability distribution we converge to depends only on the set of sentences which the environment will eventually assert, not on their ordering!
Oh, I didn’t notice that aspect of things. That’s pretty cool.
A few thoughts:
I agree that the LI criterion is “pointwise” in the way that you describe, but I think that this is both pretty good and as much as could actually be asked. A single efficiently computable trader can do a lot. It can enforce coherence on a polynomially growing set of sentences, search for proofs using many different proof strategies, enforce a polynomially growing set of statistical patterns, enforce reflection properties on a polynomially large set of sentences, etc. So, eventually the market will not be exploitable on all these things simultaneously, which seems like a pretty good level of accurate beliefs to have.
On the other side of things, it would be far too strong to ask for a uniform bound of the form “for every , there is some day such that after step , no trader can multiply its wealth by a factor more than ”. This is because a trader can be hardcoded with arbitrarily many hard-to-compute facts. For every , there must eventually be a day on which the beliefs of your logical inductor assign probability less than to some true statement, at which point a trader who has that statement hardcoded can multiply its wealth by . (I can give a construction of such a sentence using self-reference if you want, but it’s also intuitively natural: just pick many mutually exclusive statements with nothing to break the symmetry.)
Thus, I wouldn’t think of traders as “mistakes”, as you do in the post. A trader can gain money on the market if the market doesn’t already know all facts that will be listed by the deductive process, but that is a very high bar. Doing well against finitely many traders is already “pretty good”.
What you can ask for regarding uniformity is for some simple function such that any trader can multiply its wealth by at most a factor . This is basically the idea of the mistake bound model in learning theory; you bound how many mistakes happen rather than when they happen. This would let you say more than the one-trader properties I mentioned in my first paragraph. In fact, has this property; is just the initial wealth of the trader. You may therefore want to do something like setting traders’ initial wealths according to some measure of complexity. Admittedly this isn’t made explicit in the paper, but there’s not much additional that needs to be done to think in this way; it’s just the combination of the individual proofs in the paper with the explicit bounds you get from the initial wealths of the traders involved.
I basically agree completely on your last few points. The traders are a model class, not an ensemble method in any substantive way, and it is just confusing to connect them to the papers on ensemble methods that the LI paper references. Also, while I use the idea of logical induction to do research that I hope will be relevant to practical algorithms, it seems unlikely that any practical algorithm will look much like an LI. For one thing, finding fixed points is really hard without some property stronger than continuity, and you’d need a pretty good reason to put it in the inner loop of anything.
Universal Prediction of Selected Bits solves the related question of what happens if the odd bits are adversarial but the even bits copy the preceding odd bits. Basically, the universal semimeasure learns to do the right thing, but the exact sense in which the result is positive is subtle and has to do with the difference between measures and semimeasures. The methods may also be relevant to the questions here, though I don’t see a proof for either question yet.
Yeah, I like tail dependence.
There’s this question of whether for logical uncertainty we should think of it more as trying to “un-update” from a more logically informed perspective rather than trying to use some logical prior that exists at the beginning of time. Maybe you’ve heard such ideas from Scott? I’m not sure if that’s the right perspective, but it’s what I’m alluding to when I say you’re introducing artificial logical uncertainty.
Yeah, the 5 and 10 problem in the post actually can be addressed using provability ideas, in a way that fits in pretty naturally with logical induction. The motivation here is to work with decision problems where you can’t prove statements for agent , utility function , action , and utility value , at least not with the amount of computing power provided, but you want to use inductive generalizations instead. That isn’t necessary in this example, so it’s more of an illustration.
To say a bit more, if you make logical inductors propositionally consistent, similarly to what is done in this post, and make them assign things that have been proven already probability 1, then they will work on the 5 and 10 problem in the post.
It would be interesting if there was more of an analogy to explore between the provability oracle setting and the inductive setting, and more ideas could be carried over from modal UDT, but it seems to me that this is a different kind of problem that will require new ideas.
I haven’t read too closely, but it looks like the equivalence relation that you’re talking about in the post sets elements that are scalar multiples of each other in equivalence. This isn’t the point of my equivalence; the stuff I wrote is all in terms of vectors, not directions. My other top-level comment discusses this.