# Introduction

This post aims to formalise the models and the model changes (model splintering) described in an earlier post. As explained there, the idea is to have a meta-model sufficiently general to directly capture the process of moving from one imperfect model to another.

## A note on infinity

For simplicity of exposition, I’ll not talk about issues of infinite sets, continuity, convergence, etc. Just assume that any infinite set that comes up is just a finite set, large enough for whatever practical purpose we need it for.

# Features, worlds, environments

A model $M$ is defined by three objects: the set $\mathcal{F}$ of features, the set $\mathcal{E}$ of environments, and a probability distribution $Q$. We’ll define the first two in this section.

## Features

Features are things that might be true or not about worlds, or might take certain values in worlds. For example, “the universe is open” is a possible feature about our universe; “the temperature is $T$” is another possible feature, but instead of returning true or false, it returns the temperature value. Adding more details, such as “the temperature is $T$ in room 3, at 12:01”, shows that features should also be able to take inputs: features are functions.

But what about “the frequency of white light”? That’s something that makes sense in many models: white light is used extensively in many contexts, and light has a frequency. The problem with that statement is that white light has multiple frequencies; so we should allow features to be, at least in some cases, multivalued functions.

To top that off, sometimes there will be no correct value for a function; “the height of white light” doesn’t mean anything. So features have to include partial functions as well.

Fortunately, multivalued and partial functions are even simpler than functions at the formal level: they are just relations. And since the sets in the relations can consist of a single element, in even more generality, a feature is a predicate on a set. We just need to know which set.

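So, formally, a feature $F$ consists of an (implicit) label defining what $F$ is (eg “open universe”, “temperature in some location”) and a set on which it is a predicate. Thus, for example, the features above could be:

1. $F_{\text{open}} = \{\text{open universe}\}$ (features which are simply true or false are predicates on a single element).

2. $F_{\text{temp}} = \mathbb{R}^{+}$.

3. $F_{\text{temp}(l,t)} = L \times \mathcal{T} \times \mathbb{R}^{+}$, for some set $L$ of locations and $\mathcal{T}$ of possible times.

4. $F_{\text{freq}} = \mathbb{R}^{+}$.

5. $F_{\text{height}} = O \times \mathbb{R}^{+}$ for $O$ a set of objects.

Note these definitions are purely syntactic, not semantic: they don’t have any meaning. Indeed, as sets, $F_{\text{temp}}$ and $F_{\text{freq}}$ are identical. Note also that there are multiple ways of defining the same things; instead of a single feature $F_{\text{temp}(l,t)}$, we could have a whole collection of features $F_{\text{temp}}^{l,t} = \mathbb{R}^{+}$, one for each $(l,t) \in L \times \mathcal{T}$.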
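As a minimal sketch of the idea that a feature is just a predicate on a set, the Python below treats a feature as a finite set of tuples and one particular assignment as a subset of it. All names (`LOCATIONS`, `TIMES`, `TEMPS`, `values_at`, `is_single_valued`) and all values are hypothetical, and the finite sets stand in for infinite ones, as per the note on infinity.

```python
from itertools import product

# Hypothetical finite stand-ins for the sets of locations, times and values.
LOCATIONS = {"room 3", "room 4"}
TIMES = {"12:01", "12:02"}
TEMPS = {290, 295, 300}

# The feature "temperature at a location and time", seen purely as a set:
# every (location, time, temperature) triple it could possibly hold of.
F_TEMP_LT = set(product(LOCATIONS, TIMES, TEMPS))

# One predicate on that set, given by the triples on which it is true.
# ("room 4", "12:02") has no value (partial function), while
# ("room 3", "12:02") has two values (multivalued function).
relation = {
    ("room 3", "12:01", 295),
    ("room 3", "12:02", 295),
    ("room 3", "12:02", 300),
    ("room 4", "12:01", 290),
}

def values_at(rel, location, time):
    """Return the set of temperatures the relation assigns to (location, time)."""
    return {temp for (loc, t, temp) in rel if loc == location and t == time}

def is_single_valued(rel):
    """True if no (location, time) input receives more than one temperature."""
    inputs = {(loc, t) for (loc, t, _) in rel}
    return all(len(values_at(rel, loc, t)) == 1 for (loc, t) in inputs)
```

An ordinary function is then the special case where every input has exactly one value; nothing in the representation itself forces that.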

## Worlds

In Abram’s orthodox case against utility functions, he talks about the Jeffrey-Bolker axioms, which allow the construction of preferences from events without needing full worlds at all.

Similarly, this formalism is not focused on worlds, but it can be useful to define the full set of worlds $\mathcal{W}$ for a model. This is simply the possible values that all features could conceivably take; so, if $\overline{\mathcal{F}}$ is the disjoint union of all features in $\mathcal{F}$ (seen as sets), the set of worlds is just $\mathcal{W} = 2^{\overline{\mathcal{F}}}$, the powerset of $\overline{\mathcal{F}}$; equivalently, the set of all functions from $\overline{\mathcal{F}}$ to $\{0,1\}$.

So $\mathcal{W}$ just consists of all things that could be conceivably distinguished by the features. If we need more discrimination than this, just add more features.
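For a small finite case, the construction of the set of worlds as the powerset of the disjoint union of the features can be sketched as follows. The two toy features and the helper names (`powerset`, `as_indicator`) are assumptions made purely for illustration.

```python
from itertools import chain, combinations

# Hypothetical toy features, each given as a labelled finite set.
features = {
    "open": {"open universe"},
    "temp": {290, 300},
}

# Disjoint union: tag every element with the label of its feature,
# so that equal elements of different features stay distinct.
disjoint_union = [
    (label, x)
    for label, xs in sorted(features.items())
    for x in sorted(xs, key=str)
]

def powerset(items):
    """Return all subsets of items, as frozensets."""
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

# The set of worlds: every subset of the disjoint union.
worlds = powerset(disjoint_union)

def as_indicator(world):
    """The same world, seen as a function from the disjoint union to {0, 1}."""
    return {element: int(element in world) for element in disjoint_union}
```

With three tagged elements there are $2^3 = 8$ worlds, including worlds where the temperature has no value and worlds where it has two; ruling those out is the job of the set of environments.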

## Environments

The set of environments $\mathcal{E}$ is a subset of $\mathcal{W}$, the set of worlds (though it need not be defined via $\mathcal{W}$; it’s a set of functions from $\overline{\mathcal{F}}$, the disjoint union of the features, to $\{0,1\}$).

Though this definition is still syntactic, it starts putting some restrictions on what the semantics could possibly be, in the spirit of this post.

For example, $\mathcal{E}$ could restrict to situations where the temperature feature is a single-valued function, while the frequency feature is allowed to be multivalued. And similarly, the height feature takes no defined values on anything in the domain of the frequency feature.

## Probability

The simplest way of defining $Q$ is as a probability distribution over $\mathcal{E}$.

This means that, if $A$ and $B$ are subsets of $\mathcal{E}$ with $Q(B) > 0$, we can define the conditional probability

$$Q(A \mid B) = \frac{Q(A \cap B)}{Q(B)}.$$
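As a minimal sketch, assuming a finite set of environments and the standard definition $Q(A \mid B) = Q(A \cap B) / Q(B)$, the conditional probability can be computed as below. The environment names and the uniform distribution are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy environments with a uniform distribution Q over them.
ENVIRONMENTS = ["e1", "e2", "e3", "e4"]
Q = {e: Fraction(1, 4) for e in ENVIRONMENTS}

def prob(dist, subset):
    """Total probability that dist assigns to a set of environments."""
    return sum((dist[e] for e in subset), Fraction(0))

def conditional(dist, a, b):
    """Q(A | B) = Q(A ∩ B) / Q(B); returns None when Q(B) = 0."""
    q_b = prob(dist, b)
    if q_b == 0:
        return None
    return prob(dist, set(a) & set(b)) / q_b

print(conditional(Q, {"e1", "e2"}, {"e2", "e3"}))  # 1/2
```

Returning `None` for a zero-probability condition is a design choice of this sketch; it marks the conditional as undefined rather than raising an error.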

Once we have such a probability distribution, then, if the set of features is rich enough, this puts a lot more restrictions on the meaning that these features could have, going a lot of the way towards semantics. For example, if $Q$ captures the ideal gas laws, then there is a specific relation between temperature, pressure, volume, and amount of substance, whatever those features are labelled.

In general, we’d want $Q$ to be expressible in a simple way from the set of features; that’s the point of having those features in the first place.

The plan for this meta-formalism is to allow transition from imperfect models to other imperfect models. So requiring that they have a probability distribution over all of $\mathcal{E}$ may be too much to ask.

In practice, all that is needed is expressions of the type $Q(A \mid B)$. And these may not be needed for all $A$, $B$. For example, to go back to the ideal gas laws, it makes perfect sense that we can deduce temperature from the other three features. But what if we just fixed the volume: can we deduce the pressure from that?

With $Q$ as a prior over $\mathcal{E}$, we can, by getting the pressure and amount of substance from the prior. But many models don’t include these priors, and there’s no reason to exclude such models.

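So, in the more general case, instead of requiring a distribution over all of $\mathcal{E}$, define a set $\mathcal{D} \subseteq 2^{\mathcal{E}} \times 2^{\mathcal{E}}$ of pairs of subsets of $\mathcal{E}$, so that, for all $(A, B) \in \mathcal{D}$, the following probability is defined:

$$Q(A \mid B).$$

To ensure consistency, we can require $Q$ to follow axioms similar to those for two-valued probabilities in appendix *IV of Popper’s “Logic of Scientific Discovery”.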
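To illustrate a model whose conditionals are only defined for some conditions, here is a minimal Python sketch based on the ideal gas law $PV = nRT$. The function name `deduce_temperature` and the dictionary keys are hypothetical; the only point is that the deduction is defined when pressure, volume and amount of substance are all given, and left undefined (`None`) otherwise, since no prior over the missing features is assumed.

```python
GAS_CONSTANT = 8.314462618  # molar gas constant, in J/(mol*K)

def deduce_temperature(known):
    """Return the temperature in kelvin from PV = nRT, or None if undefined.

    `known` maps feature labels to values in SI units. Without a prior over
    the missing features, a weaker condition determines nothing, so the
    result is left undefined rather than guessed.
    """
    required = ("pressure", "volume", "amount")
    if not all(key in known for key in required):
        return None
    return known["pressure"] * known["volume"] / (known["amount"] * GAS_CONSTANT)

full = {"pressure": 101325.0, "volume": 0.0224, "amount": 1.0}
print(deduce_temperature(full))                 # roughly 273 K
print(deduce_temperature({"volume": 0.0224}))   # None: only the volume is fixed
```

A model with a full prior would instead return a distribution in the second case; this sketch corresponds to the models that lack such priors.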

In full generality, we might need an even more general or imperfect definition of $Q$. But I’ll leave this aside for the moment, and assume the simpler case where $Q$ is a distribution over $\mathcal{E}$.

# Refinement

Here we’ll look at how one can improve a model. Obviously, one can get a better $Q$, or a more expansive $\mathcal{E}$, or a combination of these. Now, we haven’t talked much about the quality of $Q$, and we’ll leave this underdefined. Say that $Q^* \geq Q$ means that $Q^*$ is ‘at least as good as’ $Q$. The ‘at least as good’ is specified by some mix of accuracy and simplicity.

More expansive $\mathcal{E}$ means that the environment of the improvement can be bigger. But in order for something to be “bigger”, we need some identification between the two environments (which, so far, have just been defined as subsets of the powerset of feature values).

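So, let $M^* = \{\mathcal{F}^*, \mathcal{E}^*, Q^*\}$ and $M = \{\mathcal{F}, \mathcal{E}, Q\}$ be models, let $\mathcal{E}^*_0$ be a subset of $\mathcal{E}^*$, and let $q$ be a surjective map from $\mathcal{E}^*_0$ to $\mathcal{E}$ (for an $e \in \mathcal{E}$, think of $q^{-1}(e)$, the preimage of $e$, as the set of all environments in $\mathcal{E}^*$ that correspond to $e$).

We can define $Q^*_0$ on $\mathcal{E}$ in the following manner: if $A$ and $B$ are subsets of $\mathcal{E}$, define

$$Q^*_0(A \mid B) = Q^*\left(q^{-1}(A) \mid q^{-1}(B)\right).$$

Then $q$ defines $M^*$ as a refinement of $M$ if:

• $Q^*_0 \geq Q$, that is, $Q^*_0$ is at least as good as $Q$.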
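The induced distribution on the original environments can be sketched in Python for a finite toy case, assuming it is defined by $Q^*_0(A \mid B) = Q^*(q^{-1}(A) \mid q^{-1}(B))$. The environment names, the uniform $Q^*$ and the particular map $q$ are all hypothetical. The sketch does not check the ‘at least as good as’ condition, since the quality ordering is left underdefined in the text.

```python
from fractions import Fraction

# Hypothetical refined environments E* with a uniform distribution Q*.
Q_STAR = {e: Fraction(1, 4) for e in ("a1", "a2", "b1", "x")}

# q maps E*_0 = {a1, a2, b1} onto E = {a, b}; "x" lies outside E*_0.
Q_MAP = {"a1": "a", "a2": "a", "b1": "b"}

def preimage(q_map, subset):
    """q^{-1}(subset): all refined environments that q sends into subset."""
    return {e_star for e_star, e in q_map.items() if e in subset}

def prob(dist, subset):
    """Total probability that dist assigns to a set of environments."""
    return sum((dist[e] for e in subset), Fraction(0))

def q_star_0(a, b):
    """Q*_0(A | B) = Q*(q^{-1}(A) | q^{-1}(B)); None if the condition has zero mass."""
    pre_a, pre_b = preimage(Q_MAP, a), preimage(Q_MAP, b)
    denominator = prob(Q_STAR, pre_b)
    if denominator == 0:
        return None
    return prob(Q_STAR, pre_a & pre_b) / denominator

print(q_star_0({"a"}, {"a", "b"}))  # 2/3
```

Here two of the three refined environments in $\mathcal{E}^*_0$ correspond to `a`, so the induced probability of `a` is $2/3$, while `x` plays no role because it has no counterpart in $\mathcal{E}$.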

## Refinement examples

Here are some examples of different types of refinements, where $M^* = \{\mathcal{F}^*, \mathcal{E}^*, Q^*\}$ refines $M = \{\mathcal{F}, \mathcal{E}, Q\}$ via a map $q$ from $\mathcal{E}^*_0 \subseteq \mathcal{E}^*$ to $\mathcal{E}$:

1. $Q$-improvement: $\mathcal{F} = \mathcal{F}^*$, $\mathcal{E} = \mathcal{E}^*$, $Q^* \geq Q$ (eg using the sine of the angle rather than the angle itself for refraction).

2. Environment extension: $\mathcal{F} = \mathcal{F}^*$, $\mathcal{E} \subsetneq \mathcal{E}^*$, with $q$ the identity, and $Q^* = Q$ on $\mathcal{E}$ (eg moving from a training environment to a more extensive test environment).

3. Natural extension: environment extension where $Q$ is simply defined in terms of $\mathcal{F}$ on $\mathcal{E}$, and this extends to $Q^*$ on $\mathcal{E}^*$ (eg extending Newtonian mechanics from the Earth to the whole of the solar system).

4. Non-independent feature extension: $\mathcal{F} \subsetneq \mathcal{F}^*$. Let $\pi_{\mathcal{F}}$ be the map that takes an element of $\mathcal{W}^*$ and maps it to $\mathcal{W}$ by restricting[1] to features in $\mathcal{F}$. Then $q = \pi_{\mathcal{F}}$ on $\mathcal{E}^*_0$, and $Q^*_0 = Q$ (eg adding electromagnetism to Newtonian mechanics).

5. Independent feature extension: as a non-independent feature extension, but with a stronger independence condition on $Q^*$ (eg non-colliding planets modelled without rotation, changing to modelling them with (mild) rotation).

6. Feature refinement (eg moving from the ideal gas models to the van der Waals equation).

7. Feature splintering: when there is no single natural projection $\mathcal{E}^* \to \mathcal{E}$ that extends $q$ (eg Blegg and Rube generalisation, happiness and human smile coming apart, inertial mass in general relativity projected to Newtonian mechanics...)

8. Reward function splintering: no single natural extension of the reward function $R$ on $\mathcal{E}$ from $\mathcal{E}^*_0$ to all of $\mathcal{E}^*$ (any situation where a reward function, seen as a feature, splinters).

# Reward function: refactoring and splintering

## Reward function refactoring

Let $M^* = \{\mathcal{F}^*, \mathcal{E}^*, Q^*\}$ be a refinement of $M = \{\mathcal{F}, \mathcal{E}, Q\}$ (via $q$, a surjective map from $\mathcal{E}^*_0 \subseteq \mathcal{E}^*$ to $\mathcal{E}$), and let $R$ be a reward function defined on $\mathcal{E}$.

A refactoring of $R$ on $M^*$ is a reward function $R^*$ on $\mathcal{E}^*$ such that for all $e^* \in \mathcal{E}^*_0$, $R^*(e^*) = R(q(e^*))$. A natural refactoring of $R$ is a refactoring that satisfies some naturalness or simplicity properties. For example, if $R$ is the momentum of an object in $M$, and if momentum still makes sense in $M^*$, then this should be a natural refactoring.

## Reward function splintering

If there does not exist a unique natural refactoring of $R$ on $M^*$, then the refinement from $M$ to $M^*$ splinters $R$.
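A minimal finite sketch of refactoring, assuming the condition is that the new reward agrees with the old one through $q$ wherever $q$ is defined, that is $R^*(e^*) = R(q(e^*))$ on $\mathcal{E}^*_0$. The environments, rewards and the helper `is_refactoring` are hypothetical, and naturalness is deliberately not modelled.

```python
# q maps E*_0 = {a1, a2, b1} onto E = {a, b}; "x" is a new environment outside E*_0.
Q_MAP = {"a1": "a", "a2": "a", "b1": "b"}
E_STAR = {"a1", "a2", "b1", "x"}
R = {"a": 1.0, "b": 0.0}  # reward function on the original environments

def is_refactoring(r_star, r, q_map, e_star):
    """Check that r_star is defined on all of E* and agrees with r via q on E*_0."""
    if set(r_star) != set(e_star):
        return False
    return all(r_star[e] == r[q_map[e]] for e in q_map)

# Two candidates that agree on E*_0 but disagree on the new environment "x".
r_star_high = {"a1": 1.0, "a2": 1.0, "b1": 0.0, "x": 1.0}
r_star_low = {"a1": 1.0, "a2": 1.0, "b1": 0.0, "x": 0.0}

# A candidate that changes the reward on an old environment: not a refactoring.
r_star_bad = {"a1": 0.0, "a2": 1.0, "b1": 0.0, "x": 0.0}

print(is_refactoring(r_star_high, R, Q_MAP, E_STAR))  # True
print(is_refactoring(r_star_low, R, Q_MAP, E_STAR))   # True
print(is_refactoring(r_star_bad, R, Q_MAP, E_STAR))   # False
```

Both `r_star_high` and `r_star_low` are refactorings, so the constraint alone does not pin down the reward on `x`. If neither were singled out as more natural than the other, this toy refinement would splinter the reward.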

## Feature splintering

Let $R$ be the indicator function for a feature being equal to some element or in some range. If $R$ splinters in a refinement, then so does that feature.

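1. Note that $\mathcal{W}^*$ is the set of all functions from $\overline{\mathcal{F}^*}$ to $\{0,1\}$. Since $\mathcal{F} \subset \mathcal{F}^*$, $\overline{\mathcal{F}} \subset \overline{\mathcal{F}^*}$. Then we can project from $\mathcal{W}^*$ to $\mathcal{W}$ by restricting a function to its values on $\overline{\mathcal{F}}$.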