# A small example of one-step hypotheticals

Just a small ex­am­ple of what one-step hy­po­thet­i­cals might mean in the­ory and in prac­tice.

This in­volves a hu­man H pric­ing some small ob­ject:

## In theory

The hu­man H is (hy­po­thet­i­cally) asked var­i­ous ques­tions that causes it to model how much they would pay for the small vi­o­lin. Th­ese ques­tions are asked at var­i­ous times, and with var­i­ous phras­ings, and the re­sults look like this:

Here the cost­ings are all over the place, and one ob­vi­ous way of rec­on­cil­ing them would be to take the mean (in­di­cated by the large red square), which is around 5.5.

But it turns out there are ex­tra pat­terns in the hy­po­thet­i­cals and the an­swers . For ex­am­ple, there is a clear differ­ence be­tween val­u­a­tions that are done in the morn­ing, around mid­day, or in the evening. And there is a differ­ence if the vi­o­lin is (ac­cu­rately) de­scribed as “hand­made”L

There are now more op­tions for find­ing a “true” val­u­a­tion here. The ob­vi­ous first step would be to over-weight the evening val­u­a­tions, as there are less dat­a­points there (this would bring the av­er­age up a bit). Or one could figure out whether the “true” H was bet­ter rep­re­sented by their morn­ing, mid­day, or evening selves. Or whether their prefer­ence for “hand­made” ob­jects was strong and gen­uine, or a pass­ing pos­i­tive af­fect. H’s var­i­ous meta-prefer­ences would all be highly rele­vant to these choices.

# In practice

Ok, that’s what might hap­pen if the agent had the power to ask un­limited hy­po­thet­i­cal ques­tions in ar­bi­trar­ily many coun­ter­fac­tual sce­nar­ios. But that is not the case in the real world: the agent would be able to ask one, or maybe two ques­tions at most, be­fore the hu­man at­ti­tude to the vi­o­lin would change, and fur­ther data would be­come tainted.

Note that if the agent had a good brain model of H, it might be able to simu­late all the rele­vant an­swers; but we’ll as­sume for the mo­ment that the agent doesn’t have the ca­pa­bil­ities.

So, in the­ory, huge amounts of data and many rele­vant pat­terns that are meta-prefer­en­tially rele­vant. In prac­tice, two val­ues max­i­mum.

Now, if this was all that the agent had ac­cess to, then it could only use a crude guess. But if the agent was in­ves­ti­gat­ing the hu­man more thor­oughly, it could do a lot more. The pat­tern of valu­ing things differ­ently at differ­ent times of the day might show up over longer ob­ser­va­tions, as would the pat­tern of re­act­ing to key words in the de­scrip­tion. If the agent as­sumed that “valu­ing ob­jects” was not some­thing that hu­mans did ex-nihilo with each ob­ject (with each ob­ject hav­ing its own in­de­pen­dent quirky bi­ases), then it could ap­ply the tem­plate across all val­u­a­tions, and from even a sin­gle data point (along with knowl­edge of the time of day, the de­scrip­tion, etc...) come up with an es­ti­mate that was closer to the the­o­ret­i­cal one.