Poker example: (not) deducing someone’s preferences

I’ve shown that it is, theoretically, impossible to deduce the preferences and rationality of an agent by looking at their actions or policy.

That argument is valid, but feels somewhat abstract, talking about “fully anti-rational” agents and other “obviously ridiculous” preferences.

In this post, I’ll present a simple, realistic example of human behaviour where the person’s preferences cannot be deduced. The example was developed by Xavier O’rourke.

The motivations and beliefs of a poker player

In this example, Alice is playing Bob at poker, and they are on their last round. Alice might believe that Bob has a better hand, or a worse one. She may be maximising her expected income, or minimising it (why? read on to see). Even under questioning, it is impossible to distinguish Bob-worse-hand-and-Alice-maximising-income from Bob-better-hand-and-Alice-minimising-income. And, similarly, Bob-worse-hand-and-Alice-minimising-income is indistinguishable from Bob-better-hand-and-Alice-maximising-income.

If we want to be specific, imagine that we are observing Alice playing a game of Texas hold’em. Before the river (the final round of betting), everyone had folded besides Alice and Bob. Alice is holding a pair of 10s, and the board (the five cards both players have in common) contains the other two 10s.

Alice is looking at four-of-a-kind in 10s, and can only lose if Bob holds the one specific pair of cards that would give him a straight flush. For simplicity, assume Bob has raised, and Alice can only call or fold (she’s out of money to re-raise), and Bob cannot respond to either, so his actions are irrelevant. He has been playing this hand, so far, with great confidence.

Alice has two heuristic models of Bob’s hand. In the first, she assumes that the probability of Bob holding those two specific cards is very low, so she almost certainly has the better hand (call this the Bob-worse model). In the second, she notes Bob’s great confidence, and concludes he is quite likely to have that pair (the Bob-better model).

What does Alice want? Well, one obvious goal is to maximise money, with a reward linear in money. However, it’s possible that Alice doesn’t care about how much money she’s taking home: she’d prefer to take Bob home instead. Her reward is then winning Bob over, and she thinks that putting Bob in a good mood by letting him win at poker will make him more receptive to her advances later in the evening. In this case Alice wants to lose as much money as she can in this hand, so, in this specific situation, her Bob-reward is just the negative of her money reward.

Then the following table represents Alice’s action, as a function of her model and reward function:

                      maximise money    lose to Bob
    Bob-worse model   call              fold
    Bob-better model  fold              call

Thus, for example, if she wants to maximise money and believes Bob doesn’t have the winning hand, she should call. Similarly, believing Bob has the better hand while wanting to lose to him also results in Alice calling (because she believes she will lose if both players show their cards, and wants to lose). Conversely, the other two combinations (Bob-worse while wanting to lose, and Bob-better while wanting to win) result in Alice folding.

Thus observing Alice’s behaviour constrains neither her beliefs nor her preferences, though it does constrain the combination of the two.
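The four combinations can be checked with a small expected-value calculation. Here is a minimal sketch; the stakes (a pot of 100, a call cost of 10) and the probabilities Alice assigns in each model are illustrative assumptions, not numbers from the post:

```python
# Toy expected-value check of the four (belief, reward) combinations.
# Stakes and probabilities are illustrative assumptions.

def expected_money(action, p_bob_wins):
    """Alice's expected change in money, given her subjective
    probability that Bob holds the straight flush."""
    if action == "fold":
        return 0.0  # folding neither wins nor loses anything further
    # calling: win the pot if Bob's hand is worse, lose the call otherwise
    return (1 - p_bob_wins) * 100 - p_bob_wins * 10

MODELS = {"Bob-worse": 0.05, "Bob-better": 0.95}     # P(Bob has the flush)
REWARDS = {"maximise-money": +1, "lose-to-Bob": -1}  # Bob-reward = -money

def best_action(model, reward):
    sign = REWARDS[reward]
    return max(["call", "fold"],
               key=lambda a: sign * expected_money(a, MODELS[model]))

for model in MODELS:
    for reward in REWARDS:
        print(f"{model:10s} + {reward:14s} -> {best_action(model, reward)}")
```

Running this reproduces the pattern above: “call” is optimal under both Bob-worse-and-maximising and Bob-better-and-minimising, so the observed action cannot separate those two hypotheses.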

Alice’s overall actions

Can we really not figure out what Alice wants here? What if we simply observed her previous or subsequent behaviour? Or if we just asked her what she wanted?

Unfortunately, neither of these may suffice. Even if Alice is mainly a money-maximiser, it’s possible she might take Bob as a consolation prize; even if she was mainly interested in Bob, it’s possible that she previously played aggressively to win money, reasoning that Bob is more likely to savour a final victory against a worthy-seeming opponent.

As for asking Alice: well, sexual preferences and poker strategies are areas where humans are incredibly motivated to lie and mislead. Why confess to a desire, when confessing might make it impossible to achieve? Or reveal, with undue honesty, how you analyse poker hands? Conversely, honesty or double-bluffs are also options.

Thus, it is plausible that Alice’s total behaviour could be identical in the Bob-worse-and-maximising and Bob-better-and-minimising cases (and in the Bob-worse-and-minimising and Bob-better-and-maximising cases), not allowing us to distinguish these. Or at least, not allowing us to distinguish them with much confidence.

Adding more details

It might be objected that the problem above is overly narrow, and that if we expanded the space of actions, Alice’s preferences would become clear.

That is likely to be the case; but the space of beliefs and rewards was also narrow. We could allow Alice to raise as well (maybe with the goal of tricking Bob into folding); with three actions, we may be able to distinguish better between the four possible belief-reward pairs. But we can then give Alice more models of how Bob would react, increasing the space of possibilities. We could also consider more possible motives for Alice: she might have a risk-averse money-loving utility, and/or some mix of the money reward and the Bob reward.
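This trade-off can be sketched numerically. The snippet below adds a hypothetical “raise” action and a second axis of models (how likely Bob is to fold to a raise); all pot sizes, costs, and probabilities are made-up illustrative numbers, not taken from the post:

```python
from itertools import product

# Extended toy model: three actions, and two model axes
# (Bob's hand, and Bob's reaction to a raise).
# All numbers are illustrative assumptions.

def expected_money(action, p_bob_wins, p_bob_folds):
    if action == "fold":
        return 0.0
    if action == "call":
        return (1 - p_bob_wins) * 100 - p_bob_wins * 10
    # raise: Bob folds (Alice takes the pot) or calls (bigger pot, bigger risk)
    showdown = (1 - p_bob_wins) * 120 - p_bob_wins * 30
    return p_bob_folds * 100 + (1 - p_bob_folds) * showdown

HAND_MODELS = {"Bob-worse": 0.05, "Bob-better": 0.95}
REACTION_MODELS = {"Bob-folds-easily": 0.6, "Bob-calls-raises": 0.1}
REWARDS = {"maximise-money": +1, "lose-to-Bob": -1}

chosen = {}  # action -> list of hypotheses for which it is optimal
for hand, react, reward in product(HAND_MODELS, REACTION_MODELS, REWARDS):
    best = max(["call", "fold", "raise"],
               key=lambda a: REWARDS[reward] * expected_money(
                   a, HAND_MODELS[hand], REACTION_MODELS[react]))
    chosen.setdefault(best, []).append((hand, react, reward))

for action, hypotheses in chosen.items():
    print(action, "chosen by", len(hypotheses), "hypotheses")
```

With these particular numbers, “raise” is optimal under four different hypotheses and “fold” under three, while “call” happens to pin the hypothesis down uniquely: adding actions does help a little, but the enlarged model space keeps most observations ambiguous.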

It’s therefore not clear that “expanding” the problem, or making it more realistic, would make it any easier to deduce what Alice wants.
