Alexander Gietelink Oldenziel comments on Alexander Gietelink Oldenziel’s Shortform

Alexander Gietelink Oldenziel 9 Feb 2024 1:57 UTC
[This is joint thinking with Sam Eisenstat. Also thanks to Caspar Oesterheld for his thoughtful comments. Thanks to Steve Byrnes for pushing me to write this out.]

The Hyena problem in long-term planning
Logical induction is a nice framework for thinking about bounded reasoning. Very soon after logical induction was discovered, people tried to build decision makers out of logical inductors. This turns out to be difficult; one of two obstacles is:

Obstacle 1: Untaken Actions are not Observable
Caspar Oesterheld brilliantly solved this problem by using auction markets in defining his bounded rational inductive agents.
The BRIA framework is only defined for single-step (horizon length 1) decisions.
What about the much more difficult question of long-term planning? I’m going to assume you are familiar with the BRIA framework.
Setup: we have a series of decisions D_i, and rewards R_i, i=1,2,3… where rewards R_i can depend on arbitrary past decisions.
We again think of an auction market M of individual decisionmakers/ bidders.
There are a couple design choices to make here:
- bidders bid directly for an action A in a decision D_i, or bidders bid for rewards on certain days.
- total observability or partial observability.
- bidders can bid conditional on observations/ past actions or not
- when can the auction be held? That is, when is an action or reward signal definitively sold?
To do good long-term planning it should be possible for one of the bidders or a group of bidders to commit to a long-term plan, i.e. a sequence of actions. They don’t want to be outbid in the middle of their plan.
There are some problems with the auction framework: if bids for actions can’t be combined, then an outside bidder can screw up the whole plan by making a slightly higher bid for an essential part of the plan. This looks like ADHD.
How do we solve this? One way is to allow a bidder or group of bidders to bid for a whole sequence of actions for a single lump sum.
- One issue is that we also have to determine how the reward gets awarded. For instance, the reward could be very delayed. This could be solved by allowing bids for a reward signal R_i on a certain day, conditional on a series of actions.
There is now an important design choice left. Suppose a bidder $B$ owns a series of actions $A = a_1, \dots, a_k$ (some of the actions in the future, some already in the past), and another bidder $C$ makes a bid $X$ on the future actions:
- is bidder $B$ forced to sell their contract on $A$ to $C$ if the bid is high enough (higher than the original bid)?
Both versions seem problematic:
- if they don’t have to, there is an Incumbency Advantage problem. An initially rich bidder can underbid for very long horizons and use the steady trickle of cash to prevent any other bidders from ever being able to outbid them on any actions.
- Otherwise there is the Hyena problem.
The Hyena Problem
Imagine the following situation: on Day 1 the decisionmaker has a choice of actions. The highest expected value action is action a. If action a is taken, a fair coin is flipped on Day 2. On Day 3 the reward is paid out.
If the coin was heads, 15 reward is paid out.
If the coin was tails, 5 reward is paid out.
The expected value is therefore 10. This is higher (by assumption) than the expected value of the other, unnamed actions.
However, if the decisionmaker is a long-horizon BRIA with forced sales, there is a pathology.
A sensible bidder is willing to pay up to 10 utilons for the contract on the Day 3 reward conditional on action a.
However, with a forced sale mechanism, a ‘Hyena bidder’ can come along on Day 2 and ‘attempt to steal the prey’.
The Hyena bidder bids >10 for the contract if the coin comes up heads on Day 2 but doesn’t bid anything for the contract if the coin comes up tails.
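This is a problem since the expected value of action a for the sensible bidder goes down: on heads they are bought out for only slightly more than 10 instead of collecting 15, while on tails they are left holding a contract that pays 5. The sensible bidder might therefore no longer bid for the action that maximizes expected value for the BRIA. The Hyena bidder screws up the credit allocation.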
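To make the arithmetic concrete, here is a minimal sketch of the example above. It assumes a forced sale happens at exactly the Hyena's bid price, and the bid of 10.5 is a hypothetical value chosen only to be "just above 10"; the function names are illustrative and not part of the BRIA framework.

```python
from fractions import Fraction

# Numbers from the example: fair coin, 15 on heads, 5 on tails.
HEADS_REWARD = Fraction(15)
TAILS_REWARD = Fraction(5)
P_HEADS = Fraction(1, 2)


def contract_value_without_hyena() -> Fraction:
    """Expected Day 3 reward if the owner always keeps the contract."""
    return P_HEADS * HEADS_REWARD + (1 - P_HEADS) * TAILS_REWARD


def contract_value_with_hyena(hyena_bid: Fraction) -> Fraction:
    """Expected proceeds for the original owner under forced sale.

    Assumption: on heads the owner is forced to sell at `hyena_bid`;
    on tails nobody bids, so the owner keeps the contract and collects 5.
    """
    return P_HEADS * hyena_bid + (1 - P_HEADS) * TAILS_REWARD


def hyena_expected_profit(hyena_bid: Fraction) -> Fraction:
    """The Hyena only pays, and only collects, when the coin lands heads."""
    return P_HEADS * (HEADS_REWARD - hyena_bid)


if __name__ == "__main__":
    bid = Fraction(21, 2)  # hypothetical Hyena bid of 10.5
    print("value without Hyena:", contract_value_without_hyena())
    print("value with Hyena:   ", contract_value_with_hyena(bid))
    print("Hyena profit:       ", hyena_expected_profit(bid))
```

Under these assumptions the contract is worth 10 without the Hyena but only 7.75 with it, and the missing 2.25 is exactly the Hyena's expected profit. A sensible bidder who anticipates this should bid at most about 7.5 to 7.75 for action a, which may be less than the bid for some other action with lower true expected value.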
Some thoughts:
- if the sensible bidder is able to make bids conditional on the outcome of the coin flip, that prevents the Hyena bidder from profiting. This is a bit weird though, because it would mean that the sensible bidder must carry around lots of extraneous information instead of just caring about expected value.
- perhaps this can be alleviated by having some sort of ‘neo-cortex’: a separate logical induction market that is incentivized to have accurate beliefs. This is difficult to get right: the prediction market needs to be incentivized to be accurate on beliefs that are actually action-relevant, not random beliefs. If the prediction market and the auction market are connected too tightly, you run the risk of getting into the old problems of logical inductor decision makers (they underexplore, since untaken actions are not observed).
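The first thought above (conditional bids) can be illustrated with a small sketch. It rests on an assumed mechanism that the post does not specify: the owner states a bid per coin outcome, and a Hyena can force a sale in a given outcome only by bidding a small `step` above the owner's bid for that outcome. The names `best_hyena_profit`, `flat_bid` and `conditional_bid` are hypothetical.

```python
from fractions import Fraction

# Numbers from the example: fair coin, 15 on heads, 5 on tails.
REWARD = {"heads": Fraction(15), "tails": Fraction(5)}
PROB = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}


def best_hyena_profit(owner_bid: dict, step: Fraction = Fraction(1, 100)) -> Fraction:
    """Expected profit of a Hyena that acts optimally in each coin state.

    Assumption: in each state the Hyena either stays out (profit 0) or
    forces a sale by paying `step` more than the owner's bid for that state.
    """
    total = Fraction(0)
    for state, p in PROB.items():
        takeover_price = owner_bid[state] + step
        profit_if_takeover = REWARD[state] - takeover_price
        # The Hyena only takes over when that is profitable.
        total += p * max(Fraction(0), profit_if_takeover)
    return total


# Owner who only tracks expected value: the same bid in both states.
flat_bid = {"heads": Fraction(10), "tails": Fraction(10)}
# Owner who bids conditional on the coin flip.
conditional_bid = {"heads": Fraction(15), "tails": Fraction(5)}

if __name__ == "__main__":
    print("Hyena profit vs flat bid:       ", best_hyena_profit(flat_bid))
    print("Hyena profit vs conditional bid:", best_hyena_profit(conditional_bid))
```

Against the flat bid the Hyena has positive expected profit, while against the conditional bid every takeover loses money, so the Hyena stays out. The cost is the one noted above: the owner has to track the coin outcome rather than just the expected value.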