Dishonest Update Reporting

Link post

Related to: Asymmetric Justice, Privacy, Blackmail

Previously (Paul Christiano): Epistemic Incentives and Sluggish Updating

The starting context here is the problem of what Paul calls sluggish updating. Bob is asked to predict the probability of a recession this summer. He said 75% in January, and now believes 50% in February. What to do? Paul sees Bob as thinking roughly this:

If I stick to my guns with 75%, then I still have a 50-50 chance of looking smarter than Alice when a recession occurs. If I waffle and say 50%, then I won’t get any credit even if my initial prediction was good. Of course if I stick with 75% now and only go down to 50% later then I’ll get dinged for making a bad prediction right now—but that’s little worse than what people will think of me immediately if I waffle.

Paul concludes that this is likely:

Bob’s optimal strategy depends on exactly how people are evaluating him. If they care exclusively about evaluating his performance in January then he should always stick with his original guess of 75%. If they care exclusively about evaluating his performance in February then he should go straight to 50%. In the more realistic case where they care about both, his optimal strategy is somewhere in between. He might update to 70% this week.

This results in a pattern of “sluggish” updating in a predictable direction: once I see Bob adjust his probability from 75% down to 70%, I expect that his “real” estimate is lower still. In expectation, his probability is going to keep going down in subsequent months. (Though it’s not a sure thing—the whole point of Bob’s behavior is to hold out hope that his original estimate will turn out to be reasonable and he can save face.)

This isn’t ‘sluggish’ updating, of the type we talk about when we discuss the Aumann Agreement Theorem and its claim that rational parties can’t agree to disagree. It’s dishonest update reporting. As Paul says, explicitly.

I think this kind of sluggish updating is quite common—if I see Bob assign 70% probability to something and Alice assign 50% probability, I expect their probabilities to gradually inch towards one another rather than making a big jump. (If Alice and Bob were epistemically rational and honest, their probabilities would immediately take big enough jumps that we wouldn’t be able to predict in advance who will end up with the higher number. Needless to say, this is not what happens!)
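
A minimal simulation of that point, my own sketch rather than anything from either post: for an honest reporter, today’s number is the expected value of tomorrow’s, so seeing a downward revision tells you nothing about the direction of the next move; for a reporter who only moves partway toward their true belief, the next move is predictably further down. The `lag` weight and the toy random walk are assumptions made up for the illustration.

```python
import random

def conditional_drift(lag, trials=200_000):
    """Mean follow-up move of the *reported* probability, conditioned on the
    first reported move being downward. lag = 0 is an honest reporter; higher
    lag keeps more weight on the previous report (sluggishness)."""
    total, count = 0.0, 0
    for _ in range(trials):
        p0 = 0.75                              # true belief and report in month 0
        p1 = p0 + random.uniform(-0.2, 0.2)    # true posterior after month 1's news
        p2 = p1 + random.uniform(-0.2, 0.2)    # true posterior after month 2's news
        r1 = lag * p0 + (1 - lag) * p1         # reported probability, month 1
        r2 = lag * r1 + (1 - lag) * p2         # reported probability, month 2
        if r1 < p0:                            # observer sees a downward revision
            total += r2 - r1
            count += 1
    return total / count

print("honest   E[next move | revised down]: %+.4f" % conditional_drift(lag=0.0))
print("sluggish E[next move | revised down]: %+.4f" % conditional_drift(lag=0.6))
```

The honest reporter’s conditional drift comes out near zero; the sluggish reporter’s is reliably negative, which is exactly the predictability Paul describes.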

Unfortunately, I think that sluggish updating isn’t even the worst case for humans. It’s quite common for Bob to double down with his 75%, only changing his mind at the last defensible moment. This is less easily noticed, but is even more epistemically costly.

When Paul speaks of Bob’s ‘optimal strategy’ he does not include a cost to lying, or a cost to others getting inaccurate information.

This is a world where all one cares about is how one is evaluated, and lying and deceiving others is free as long as you’re not caught. You’ll get exactly what you incentivize.

What that definitely won’t get you is accurate probability estimates. And what it costs you is a lot more than just the probability estimates.

The only way to get accurate probability estimates from Bob-who-is-happy-to-strategically-lie is to use a mathematical formula to reward Bob based on his log likelihood score. Or to have Bob bet in a prediction market, or another similar robust method. And then use that as the entirety of how one evaluates Bob. If human judgment is allowed in the process, the value of that will overwhelm any desire on Bob’s part to be precise or properly update.
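
As a concrete sketch of the first of those options (mine, not Paul’s): under the logarithmic scoring rule, Bob’s expected score is highest exactly when he reports his real probability, so any strategic shading costs him in expectation. The specific numbers below are illustrative.

```python
import math

def log_score(reported: float, outcome: bool) -> float:
    """Logarithmic scoring rule: reward log of the probability assigned to what happened."""
    return math.log(reported if outcome else 1.0 - reported)

def expected_score(reported: float, true_p: float) -> float:
    """Bob's expected score if his real belief is true_p but he reports `reported`."""
    return true_p * log_score(reported, True) + (1 - true_p) * log_score(reported, False)

true_belief = 0.50   # Bob's actual February estimate
for reported in (0.50, 0.60, 0.70, 0.75):
    print(f"report {reported:.2f}: expected score {expected_score(reported, true_belief):+.4f}")
# The maximum sits at reported == true_belief; sticking near 75% is strictly worse.
```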

Since Bob is almost certainly in a human context where humans are evaluating him based on human judgments, that means all is mostly lost.

As Paul notes, consistency is crucial in how one is evaluated. Even bigger is avoiding mistakes.

Given the asymmetric justice of punishing mistakes and inconsistency that can be proven and identified, the strategic actor must seek cognitive privacy. The more others know about the path of your beliefs, the easier it will be for them to spot an inconsistency or a mistake. It’s hard enough to give a reasonable answer once, but updating in a way that can never be shown to have made a mistake or been inconsistent? Impossible.

Mistakes and inconsistencies are the bad things one must avoid getting docked points for.

Thus, Bob’s full strategy, in addition to choosing probabilities that sound best and give the best cost/benefit payoffs in human intuitive evaluations of performance, is to avoid making any clear statements of any kind. When he must do so, he will do his best to be able to deny having done so. Bob will seek to destroy the historical record of his predictions and statements, and their path. And also prevent the creation of any common knowledge, at all. Any knowledge of the past situation, or the present outcome, could be shown to not be consistent with what Bob said, or what we believe Bob said, or what we think Bob implied. And so on.

Bob’s optimal strategy is full anti-epistemology. He is opposed to knowledge.

In that context, Paul’s suggested solutions seem highly unlikely to work.

His first suggestion is to exclude information – to judge Bob only by the aggregation of all of Bob’s predictions, and ignore any changes. Not only does this throw away vital information, it also isn’t realistic. Even if it were realistic for some people, others would still punish Bob for updating.

Paul’s second suggestion is to make predictions about others’ belief changes, which he himself notes ‘literally wouldn’t work.’ And that it is ‘a recipe for epistemic catastrophe.’ The whole thing is convoluted and unnatural at best.

Paul’s third and final suggestion is social disapproval of sluggish updating. As he notes, this twists social incentives potentially in good ways but likely in ways that make things worse:

Having noticed that sluggish updating is a thing, it’s tempting to respond by just penalizing people when they seem to update sluggishly. I think that’s a problematic response:

  • I think the rational reaction to norms against sluggish updating may often be no updating at all, which is much worse.

  • In general combating non-epistemic incentives with other non-epistemic incentives seems like digging yourself into a hole, and can only work if you balance everything perfectly. It feels much safer to just try to remove the non-epistemic incentives that were causing the problem in the first place.

  • Sluggish updating isn’t easy to detect in any given case. For example, suppose that Bob expects an event to happen, and if it does he expects to get a positive sign on any given day with 1% probability. Then if the event doesn’t happen his probability will decay exponentially towards zero, falling in half every ~70 days. This will look like sluggish updating.
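
To spell out the arithmetic behind that last quoted example (my own worked sketch using Paul’s stated numbers): each quiet day is 99% likely if the event is coming and certain if it isn’t, so every day without a sign multiplies the odds by 0.99, and the odds halve roughly every ln(2)/0.01 ≈ 69 days. Perfectly honest Bayesian updating thus produces exactly the slow, one-directional drift that a sluggishness penalty would punish.

```python
import math

def posterior(prior: float, quiet_days: int, daily_sign_prob: float = 0.01) -> float:
    """Bayesian posterior that the event is still coming after `quiet_days` days
    with no confirming sign (sign appears w.p. 1%/day iff the event is coming)."""
    odds = prior / (1 - prior)
    odds *= (1 - daily_sign_prob) ** quiet_days   # likelihood ratio of the silence
    return odds / (1 + odds)

half_life = math.log(2) / -math.log(1 - 0.01)     # ~69 days for the odds to halve
print(f"odds half-life: {half_life:.0f} days")
for days in (0, 30, 60, 90, 120):
    print(f"day {days:3d}: P(event) = {posterior(0.75, days):.2f}")
```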

Bob already isn’t excited about updating. He’d prefer to not update at all. He’s upset about having had to give that 75% answer, because now if there’s new information (including others’ opinions) he can’t keep saying ‘probably’ and has to give a new number, again giving others information to use as ammunition against him.

The reason he updated visibly, at all, was that not updating would have been inconsistent or otherwise punished. Punish updates for being too small on top of already looking bad for changing at all, and the chance you get the incentives right here is almost zero. Bob will game the system, one way or another. And now, you won’t know how Bob is doing it. Before, you could know that Bob moving from 75% to 70% meant going to something lower, perhaps 50%. Predictable bad calibration is much easier to fix. Twist things into knots and there’s no way to tell.

Meanwhile, Bob is going to reliably get evaluated as smarter and more capable than Alice, who for reasons of principle is going around reporting her probability estimates accurately. Those observing might even punish Alice further, as someone who does not know how the game is played, and would be a poor ally.

The best we can do, under such circumstances, if we want insight from Bob, is to do our best to make Bob believe we will reward him for updating correctly and reporting that update honestly, then consider Bob’s incentives, biases and instincts, and attempt as best we can to back out what Bob actually believes.
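
One way to make that backing out concrete, as a sketch under an assumed model rather than anything prescribed here: if we guess that Bob keeps some fixed fraction of the weight on his previous number, we can invert that rule to estimate the belief behind a reported move. The `lag` value is a pure guess about Bob and is the weakest link.

```python
def infer_true_belief(prev_report: float, new_report: float, lag: float) -> float:
    """Invert the assumed reporting rule: new_report = lag * prev_report + (1 - lag) * true.
    lag (how much weight Bob keeps on his previous number) is a guess about Bob,
    not something we can observe directly."""
    true_belief = (new_report - lag * prev_report) / (1 - lag)
    return min(1.0, max(0.0, true_belief))

# Bob moves from 75% to 70%; if we guess he keeps 80% of the weight on his old
# number, the estimate behind the move is around 50%.
print(f"inferred belief: {infer_true_belief(0.75, 0.70, lag=0.8):.2f}")
```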

As Paul notes, we can try to combat non-epistemic incentives with equal and opposite other non-epistemic incentives, but going deep on that generally only makes things more complex and rewards more attention to our procedures and how to trick us, giving Bob an even bigger advantage over Alice.

A last-ditch effort would be to give Bob sufficient skin in the game. If Bob directly benefits enough from us having accurate models, Bob might report more accurately. But outside of very small groups, there isn’t enough skin in the game to go around. And that still assumes Bob thinks the way for the group to succeed is to be honest and create accurate maps. Whereas most people like Bob do not think that is how winners behave. Certainly not with vague things that don’t have direct physical consequences, like probability estimates.

What can be done about this?

Unless we care enough, very little. We lost early. We lost on the meta level. We didn’t Play in Hard Mode.

We accepted that Bob was optimizing for how Bob was evaluated, rather than Bob optimizing for accuracy. But we didn’t evaluate Bob on that basis. We didn’t place the virtues of honesty and truth-seeking above the virtue of looking good sufficiently to make Bob’s ‘look good’ procedure evolve into ‘be honest and seek truth.’ We didn’t work to instill epistemic virtues in Bob, or select for Bobs with or seeking those virtues.

We didn’t reform the local culture.

And we didn’t fire Bob the moment we noticed.

Game over.

I once worked for a financial firm that made this priority clear. On the very first day. You need to always be ready to explain and work to improve your reasoning. If we catch you lying, about anything at all, ever, including a probability estimate, that’s it. You’re fired. Period.

It didn’t solve all our problems. More subtle distortionary dynamics remained, and some evolved as reactions to the local virtues, as they always do. For these and other reasons, which I will not be getting into here or in the comments, it ended up not being a good place for me. Those topics are for another day.

But they sure as hell didn’t have to worry about the likes of Bob.