Gems from the Wiki: Paranoid Debating

During the LessWrong 1.0 Wiki Import we (the LessWrong team) discovered a number of great articles that most of the LessWrong team hadn’t read before. Since we expect that many others have not read these either, we are creating a series of the best posts from the Wiki to help give those hidden gems some more time to shine.

Most of the work for this post was done by freyley and JenniferRM, whom I’ve added as coauthors to this post. Wiki edits were also made by all of the following: BJR, PeerInfinity, Admin, PotatoDumplings, Vladimir Nesov, Zack M. Davis, Freyley and Grognor. Thank you all for your contributions!


Paranoid Debating is a variant of The Aumann Game in which one player purposefully subverts the group estimate. As in The Aumann Game, the activity consists of a group jointly producing a confidence interval for an unknown but verifiable quantity, which is then scored for accuracy and calibration. One individual is designated the spokesperson, who is responsible for choosing the final estimate. However, before the activity begins, one individual is secretly assigned the role of misleading the other members. The worse the final estimate is, the higher the deceiver scores. The activity is intended to teach accurate estimation, proper agreement techniques, and recognition of deception.

A typical subject for the game might be “How much maize is produced in Mexico annually?”

Rules

  • Select player roles. In person, each player receives or selects a card from a pack of role cards. Build the pack with one card per player: for 4-6 players, use 1 red card and make the rest black (so 4 players get 3 black cards and 1 red); for 7-9 players, use 2 red cards. Some variants include a role named the Advocate, which you can designate one of the black cards to represent.
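
When playing online or without physical cards, the pack described above can be built in a few lines. This is only a minimal sketch: the function names `build_role_deck` and `deal_roles`, and the string labels for the cards, are made up for illustration and do not come from the wiki or its scoring script.

```python
import random

def build_role_deck(num_players, with_advocate=False):
    """Return one role card per player, following the red/black counts in the rules."""
    if not 4 <= num_players <= 9:
        raise ValueError("the rules above only cover 4 to 9 players")
    num_red = 1 if num_players <= 6 else 2
    deck = ["red"] * num_red + ["black"] * (num_players - num_red)
    if with_advocate:
        # One of the black cards is designated to represent the Advocate.
        deck[deck.index("black")] = "advocate"
    return deck

def deal_roles(players, with_advocate=False, rng=None):
    """Shuffle the pack and hand one card to each player."""
    rng = rng or random.Random()
    deck = build_role_deck(len(players), with_advocate)
    rng.shuffle(deck)
    return dict(zip(players, deck))

if __name__ == "__main__":
    print(deal_roles(["Ann", "Bob", "Cat", "Dan", "Eve"], with_advocate=True))
```

Passing a seeded `random.Random` as `rng` makes a deal reproducible, which can help when replaying a recorded game.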

Sim­plest variant

  • Each player re­ceives a role. No ad­vo­cate.

  • A ques­tion is asked.

  • Play­ers dis­cuss for 20 min­utes, then write down their in­di­vi­d­ual re­sponse on a card.

  • The an­swer is re­searched.

  • Scores are as­signed.

Ad­vo­cate var­i­ant, #1

  • Each player re­ceives a role. One ad­vo­cate in the deck. The player who re­ceives the Ad­vo­cate dis­plays it to the group.

  • A ques­tion is asked.

  • Play­ers dis­cuss for 20 min­utes, at­tempt­ing to con­vince the ad­vo­cate. The ad­vo­cate writes down their re­sponse on a card. This is the group’s an­swer.

  • The an­swer is re­searched, scores are as­signed.

Ad­vo­cate var­i­ant, #2

  • Each player re­ceives a role. One ad­vo­cate in the deck. No player may dis­play their card.

  • A ques­tion is asked.

  • Play­ers dis­cuss for 20 min­utes. Any­one may say any­thing. At the end, the ad­vo­cate writes down what they think the group’s re­sponse is on a card, and the group is scored for this.

  • An­swer re­searched, scores as­signed.

Vari­a­tion-by-ar­gu­ment variant

  • Each player re­ceives a role. No ad­vo­cate. No player may dis­play their card.

  • A ques­tion is asked.

  • Play­ers have 2-5 min­utes to write down their ini­tial, in­di­vi­d­ual es­ti­mate.

  • Play­ers dis­cuss for 20 min­utes. Any­one may say any­thing. At the end, play­ers write their re­vised es­ti­mates on their card.

  • Players are scored based on their delta: the further you move toward the correct answer from your initial estimate, the more points you earn.
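
The rules do not pin down an exact formula for the delta. One simple reading, offered here purely as an assumption, is to score the reduction in absolute error between the initial and the revised estimate. The name `delta_score` is hypothetical.

```python
def delta_score(initial, revised, truth):
    """Reduction in absolute error from the initial to the revised estimate.

    Positive when the player moved toward the correct answer,
    negative when the player moved away from it.
    This formula is an assumption, not the wiki's official rule.
    """
    return abs(initial - truth) - abs(revised - truth)

if __name__ == "__main__":
    # Made-up numbers: the true answer is 20.
    print(delta_score(initial=10, revised=18, truth=20))  # moved closer
    print(delta_score(initial=18, revised=10, truth=20))  # moved away
```

For quantities that span orders of magnitude, as Fermi-style questions often do, a group might prefer to apply the same idea to the logarithms of the estimates; that too would be a house rule rather than anything specified here.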

South­ern Cal­ifor­nia Var­i­ant #1

At the February 2011 Southern California LW Meetup we tried playing the game. For questions we bought a copy of Wits & Wagers (which has trivia questions with numerical answers) and looked through the cards for questions about substantive topics where Fermi estimates seemed useful. The speaker/advocate was chosen on a rotating basis so that everyone got at least one chance to play that role, and cards were dealt from a deck of playing cards to everyone else. A red card means you are trying to make the group deliver a bad answer; a black card means you are trying to make the group deliver a good answer. Because the cards are dealt at random, the number of people to be suspicious of is itself an unknown parameter, which leads to funny outcomes and interesting coordination problems. Scoring used the experimental scoring code, which is intended to assign the most credit to small error bars around high-confidence correct answers.
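
To get a feel for how uncertain the number of deceivers is in this variant, one can compute the chance of each possible count of red cards. The sketch below assumes a standard 52-card deck with 26 red and 26 black cards and no jokers, which the write-up does not actually specify; `red_count_distribution` is a hypothetical helper name.

```python
from math import comb

def red_count_distribution(num_dealt, reds=26, blacks=26):
    """Probability of each possible number of red cards among `num_dealt` cards.

    Cards are dealt without replacement, so the count of red cards
    follows a hypergeometric distribution.
    """
    total = comb(reds + blacks, num_dealt)
    return {
        k: comb(reds, k) * comb(blacks, num_dealt - k) / total
        for k in range(num_dealt + 1)
    }

if __name__ == "__main__":
    # Example: six people at the table, one of them the speaker/advocate,
    # so five cards are dealt to everyone else.
    for k, p in red_count_distribution(5).items():
        print(f"{k} red cards: {p:.3f}")
```

Under these assumptions every count from zero to the full table is possible, so players cannot rule out either a fully honest group or a group made up entirely of deceivers.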

Questions

It’s really easy to ask a question that then turns out to be very difficult to answer. For example, the question “How many miles of railroad are there in Africa?” is somewhat difficult to answer. Walking through the CIA World Fact Book one country at a time, we arrived at an answer in the range of 48,000-49,000. However, in cross-checking that information, we discovered that Uganda has only 125 miles of active railroad, while the Fact Book lists 1200 km. It seems likely, therefore, that the Fact Book figures include some non-active railroad, and that our total estimate is thus too high.

This section is here to list good and bad questions, along with resources for finding questions or for answering them unusually easily. If listing an answer, please make the text of the answer white so people can use it if they want.
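
The Uganda cross-check mixes miles and kilometres, so it helps to put both figures in the same unit before comparing them. A small sketch of that arithmetic, using only the two numbers quoted above (the variable names are illustrative):

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

def km_to_miles(km):
    """Convert a distance in kilometres to miles."""
    return km / KM_PER_MILE

listed_miles = km_to_miles(1200)  # Fact Book figure for Uganda, in km
active_miles = 125                # active railroad found when cross-checking
ratio = listed_miles / active_miles

if __name__ == "__main__":
    print(f"Listed: about {listed_miles:.0f} miles, active: {active_miles} miles")
    print(f"The listed figure is roughly {ratio:.1f} times the active figure")
```

A gap of this size for a single country is what suggests the per-country figures count more than active track.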

Scoring

A not-so-trivial inconvenience to playing the game is figuring out how to score it properly.

To make this easier there is now a tentative file format for representing a game of paranoid debate and a Python script for scoring games represented in this format. If you’d like to download or edit this software, check out this GitHub project. Please note that the game format and the code are very likely to evolve to remove bugs and to support whatever sort of play turns out to be the most fun and/or educational.
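
For groups that want to score by hand before looking at the script, one standard option is the interval score (sometimes called the Winkler score), which rewards narrow intervals that contain the true value and penalizes misses more heavily at higher stated confidence. This is a swapped-in textbook rule, not the method used by the script mentioned above, and `interval_score` is a hypothetical name. Lower scores are better.

```python
def interval_score(lower, upper, truth, confidence):
    """Interval (Winkler) score for a central confidence interval; lower is better.

    The score is the interval width, plus a penalty proportional to how far
    the truth lies outside the interval, scaled by 2 / (1 - confidence).
    NOTE: a standard scoring rule used for illustration, not the wiki script's rule.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    alpha = 1 - confidence
    score = upper - lower
    if truth < lower:
        score += (2 / alpha) * (lower - truth)
    elif truth > upper:
        score += (2 / alpha) * (truth - upper)
    return score

if __name__ == "__main__":
    # Made-up example: a 90% interval of [10, 20].
    print(interval_score(10, 20, truth=15, confidence=0.9))  # truth inside
    print(interval_score(10, 20, truth=25, confidence=0.9))  # truth outside
```

Because the miss penalty grows as the stated confidence rises, claiming high confidence only pays off when the interval really does contain the answer.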
