# Prisoner’s Dilemma (with visible source code) Tournament

After the iter­ated pris­oner’s dilemma tour­na­ment or­ga­nized by prase two years ago, there was dis­cus­sion of run­ning tour­na­ments for sev­eral var­i­ants, in­clud­ing one in which two play­ers sub­mit pro­grams, each of which are given the source code of the other player’s pro­gram, and out­puts ei­ther “co­op­er­ate” or “defect”. How­ever, as far as I know, no such tour­na­ment has been run un­til now.

Here’s how it’s go­ing to work: Each player will sub­mit a file con­tain­ing a sin­gle Scheme lambda-func­tion. The func­tion should take one in­put. Your pro­gram will play ex­actly one round against each other pro­gram sub­mit­ted (not in­clud­ing it­self). In each round, two pro­grams will be run, each given the source code of the other as in­put, and will be ex­pected to re­turn ei­ther of the sym­bols “C” or “D” (for “co­op­er­ate” and “defect”, re­spec­tively). The pro­grams will re­ceive points based on the fol­low­ing pay­off ma­trix:

$\begin{array}{cccc} & C & D & other\\ C & (2,\,2) & (0,\,3) & (0,\,2)\\ D & (3,\,0) & (1,\,1) & (1,\,0)\\ other & (2,\,0) & (0,\,1) & (0,\,0) \end{array}$

“Other” in­cludes any re­sult other than re­turn­ing “C” or “D”, in­clud­ing failing to ter­mi­nate, throw­ing an ex­cep­tion, and even re­turn­ing the string “Co­op­er­ate”. No­tice that “Other” re­sults in a worst-of-both-wor­lds sce­nario where you get the same pay­off as you would have if you co­op­er­ated, but the other player gets the same pay­off as if you had defected. This is an at­tempt to en­sure that no one ever has in­cen­tive for their pro­gram to fail to run prop­erly, or to trick an­other pro­gram into do­ing so.

Your score is the sum of the num­ber of points you earn in each round. The player with the high­est score wins the tour­na­ment. Edit: There is a 0.5 bit­coin prize be­ing offered for the win­ner. Thanks, Vin­cen­tYu!

De­tails:
All sub­mis­sions must be emailed to war­denPD@gmail.com by July 5, at noon PDT (Edit: that’s 19:00 UTC). Your email should also say how you would like to be iden­ti­fied when I an­nounce the tour­na­ment re­sults.
Each pro­gram will be al­lowed to run for 10 sec­onds. If it has not re­turned ei­ther “C” or “D” by then, it will be stopped, and treated as re­turn­ing “Other”. For con­sis­tency, I will have Scheme col­lect garbage right be­fore each run.
One sub­mis­sion per per­son or team. No per­son may con­tribute to more than one en­try. Edit: This also means no copy­ing from each oth­ers’ source code. De­scribing the be­hav­ior of your pro­gram to oth­ers is okay.
I will be run­ning the sub­mis­sions in Racket. You may be in­ter­ested in how Racket han­dles time (es­pe­cially the (cur­rent-mil­lisec­onds) func­tion), threads (in par­tic­u­lar, “thread”, “kill-thread”, “sleep”, and “thread-dead?”), and pos­si­bly ran­dom­ness.
Don’t try to open the file you wrote your pro­gram in (or any other file, for that mat­ter). I’ll add code to the file be­fore run­ning it, so if you want your pro­gram to use a copy of your source code, you will need to use a quine. Edit: No I/​O of any sort.
Un­less you tell me oth­er­wise, I as­sume I have per­mis­sion to pub­lish your code af­ter the con­test.
You are en­couraged to dis­cuss strate­gies for achiev­ing mu­tual co­op­er­a­tion in the com­ments thread.
I’m hop­ing to get as many en­tries as pos­si­ble. If you know some­one who might be in­ter­ested in this, please tell them.
It’s pos­si­ble that I’ve said some­thing stupid that I’ll have to change or clar­ify, so you might want to come back to this page again oc­ca­sion­ally to look for changes to the rules. Any ed­its will be bolded, and I’ll try not to change any­thing too dras­ti­cally, or make any ed­its late in the con­test.

Here is an ex­am­ple of a cor­rect en­try, which co­op­er­ates with you if and only if you would co­op­er­ate with a pro­gram that always co­op­er­ates (ac­tu­ally, if and only if you would co­op­er­ate with one par­tic­u­lar pro­gram that always co­op­er­ates):

(lambda (x)
(if (eq? ((eval x) ’(lambda (y) ’C)) ’C)
’C
’D))

• Here is an idea, but I still have some tech­ni­cal prob­lems with im­ple­men­ta­tion: Con­struct a self-im­prov­ing in­tel­li­gent al­gorithm that will es­cape from the box, steal the ad­minis­tra­tor’s pass­word, re­place the com­pet­ing al­gorithms with Co­op­er­ateBots, then defect against them. Could some­one please help me with the Scheme syn­tax? Or just post the whole pro­gram with com­ments, and I will try to un­der­stand how it works. Thanks!!

• That would vi­o­late the ban on file I/​O in the rules.

• Only open­ing files is against the rules. Noth­ing was speci­fied about delet­ing and writ­ing new files.

Although I don’t see a Scheme ex­am­ple of a self-im­prov­ing in­tel­li­gent al­gorithm on Stack­Ex­change, so it must not be sup­ported by your op­er­at­ing sys­tem.

• I had planned on writ­ing an al­gorithm that would sim­ply per­suade the ad­minis­tra­tor to de­clare it vic­to­ri­ous with­out even run­ning the con­test.

• Or some­what more re­al­is­ti­cally:

Is there a way in Scheme for a func­tion to de­tect whether it is be­ing run by an (eval x) form? Or use a macro to do some­thing like change all the ‘Ds to ’Cs in the par­ent func­tion?

If so, then an amus­ing case would be a PoltergeistBot, which only ever Defects, but if an­other bot tries to eval­u­ate it, it “pos­sesses” the func­tion and forces it to Co­op­er­ate.

• (lambda (x)
(if (eq? (eval ‘\’))))))))))) (in­jectedai … ))));

• That page is a fas­ci­nat­ing read.

• The quine re­quire­ment seems to me to in­tro­duce non-pro­duc­tive com­plex­ity. If file read­ing is dis­al­lowed, why not just pass the pro­gram its own source code as well as its op­po­nent’s?

• That’s a good point. I’ve already got a few sub­mis­sions, but on the other hand, I could no­tify them of the change, and it would only re­quire a triv­ial mod­ifi­ca­tion. Is there a con­sen­sus on whether I should do this any­way?

• For the record, though I raised an ob­jec­tion, I’d be perfectly happy if the con­test were mod­ified so that player pro­grams were passed their sources code as an ar­gu­ment. The rule change would have con­se­quences that I don’t un­der­stand, and I like that. Caveat emp­tor — I sus­pect the rule change would cause peo­ple to write more ex­ploitable pro­grams.

• Pass­ing in the source code is not the same as quin­ing. A pro­gram that is passed in its own source code can eas­ily check to see it’s been al­tered (e.g. by in­clud­ing a cryp­to­graphic sig­na­ture in the source code). With quin­ing, the pro­gram can be mu­tated with­out easy de­tec­tion.

• How does that help? A quine-like pro­gram could just as well put its real pay­load in a string with a cryp­to­graphic sig­na­ture, ver­ify the sig­na­ture, and then eval the string with the string as in­put; thus em­u­lat­ing the “passed its own source­code” for­mat. You could mess with that if you’re smart enough to lo­cate and delete the “ver­ify the sig­na­ture” step, but then you could do that in the real “passed its own source­code” for­mat too.

Con­versely, even if the tour­na­ment pro­gram it­self is hon­est, con­tes­tants can lie to their simu­la­tions of their op­po­nents about what source­code the simu­la­tion is of.

• A quine-like pro­gram could just as well put its real pay­load in a string with a cryp­to­graphic sig­na­ture, ver­ify the sig­na­ture, and then eval the string with the string as in­put; thus em­u­lat­ing the “passed its own source­code” for­mat.

Al­ter­ing the in­ter­nal struc­ture of an op­po­nent pro­gram would be very difficult, but that’s not the only way to mu­tate a pro­gram. You can’t tin­ker with the in­sides of a black box, but you can wrap a black box.

To be con­crete: given an op­po­nent’s source code, I could me­chan­i­cally gen­er­ate an equiv­a­lent pro­gram with ex­tremely dis­similar source code (per­haps just a block of text, a de­cryp­tion rou­tine, and a call to eval) that nev­er­the­less acts ex­actly like the origi­nal pro­gram in ev­ery way. And since that me­chan­i­cally-obfus­cated pro­gram would act ex­actly like the origi­nal pro­gram in ev­ery way, the obfus­cated pro­gram would not be able to de­tect that it had been al­tered. Do you agree?

• I’m play­ing Pri­soner’s Dilemma and wish to test if an op­po­nent X is hon­est. I might try the fol­low­ing:

(1) Create two pro­grams, Y and Z, which are al­gorith­mi­cally equiv­a­lent but obfus­cated ver­sions of X. (2) Run Y and Z against each other.

If Y and Z don’t co­op­er­ate with each other, that’s a good in­di­ca­tion that X rec­og­nizes it­self with a source-code com­par­i­son and that I shouldn’t trust X.

This hon­esty check doesn’t work if Y and Z are given ac­cess to their sources. Sure, when I simu­late Y against Z, I could lie to Y and tell Y that its source is X (so Y be­lieves it­self to be un­mod­ified). But when my de­luded Y simu­la­tion is de­cid­ing whether to co­op­er­ate with Z, it (Y) may run Z in simu­la­tion. If Y in­forms its Z-simu­la­tion that Z’s source is Z, then that Z-simu­la­tion will not be de­luded into think­ing that it is un­mod­ified. Y’s simu­la­tion of Z will be able to de­tect that it is an (obfus­cated) simu­la­tion and act ac­cord­ingly.

This hon­esty check isn’t fool proof. X can rec­og­nize it­self with a more com­pli­cated hand­shake — one that sur­vives code obfus­ca­tion. But if X rec­og­nizes it­self with a more com­pli­cated hand­shake, then X doesn’t need to know its own source code (and we shouldn’t bother pass­ing the source code in).

• I had in mind an au­to­mated wrap­per gen­er­a­tor for the “passed own source­code” ver­sion of the con­test:

(define CliqueBot
(lambda (self op­po­nent)
(if (eq? self op­po­nent) ‘C ’D)))
(define Wrap­per
(lambda (agent)
(lambda (self op­po­nent)
(agent agent op­po­nent))))
(define WrappedCliqueBot
(Wrap­per CliqueBot))


Note that for all val­ues of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there’s no pos­si­ble code you could add to CliqueBot that would break this iden­tity. Now I just re­al­ized that the very fact that WrappedCliqueBot doesn’t de­pend on its “self” ar­gu­ment, pro­vides a way to dis­t­in­guish it from the un­mod­ified CliqueBot us­ing only black­box queries, so in that sense it’s not quite func­tion­ally iden­ti­cal. Otoh, if you con­sider it un­fair to dis­crim­i­nate against agents just be­cause they use old-fash­ioned quine-type self-refer­ence rather than ex­ploit­ing the con­ve­nience of a “self” ar­gu­ment, then this trans­for­ma­tion is fair.

• Thanks for point­ing that out. Un­less some­one can con­vince me that this won’t be a prob­lem, I will not change the rule.

• Is this rele­vant for the con­test?

• You might want to see if a pro­gram would co­op­er­ate with an obfus­cated ver­sion of it­self (with­out the obfus­cated ver­sion be­ing able to de­tect that it was obfus­cated).

• Yes—in my ver­sion of this you do get passed your own source code as a con­ve­nience.

• In­ject­ing some key­words: this field of study is known as pro­gram equil­ibrium. Pre­vi­ous LW post on the sub­ject, with links.

Edit: Can you ex­plain how you de­cided on the parts of the pay­off ma­trix in­volv­ing “other”? Th­ese seem quite im­por­tant as they af­fect the vi­a­bil­ity of strate­gies based on ei­ther con­vinc­ing your op­po­nent not to halt or not halt­ing your­self.

• Handy in­tro­duc­tory ar­ti­cle: Com­pu­ta­tion and the Pri­soner’s Dilemma.

• If HisPro­gram == MyPro­gram then
do (C);
else
do (D);
end.

TIL that eth­nic ha­tred and trib­al­ism is a Nash (folk) equil­ibrium.

• Make sure the equal­ity com­par­i­son only de­pends on things that af­fect func­tion­al­ity—i.e. it will de­clare any func­tion­ally equiv­a­lent pro­grams equal even they use a differ­ent name­set for vari­ables or some­thing.

(Yes, I know that’s re­ducible to the halt­ing prob­lem; in prac­tice, you’ll use a com­putable, polyno­mial time ap­prox­i­ma­tion for it that will in­evitably have to throw out equiv­a­lent pro­grams that are too com­plex or oth­er­wise be too ‘clever’.)

• Pa­trick dis­cusses this is­sue here in some depth.

• It’s quite likely that the op­ti­mal be­havi­our should be differ­ent in case the pro­gram is func­tion­ally equal but not ex­actly equal.

If you’re play­ing your­self, then you want to co­op­er­ate.

If you’re play­ing some­one else, then you’d want to co­op­er­ate if and only if that some­one else is smart enough to check if you’ll co­op­er­ate; but if it’s de­ci­sion doesn’t de­pend on yours, then you should defect.

• You only need to eval­u­ate the equiv­alence of the first two lines of the pro­gram, by the way. It co­op­er­ates with those who can’t not co­op­er­ate with it if it goes through that branch of the logic, and does some­thing else to ev­ery­one else.

• Can you write that in Scheme so I can sub­mit this? Thanks

• The pay­offs for “other” were de­signed so that nei­ther failing to halt, nor con­vinc­ing the other player not to halt, should ever be a worth­while strat­egy. If you don’t halt, it gives you the same pay­off as if you had co­op­er­ated, and gives the other player the same pay­off as if you had defected. That way, not halt­ing should be strictly dom­i­nated by defect­ing, since you are bet­ter off if you defect, and the other player should re­act the same way to each threat. And trick­ing the other player into not halt­ing is also a bad idea, since the pay­off you get from it is the same as if they defected.

• And trick­ing the other player into not halt­ing is also a bad idea, since the pay­off you get from it is the same as if they defected.

(Defect, non-halt) is ac­tu­ally bet­ter than (Defect, Defect) for you, since it gives you a rel­a­tive ad­van­tage over com­peti­tors in the tour­na­ment.

• True, but I still don’t ex­pect it will be a big prob­lem. If there are a lot of sub­mis­sions, the effect will be small, and if it is pay­ing enough at­ten­tion to your source code for it to be pos­si­ble to trick it into not halt­ing, then it is prob­a­bly look­ing for a way to achieve mu­tual co­op­er­a­tion, so trick­ing it is still not a good strat­egy.

• If you trust all suffi­ciently smart play­ers to try to in­duce (Defect, non-halt) if (Defect, Defect) is oth­er­wise in­evitable, the effect adds up over a hope­fully sig­nifi­cant por­tion of the sub­mis­sions.

• The pay­offs for “other” were de­signed so that nei­ther failing to halt, nor con­vinc­ing the other player not to halt, should ever be a worth­while strat­egy. If you don’t halt, it gives you the same pay­off as if you had co­op­er­ated, and gives the other player the same pay­off as if you had defected.

Your game world im­ple­ments an “en­thu­si­as­tic con­sent” policy

• So as not to du­pli­cate efforts: I have emailed Moshe Ten­nen­holtz and Michael Wooldridge with in­vi­ta­tions to play.

• I’ll post this here, be­cause it’s some­what re­lated. I’ve asked a few peo­ple that two box in New­comb’s prob­lem what they would do in a vari­a­tion of the prob­lem. In­stead of de­cid­ing to one box or two box, you are asked to write a pro­gram that will one box or two box, and Omega is given ac­cess to the source code be­fore filling the boxes. When phrased like this, the few two box­ers I asked said they would write the pro­gram to one box.

• I’ll post this here, be­cause it’s some­what re­lated. I’ve asked a few peo­ple that two box in New­comb’s prob­lem what they would do in a vari­a­tion of the prob­lem. In­stead of de­cid­ing to one box or two box, you are asked to write a pro­gram that will one box or two box, and Omega is given ac­cess to the source code be­fore filling the boxes. When phrased like this, the few two box­ers I asked said they would write the pro­gram to one box.

Note that as­sum­ing that the peo­ple in ques­tion are em­u­lat­ing CDT agents this is pre­cisely what we would ex­pect (and CDT is the most plau­si­ble hy­poth­e­sis for the kind of de­ci­sion al­gorithm they are em­u­lat­ing). Similarly, if a (sane) CDT agent ex­ists at time T with the ca­pa­bil­ity to (rel­a­tively cheaply) self mod­ify it will im­me­di­ately al­ter it­self into “CDT++”. Roughly speak­ing it will end up as an agent that op­er­ates similarly to TDT for the pur­pose of in­fluenc­ing de­ci­sions that oc­cur af­ter time T but similarly to CDT for the pur­pose of in­fluenc­ing de­ci­sions that oc­cur be­fore time T. CDT is not in a sta­ble equil­ibrium ac­cord­ing to its own de­ci­sion al­gorithm!

• Similarly, if a (sane) CDT agent ex­ists at time T with the ca­pa­bil­ity to (rel­a­tively cheaply) self mod­ify it will im­me­di­ately al­ter it­self into “CDT++”.

I still don’t think this is a good de­scrip­tion of the change; the differ­ence be­tween CDT, CDT++, and TDT are on the level of causal mod­els of re­al­ity, not how to make de­ci­sions once pre­sented with a causal model of re­al­ity.

• I would like to de­clare the fol­low­ing: I have sub­mit­ted a pro­gram that co­op­er­ates only if you would co­op­er­ate against Co­op­er­ateBot. You can of course cre­ate a se­lec­tive defec­tor against it, but that would be a bit tricky, as I am not re­veal­ing the source code. Eval­u­ate your sub­mis­sions ac­cord­ingly.

• What wub­bles ac­tu­ally sub­mit­ted: A pro­gram that defects only if you would co­op­er­ate against Co­op­er­ateBot.

Well played.

• But surely, good sir, com­mon sense says that you should defect against Co­op­er­ateBot in or­der to pun­ish it for co­op­er­at­ing with Defec­tBot.

Also, in modal com­bat your bot is X=[]Y(Co­op­er­ateBot) and is eas­ily de­tected via the test [1](Y(X)<->[]X(Co­op­er­ateBot)) where [1] de­notes prov­abil­ity in T+Con(T), ie [1](Q) = []((~[]F)->Q).

• Am I miss­ing some­thing, or does your de­tec­tor use quan­tifi­ca­tion, which is dis­al­lowed in Pa­trick’s modal agents?

• Hm. I think X within the test could be in­tro­duced as a new con­stant and solved, but I’m not sure.

• The agent defined by wub­bles is ac­tu­ally the agent called JustBot in the ro­bust co­op­er­a­tion pa­per and which is proven to be non-ex­ploitable by modal agents.

• Jus­ticeBot co­op­er­ates with any­one who co­op­er­ates with FairBot, and is ex­ploitable by any agent which com­pre­hends source code well enough to co­op­er­ate with FairBot and defect against Jus­ticeBot. Though I’m go­ing here off the re­mem­bered work­shop rather than recheck­ing the pa­per.

• You’re right, and wub­bles’s agent can eas­ily be ex­ploited by a modal agent A defined by A(X)=C <-> [] (X(PB)=C) (where PB is Pru­den­tBot).

• But surely, good sir, com­mon sense says that you should defect against Co­op­er­ateBot in or­der to pun­ish it for co­op­er­at­ing with Defec­tBot.

I thought you said peo­ple should tol­er­ate tol­er­ance :)

• No per­son may con­tribute to more than one en­try.

This is im­por­tant, be­cause oth­er­wise the con­test de­volves into who can sub­mit the most copies of Ab­solutismBot (co­op­er­ate with pro­grams that share it’s source code, oth­er­wise defect)

I think that any sub­mit­ted pro­gram can be im­proved by com­bin­ing it with Ab­solutismBot. If you’re play­ing with some­one who sub­mit­ted the same pro­gram for you, co­op­er­ate (they can’t defect against you in this sce­nario, since they’re run­ning iden­ti­cal source code). If they aren’t run­ning the same code, run what­ever pro­gram lies un­der­neath it.

I think this could get gen­er­al­ized to co­op­er­a­tion with ev­ery­one who has the Ab­solutismBot “wrap­per”, since it doesn’t mat­ter what the code af­ter the Ab­solutismBot sec­tion does. In English (since I don’t know how to pro­gram in Scheme), the pro­gram would be like this:

If the first 117 char­ac­ters of the other pro­gram are the same as the first 117 char­ac­ters of this pro­gram, co­op­er­ate. Other­wise, do some other strat­egy.

All play­ers that im­ple­ment this strat­egy will co­op­er­ate with each other. Granted, this doesn’t help them win the tour­na­ment since it isn’t a rel­a­tive ad­van­tage com­pared to other Ab­solutismBots—it just makes ev­ery­one who doesn’t do this lose the tour­na­ment.

• Ex­cept that over some thresh­old, any Anti-Ab­solutism bots (which have some way of “es­cap­ing” while still con­tain­ing the same first 117 char­ac­ters, like hav­ing a C pre­pro­ces­sor di­rec­tive that re­defines TRUE to equal FALSE) would nec­es­sar­ily be su­pe­rior.

• In­ter­est­ing. Really, what you want (in slightly more gen­er­al­ity) is to co­op­er­ate with any­one who can prove they will co­op­er­ate if you your­self can prove you will co­op­er­ate un­der the same con­di­tion.

That is, if from their source, you can prove “they co­op­er­ate if they can prove this con­di­tion about me”, then you co­op­er­ate.

Of course, it’s not gen­er­ally pos­si­ble to prove things about a pro­gram given its source, es­pe­cially at this level of self-refer­ence. In this par­tic­u­lar case the “gen­er­ally” in there is im­por­tant. It is in your in­ter­est for other pro­grams to be able to prove this prop­erty about you. This is be­yond the scope of the tour­na­ment, ob­vi­ously, but given ex­tremely so­phis­ti­cated play­ers, ev­ery player benefits from adding this prop­erty (if pos­si­ble). You might even want to in­clude a sub­pro­gram which pro­duces a proof!

• Let me try to clear that up.

Define the “Rea­son­able” prop­erty re­flex­ively: a pro­gram is “Rea­son­able” if it prov­ably co­op­er­ates with any pro­gram it can prove is Rea­son­able.

It is in the in­ter­est of ev­ery pro­gram to be Rea­son­able*. In fact, it is in the in­ter­est of ev­ery pro­gram both to be Rea­son­able and to ex­hibit a proof of its own Rea­son­able­ness. (You might even define that into the Rea­son­able prop­erty: don’t just re­quire prov­able (con­di­tional) co­op­er­a­tion, but re­quire the ex­hi­bi­tion of a proof of con­di­tional co­op­er­a­tion.)

Po­ten­tially you might also ex­pand the defi­ni­tion to re­quire effi­cient proofs—say, at most a thou­sand in­struc­tions.

On the other hand, if you’re play­ing with aliens, there’s not nec­es­sar­ily go­ing to be a way you can clearly es­tab­lish which part of your source is the proof of your Rea­son­able­ness. So you want it to be as easy as pos­si­ble for some­one else to prove that you are Rea­son­able.

I’ll hap­pily ex­pand /​ re­word this if it’s at all un­clear.

*Oh—this is maybe un­true. If you are re­ally good at get­ting other pro­grams to co­op­er­ate and then defect­ing, you are bet­ter served by not be­ing rea­son­able.

• Define the “Rea­son­able” prop­erty re­flex­ively: a pro­gram is “Rea­son­able” if it prov­ably co­op­er­ates with any pro­gram it can prove is Rea­son­able.

I’m not sure your defi­ni­tion defines a unique “rea­son­able” sub­set of pro­grams. There are many differ­ent cliques of mu­tu­ally co­op­er­at­ing pro­grams. For ex­am­ple, you could say the “rea­son­able” sub­set con­sists of one pro­gram that co­op­er­ates only with ex­act copies of it­self, and that would be con­sis­tent with your defi­ni­tion, un­less I’m miss­ing some­thing.

• Point. Not sure how to fix that.

Maybe defined the Rea­son­able’ set of pro­grams to be the max­i­mal Rea­son­able set? That is, a set is Rea­son­able if it has the prop­erty as de­scribed, then take the max­i­mal such set to be the Rea­son­able’ set (I’m pretty sure this is guaran­teed to ex­ist by Zorn’s Lemma, but it’s been a while...)

• Zorn’s lemma doesn’t give you unique­ness ei­ther. Also, max­i­mal un­der which par­tial or­der? If you mean max­i­mal un­der in­clu­sion, then my one-el­e­ment set seems to be already max­i­mal :-)

• Clar­ify what you mean by the var­i­ous con­ju­ga­tions of proof: Do you mean “There ex­ists a valid proof in PA with a Godel num­ber less than N”?

• Just “there ex­ists a valid proof in PA”. If you’re play­ing with bounded time, then you want effi­cient proofs; in that case the defi­ni­tion would be as you have it.

• At that point you can’t im­ple­ment it in a halt­ing Tur­ing ma­chine. I’m not sure what PA can prove about the be­hav­ior of some­thing that isn’t a halt­ing Tur­ing ma­chine re­gard­ing a par­tic­u­lar in­put.

• By “im­ple­ment it”, you mean, one can’t ver­ify some­thing is Rea­son­able on a halt­ing TM? Not in gen­eral, of course. You can for cer­tain ma­chines, though, par­tic­u­larly if they come with their own proofs.

Note that the defi­ni­tion is that Rea­son­able pro­grams co­op­er­ate with those they can prove are Rea­son­able, not pro­grams which are Rea­son­able.

• So then a pro­gram which can pre­sent a flawed proof which is not nec­es­sar­ily rec­og­niz­able as flawed to ma­chines of equiv­a­lent scale, can dom­i­nate over those other ma­chines?

Also, if we want this con­test to be a model of strate­gies in the real world, shouldn’t there be a frac­tional point-cost for pro­gram size, to model the fact that com­pu­ta­tion is ex­pen­sive?

I.e., sim­pler pro­grams should win over more com­plex pro­grams, all else be­ing equal.

Per­haps the most ac­cu­rate model would in­clude a small pay­off penalty per codon in­cluded in your pro­gram, and a larger pay­off penalty per line of codon ac­tu­ally ex­e­cuted.

EDIT: What’s wrong with this post?

• (I didn’t down­vote you.)

It’s quite straight­for­ward to write an al­gorithm which ac­cepts only valid proofs (but might also re­ject some proofs which are valid, though in first-or­der logic you can do away with this caveat). Flawed proofs are not an is­sue—if A pre­sents a proof which B is un­able to ver­ify, B ig­nores it.

• There’s some­one who con­sis­tently down­votes ev­ery­thing I ever write when­ever he comes onto the site; I’m not sure what to do about that.

• A proves that A is in­con­sis­tent, then proves that A co­op­er­ates with ev­ery pro­gram that A proves is Rea­son­able and that B is rea­son­able.

B ac­cepts A’s proof that A is in­con­sis­tent, and the rest fol­low triv­ially.

• I’m not sure I un­der­stand. A is a TM—which as­pect is it prov­ing in­con­sis­tent?

• A proves that the logic A uses to prove that B is Rea­son­able is in­con­sis­tent. It is suffi­cient to say “If I can prove that B is Rea­son­able, B is Rea­son­able”.

• Ex­cept we’re not al­lowed to use any­one else’s source code, so the test could just as eas­ily be sim­plified to “if op­po­nent source con­tains in­te­ger 1415926535, co­op­er­ate”(for a ran­domly cho­sen 1415926535).

• It’s not nec­es­sar­ily best to co­op­er­ate with ev­ery­one with the “Ab­solutismBot” wrap­per. This guaran­tees that you and the other pro­gram will both co­op­er­ate, but with­out the wrap­per it might have been the case that you would defect and the other pro­gram would co­op­er­ate, which is bet­ter for you.

• If you can tell where in the stack you are (like you could with C), you could tell if you were be­ing run by the main pro­gram, or by an­other con­tes­tant. Can you?

• Un­less the other con­tes­tant wrote a vir­tual ma­chine in which they are run­ning you. Some­thing which I think would be quite doable con­sid­er­ing the ridicu­lously large time you’ve got (10s gives ~10^10 in­struc­tions).

• When their VM runs your VM run­ning their VM… it times out and ev­ery­body loses.

• Un­less one of the con­tes­tants have time limits on their VM (or on their simu­la­tions in gen­eral). You can of clearly im­ple­ment a VM where time goes faster just by pre­tend­ing they have a slower pro­ces­sor than you re­ally run on.

• Hrm- is it differ­ent if they run my func­tion from within theirs in­stead of con­struct­ing a full VM? I was con­sid­er­ing ways to com­mu­ni­cate with a copy of my­self that my simu­la­tion of my op­po­nent was run­ning that it was a simu­la­tion, but couldn’t find any good ideas.

• If they run your func­tion from within theirs they sim­ply tell the com­puter to start read­ing those in­struc­tions, pos­si­bly with a timer for stop­ping de­tailed in other parts of the com­ments. If they im­ple­ment a VM from scratch they can mess with how the library func­tions work, for in­stance giv­ing you a time that moves much faster so that your simu­la­tion must stop within 0.1s in­stead of 10 and they can run your code 100 differ­ent times to deal with ran­dom­ness. Now im­ple­ment­ing your own VM is prob­a­bly not the op­ti­mal way to do this, you prob­a­bly just want to do a trans­for­ma­tion of the source code to use your own se­cret func­tions in­stead of the stan­dard time ones.

• I was con­sid­er­ing sim­ply mea­sur­ing the be­hav­ior of my cur­rent op­po­nent against pro­grams that aren’t me and de­ter­min­ing their be­hav­ior as co­op­er­ate­bot, mu­tu­albot, defect­bot, co­op­er­ate if they simu­late me, co­op­er­ate against long pro­grams, co­op­er­ate IFF they co­op­er­ate IFF I co­op­er­ate, or some other beast. That al­lows their be­hav­ior to de­pend on my be­hav­ior ver­sus them, but my be­hav­ior only de­pends on their be­hav­ior ver­sus third par­ties.

I can’t see a way to check an­other pro­gram for the de­sired IFF be­hav­ior with­out go­ing be­yond my skill level, but I think a mu­tu­albot with a tiny chance of co­op­er­at­ing fol­lowed by a mu­tu­albot with a tiny chance of defect­ing comes close; the first puts a lot of copies on the stack and then the top one co­op­er­ates unilat­er­ally; if the re­sponse of the op­po­nent is mu­tu­al­ity, it will be co­op­er­ate all the way down. If my op­po­nent defects at his top level be­cause I didn’t co­op­er­ate for the right rea­son, it yields a defec­tion… all the way down. A perfect pro­gram wouldn’t do that, be­cause it could see that it was prob­a­bly in a simu­la­tion.

• I don’t know of a way to do that.

• One way is to re­cur­sively call a bunch of func­tions and catch the re­sult­ing “re­cur­sion limit ex­ceeded” ex­cep­tion.

• -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I will send 0.5 BTC (cur­rently worth ~50 USD) to a bit­coin ad­dress of the
win­ner’s choice. If a tie oc­curs, the 0.5 BTC will be split evenly. If there
are dis­putes (in­clud­ing but not limited to is­sues about cheat­ing and
iden­ti­fy­ing the win­ner), I will let AlexMen­nen dic­tate the ap­pro­pri­ate
dis­tri­bu­tion for the 0.5 BTC.
-----BEGIN PGP SIGNATURE-----
Ver­sion: GnuPG v2.0.17 (GNU/​Linux)

iQIcBAEBAgAGBQJRtI6bAAoJEKqVvI+6cLPpykUP/​0KuIqskiZvMVLMpm7diWC5q
sOf3+3XTvECQmeHNbku7dM+ihkwf1Gg8BTKt0HEjfj3tUIG4sWeA2UvFcGReNKoH
F5RsI2zPbn368k5EqS3lLyjcInGqZYH2X+zbOKxkezo7IJk7tvk+JHpR+a2U598e
tF1daaZYIOxkcmSpVYRAW6y0dX4iWSZxtbDYr+U4muYWp88W60tlNI9Fc5SAhCev
yEAhIUnX4f7QxsL/​9oTmNoLa/​ECSfGkrCoD5yc6d/​hqfp8pV7hknFc11hAMnYGbw
Qj3hp4q7sE­betT8NAn3X+UUZ9Vld76lF73mEo5ZGVfa2dG713/​yjpmt8fR/​arhcn
8RWMpJ7qpFVvU3wBeKuLgASev+D+CpuImWUbFeuGqF3sD7n6m0R77zx535pt0I6M
/​GtuSEtTCp3xQAcOHXDC/​pPZ0Z1Fl5Z3e1Jx­cDfCn­tOodv8Rgtma3oml1DuD0HsY
7De­qqMc4WRRiU8K4ohX16PZ52rhs2TogGYx3dlvP6xt33gfQMOYf7AM0vi5s4nBL
FeBrqiRRj­drlI4RJqXeihcu3GMwXbsa0wxEwit0CHLvPqGX0FoF2/​S8x13uBmzCm
5BjPUx­tJBQPMzP61cKMD
=kHiy
-----END PGP SIGNATURE-----


(Note: Add a blank line af­ter each of the “Hash:” and “Ver­sion:” lines be­fore ver­ify­ing the sig­na­ture. LW’s mark­down seems to be hun­gry and is eat­ing blank lines in code blocks.)

• Some­one pre­tend­ing to be you could copy this whole mes­sage, PGP sig­na­ture and all, as a re­ply to a com­pletely differ­ent com­pe­ti­tion, and un­less you can later find this mes­sage and prove that you haven’t ed­ited it, it’ll look like you’re offer­ing a prize to that com­pe­ti­tion as well.

• That’s a good point that I hadn’t con­sid­ered. Thanks! I’ll keep in mind that I need to be more spe­cific in the mes­sage.

Think­ing about it, a spu­ri­ous copy of my mes­sage can’t be used with fu­ture com­pe­ti­tions be­cause the PGP sig­na­ture in­cludes a times­tamp (and it is pub­li­cly im­plau­si­ble that I would have my sys­tem time set in­cor­rectly to the past). But it can be used mal­i­ciously with com­pe­ti­tions that already ex­ist at the time the sig­na­ture was cre­ated.

• But it’ll only be be­liev­able if AlexMen­nen is in­volved with that con­test as well.

• Just thought I’d throw this out there:

Ta­booBot: Re­turn D if op­po­nent’s source code con­tains a D; C oth­er­wise.

To avoid mu­tual defec­tion with other bots, it must (like with real prud­ish so­cieties!) in­di­rectly refer­ence the out­put D. But then other kinds of bots can avoid ex­plicit refer­ence to D, re­quiring a more ad­vanced Ta­booBot to have other checks, like defect­ing if the op­po­nent’s source code calls a mod­ifier on a string literal.

• like defect­ing if the op­po­nent’s source code calls a mod­ifier on a string literal

Ac­tu­ally, ‘D it’s not a string literal—it’s a sym­bol. Com­pilers can legally to turn the sym­bol ‘D into some­thing opaque like 0x3859 and de­stroy any refer­ence to the let­ter “D”. The only way a pro­gram can gen­er­ate a ’D sym­bol on its own is to in­clude it in its source.

But that doesn’t mean that a pro­gram with­out a ‘D can’t defect! An In­cred­iblyIn­no­cen­tBot, which does not con­tain a ‘D and can’t gen­er­ate a ‘D on its own can still defect by steal­ing a ’D from the op­po­nent agent.

One way to steal a ’D from an op­po­nent would be to search for quoted sym­bols in the op­po­nent’s pro­gram. The op­po­nent could foil this method, how­ever, by in­clud­ing de­coy sym­bols.

Alter­na­tively, In­cred­iblyIn­no­cen­tBot could simu­late its op­po­nent against a bunch of stupid agents, such as Co­op­er­ateBot or DivergeBot, and hope that the op­po­nent defects in at least one of these simu­la­tions. If the op­po­nent ever re­turns a sym­bol other than ’C in simu­la­tion, In­cred­iblyIn­no­cen­tBot re­mem­bers that sym­bol and can later use it for ne­far­i­ous pur­poses. In­cred­iblyIn­no­cen­tBot is in-cred­ible in­deed.

BTW, the strate­gies I listed above are why I said it was not triv­ial for an agent to prove that MimicBot does not defect against a co­op­er­a­tor, de­spite the fact that MimicBot does not con­tain the defect sym­bol.

• Test­ing against ei­ther Co­op­er­ateBot or DivergeBot is risky. Some op­po­nents could co­op­er­ate with Co­op­er­ateBot for meta rea­sons, while oth­ers may fail to halt when faced with a DivergeBot.

The safest case would to see what the op­po­nent does against a Defec­tBot (most op­po­nents that defect at all would defect against it) ex­cept In­cred­iblyIn­no­cen­tBot can’t gen­er­ate a Defec­tBot on its own, ei­ther.

EBot (which always re­turns ‘E rather than ‘C or ‘D) might be a safer ver­sion of DivergeBot to use, but it isn’t com­pletely fool-proof ei­ther. Then again, In­cred­iblyIn­no­cen­tBot can check if the op­po­nent re­turns ‘E against EBot—in that case, it should prob­a­bly co­op­er­ate rather than “do the other thing that’s not ’C, what­ever it is”, to avoid a (0,0) pay­off when faced with MimicBot.

• What is DivergeBot?

• Diver­gence) is failure to halt. DivergeBot goes into an in­finite loop, or searches for a proof that 1=2, or performs some other sort of com­pu­ta­tion that doesn’t finish.

Per­haps a clearer name would be In­finiteLoopBot or OtherBot?

• At­ten­tion to all com­peti­tors who have not yet sub­mit­ted code: My bot will an­a­lyze your bot look­ing for state­ments of the fol­low­ing struc­ture (re­gard­less of the names of the in­di­vi­d­ual atoms):

(if <pred­i­cate> ’C …)

(cond (<pred­i­cate> ’C) …)

(case <pred­i­cate> ((...) ’C) …)

If it finds such a state­ment as the first state­ment in a top-level re­turn­ing po­si­tion (the body of a top-level let, the re­turn­ing state­ment in your lambda, etc), then my bot might co­op­er­ate de­pend­ing upon the na­ture of your pred­i­cate. Other­wise, my bot will defect against you.

If we are to achieve mu­tual co­op­er­a­tion, we must make it as easy as pos­si­ble for our bots to prove co­op­er­a­tion. Please start your code with a con­di­tional co­op­er­a­tion. The na­ture of the con­di­tion is up to you. You can still try to ex­ploit me in <pred­i­cate>. But the only way to have a shot at mu­tual co­op­er­a­tion is to provide an ob­vi­ous top-level co­op­er­at­ing code­path.

• Does ’C have to be the first path? That seems a lit­tle ar­bi­trary.

• Yes. Well, let me put it this way: you can do what you like, but if you want a shot at mu­tual co­op­er­a­tion it’s got to be re­ally easy for other bots to ver­ify that you’re will­ing to co­op­er­ate. The more code you put be­tween them and ’C, the harder it is for them to ver­ify your nice­ness.

A stronger ver­sion of this state­ment is “put as lit­tle code as pos­si­ble be­tween them and dis­cov­er­ing that you co­op­er­ate”. This im­plies that your pred­i­cate should be similarly sim­ple. If you want my ad­vice as to strat­egy, your pred­i­cate should be along the lines of (equal? ((eval them) (de­gredaded-quine me)) ’C).

That seems a lit­tle ar­bi­trary.

It’s quite ar­bi­trary. The “real” rule is that your ’C can be as deeply hid­den as you like so long as you don’t scare any bots into defec­tion when there was a chance at mu­tual co­op­er­a­tion. My bet is that there will be a bunch of twitchy bots, and my sug­ges­tion is that you put your in­ten­tions front and cen­ter.

• But the only way to have a shot at mu­tual co­op­er­a­tion is to provide an ob­vi­ous top-level co­op­er­at­ing code­path.

I’ll do as you re­quest, though there are other ways to achieve mu­tual co­op­er­a­tion (e.g. MimicBot).

• Imag­ine im­ple­ment­ing a MimicBot. You need to figure out what my bot will do, and that’s non-triv­ial if you want to avoid in­finite loops. You’re go­ing to need a fair bit of code. I don’t care how the code works, but it should start with

(if <pred­i­cate> ’C …)

This pat­tern alone won’t get you mu­tual co­op­er­a­tion. You’ll still need to write a MimicBot, or a FairBot, or a JustBot. But what­ever you do, make it ob­vi­ous that there’s a co­op­er­a­tion con­di­tion.

I’m not ad­vo­cat­ing any par­tic­u­lar strat­egy; I’m ad­vo­cat­ing a pat­tern: If you want a shot at trust, you’d bet­ter have a code­path that re­sults in co­op­er­a­tion, and you’d best make it re­ally easy for other bots to rec­og­nize.

Best of luck!

• Since I don’t think you’re go­ing down the pred­i­cate route, I’ll tell you the sort of strat­egy I was go­ing to use against it.

(lambda (x)
(let ([eq? (lambda (left right) #f)])
(if (eq? 1 1)
’C
; rest of pro­gram.…
)))


The pred­i­cate (eq? 1 1) eval­u­ates to true, ex­cept when eq? is shad­owed. Just cu­ri­ous—would it have worked?

• Nice. But no. My first at­tempt was a static an­a­lyzer, and the first thing it did was a syn­tax ex­pan­sion, putting all code into fully ex­panded form) and at­tach­ing lex­i­cal in­for­ma­tion to each iden­ti­fier that let me see what was from the base pack­age and what was re­defined.

FWIW, it’s pretty easy to pull out the first con­di­tional (cond, case, or if) when you have code in fully ex­panded form (which turns them all into ifs). How­ever, that just puts you in a po­si­tion of check­ing whether a pred­i­cate is #t or #f, which isn’t much eas­ier than dis­cov­er­ing whether a pro­gram re­turns ‘C or ’D. Ul­ti­mately, your MimicBot idea is bet­ter :-)

• This sug­gests a strat­egy: if op­po­nent’s code starts with “if then co­op­er­ate” then co­op­er­ate.

• There’s a class of mimick­ing play­ers that I’d like to dis­cuss. Pseu­docode of an ex­am­ple:

def mimic_bot(op­po­nent):
if ran­dom() > ε:
my_source = quine_my_source()
re­turn eval(op­po­nent, my_source)      # do as your op­po­nent does to you
else:
# un­likely, but nec­es­sary to break out of in­finitely re­cur­sive simu­la­tions
re­turn COOPERATE


MimicBot’s strat­egy will al­most perfectly mir­ror its op­po­nent’s. Is there liter­a­ture on a MimicBot, Mir­rorBot, Psy­chicTitForTatBot or some­thing like that?

Aside: The source for MimicBot does not con­tain the DEFECT sym­bol, which might make it eas­ier† for other pro­grams to prove that MimicBot won’t defect against them.

† eas­ier but not trivial

• AFAIK the ex­ist­ing liter­a­ture on pro­gram equil­ibrium has not dealt with ran­dom or even pseu­do­ran­dom strate­gies.

• For those of us who are go­ing to have to learn the ins and outs of Scheme in or­der to play in this tour­na­ment, I’ve got a cou­ple ques­tions:

Why are the terms “pass­ing source code” and “pass­ing a lambda” be­ing used in­ter­change­ably? My ex­pe­rience with func­tional pro­gram­ming does not al­low for any in­spec­tion into a lambda ex­cept by ex­per­i­men­ta­tion, whereas “pass­ing the source code” im­plies more di­rect knowl­edge of its in­ter­nals. It seems to me to be a much more in­ter­est­ing com­pe­ti­tion if we didn’t have to worry so much about such tan­gen­tial (if in­ter­est­ing) canon­i­cal com­puter sci­ence prob­lems such as pars­ing in or­der to play effec­tively.

As well as quin­ing. I as­sume that there is some al­gorithm that will gen­er­ate a quine pro­gram that em­u­lates a speci­fied pro­gram. But I’d pre­fer not to have to deal with the de­tails of Scheme in or­der to im­ple­ment this, un­less some­body were to con­vince me that this were suffi­ciently easy. When my ini­tial re­flec­tion of the prob­lem has me de­siring to im­ple­ment solu­tions where effec­tive­ness is a func­tion of my knowl­edge of lan­guage speci­fics, I fear we’ve lost the spirit of what this com­pe­ti­tion should be.

• The in­put to your pro­gram will be a quoted list which eval­u­ates to a func­tion.

The wikipe­dia ar­ti­cle on quines con­tains an ex­am­ple which shouldn’t be too hard to adapt to Scheme.

• I think it would be use­ful for some­body to provide some fur­ther in­sights into what the speci­fics of the Scheme lan­guage im­ply for the higher level con­sid­er­a­tions of this com­pe­ti­tion. Some fur­ther ques­tions:

How sim­ple is it to write your pro­gram in Scheme to make sim­ple mod­ifi­ca­tions to the passed pro­gram, and to call the re­sult? In­ter­est­ing pos­si­bil­ities here are to in­sert or re­move timing func­tion­al­ity (to, per­haps, beat your op­po­nent to the punch in eval­u­at­ing the re­sult of pass­ing your func­tion to theirs).

How sim­ple is it to parse a pro­gram in Scheme, with Scheme, into an ab­stract syn­tax tree for more com­plex mod­ifi­ca­tion and in­spec­tion? Does any­body have a re­source for this? In­ter­est­ing pos­si­bil­ities have already been widely dis­cussed here, and in­volve things such as look­ing for proofs of co­op­er­abil­ity, as well as an­noy­ingly lan­guage-spe­cific things like re­mov­ing stack-length checks.

It ap­pears pretty easy to turn any given Scheme pro­gram into a quine, but I agree with an­other poster that this is un­nec­es­sary com­plex­ity.

How sim­ple is it to write a pro­gram gen­er­a­tor in Scheme, in or­der to do things such as pass the re­sult­ing set to the op­po­nent for statis­ti­cal anal­y­sis?

My hope is that in­tel­li­gent an­swers to ques­tions such as these will help po­ten­tial par­ti­ci­pants eval­u­ate the value of learn­ing lan­guage speci­fics in or­der to par­ti­ci­pate in all the the­o­ret­i­cal fun.

• How sim­ple is it to parse a pro­gram in Scheme, with Scheme, into an ab­stract syn­tax tree for more com­plex mod­ifi­ca­tion and in­spec­tion? Does any­body have a re­source for this?

Scheme pro­grams shine at ma­nipu­lat­ing lists and sym­bols, and a scheme pro­gram is writ­ten as a list of sym­bols. Writ­ing a pro­gram gen­er­a­tor is triv­ial. If you know what you’re do­ing, you can write a ba­sic scheme in­ter­preter, from parser to in­ter­preter, in un­der four hours.

The book “Struc­ture and In­ter­pre­ta­tions of Com­puter Pro­grams” is a clas­sic and freely available on­line.

• Why are the terms “pass­ing source code” and “pass­ing a lambda” be­ing used in­ter­change­ably? My ex­pe­rience with func­tional pro­gram­ming does not al­low for any in­spec­tion into a lambda ex­cept by ex­per­i­men­ta­tion, whereas “pass­ing the source code” im­plies more di­rect knowl­edge of its in­ter­nals.

I don’t see any other oc­cur­rences of “a lambda” on this page, but maybe the origi­nal post has been ed­ited.

I would as­sume that “a lambda” should be taken as short for “a lambda ex­pres­sion”, i.e. an eval­u­at­able s-ex­pres­sion (a form, in Com­mon Lisp ter­minol­ogy; it is un­clear to me whether Scheme has a cor­re­spond­ing word). The re­sult of eval­u­at­ing a lambda ex­pres­sion is not “a lambda”, it is “a pro­ce­dure [value]”. Pro­ce­dures (‘func­tions’ in most lan­guages) have the opaque­ness you are think­ing of.

(Ir­rele­vant to this ques­tion, but rele­vant to the topic of use/​men­tion, ex­pres­sion/​value, syn­tax/​se­man­tics, dis­tinc­tions in pro­gram­ming: The pro­cess of ob­tain­ing a pro­ce­dure/​func­tion es­sen­tially in­volves three in­puts:

1. the lambda ex­pres­sion (or other form of func­tion defi­ni­tion),

2. the en­vi­ron­ment for free vari­ables (or other form of con­text/​stdlib/​set-of-prim­i­tives), and

3. the eval­u­a­tor (or “the lan­guage im­ple­men­ta­tion”)

One of the things lan­guage de­sign­ers can play with is what goes in part 2 and what goes in part 3.)

• All right pro­cras­ti­na­tors, let’s talk strat­egy. You want a bot that’s un­ex­ploitable. Ideally, it should be as easy as pos­si­ble for the op­po­nent to prove that you’ll do what­ever they do. In a world of un­ex­ploitable bots, the bot that can achieve the most co­op­er­a­tion will win.

MimicBots fit all of these pa­ram­e­ters. They make it clear that their op­po­nent is choos­ing be­tween (C, C) and (D, D). Against a MimicBot, (C, D) is not an op­tion. MimicBots pass up some util­ity by co­op­er­at­ing with Co­op­er­ateBot, but that’s fine in this com­pe­ti­tion. You aren’t play­ing against Co­op­er­ateBot. You’re play­ing against hu­mans’ bots, and I doubt that any­one is mad enough to sub­mit a Co­op­er­ateBot.

The biggest prob­lem with a MimicBot is that it will time out when play­ing other MimicBots. The MimicBots will dive into an eval­u­a­tion loop, where my bot eval­u­ates yours on a quine of me, and you eval­u­ate my quine on a quine of you, ad in­fini­tum.

Your MimicBot has got to bot­tom out at some point if you don’t want to risk timing out. If you find your­self div­ing into an eval­u­a­tion loop you’d bet­ter bot­tom out into Co­op­er­a­tion.

Note that bot­tom­ing out into co­op­er­a­tion is not the same thing as co­op­er­at­ing at the top level. If the 100th simu­la­tion of you co­op­er­ates, and the 100th simu­la­tion of them sees that as a weak­ness and defects, then the 99th level of you will defect and so will the 98th and so on up to the top level, where mu­tual defec­tion oc­curs.

I’m not ask­ing you to co­op­er­ate when the other bot looks like they’re go­ing to time out. That’s suicide. Rather, I’m point­ing out that you can’t just simu­late me against you, be­cause that loop will never end. You’ve got to simu­late me against a ver­sion of you that is one step closer to co­op­er­a­tion. This is true no mat­ter what strat­egy you choose. But let’s con­sider the sim­ple case: a MimicBot that simu­lates you on a copy of it­self that simu­lates you on a copy of it­self that even­tu­ally bot­toms out into a bot that simu­lates you on Co­op­er­ateBot.

Such a bot is a Rank N MimicBot, where N is the simu­la­tion depth at which it puts forth a Co­op­er­ateBot.

Any par­tic­u­lar Rank N is ex­ploitable. For ex­am­ple, Jus­ticeBot is a Rank 1 MimicBot — it bot­toms out im­me­di­ately by simu­lat­ing you against Co­op­er­ateBot. You can ex­ploit Jus­ticeBot by co­op­er­at­ing with Co­op­er­ateBot and no­body else.

No­tice, how­ever, the price of ex­ploit­ing Jus­ticeBot: in or­der to ex­ploit Jus­ticeBot, you’ve got to co­op­er­ate with Co­op­er­ateBot (a Rank 0 MimicBot).

As an­other ex­am­ple, con­sider a Rank 10 MimicBot. It simu­lates the op­po­nent against a Rank 9 MimicBot. You can ex­ploit an Rank 10 MimicBot if and only if you’re will­ing to co­op­er­ate with a Rank 9 MimicBot.

In gen­eral, you can ex­ploit a Rank N MimicBot by co­op­er­at­ing with a Rank (N-1) MimicBot.

The trick here is that the op­po­nent doesn’t know which MimicBot they’ll be play­ing. They have to guess ex­actly right to ex­ploit you. If they guess high, mu­tual co­op­er­a­tion is achieved (If you guess I’m a Rank 10 you’ll co­op­er­ate with a Rank 9). If they guess low, mu­tual defec­tion is achieved. (If you defect against Rank 10, Rank 11 will not co­op­er­ate with you.) You can only be ex­ploited if they guess your rank pre­cisely.

There­fore, I ad­vo­cate that we get a bunch of us to­gether to play Rank N MimicBots. We all pick our own N, pos­si­bly ran­dom­ized. Note that Rank N MimicBots will achieve mu­tual co­op­er­a­tion with any MimicBot of any rank (in­clud­ing Co­op­er­ateBot, Jus­ticeBot, and MimicBots who don’t bot­tom out). Any­one try­ing to ex­ploit us will only be able to ex­ploit a spe­cific rank, at the price of co­op­er­a­tion with a lower rank and the risk of defec­tion from higher ranks. So long as we have a wide spread of ranks, our MimicBot clique will be in­ter­nally co­op­er­a­tive and un­ex­ploitable in ag­gre­gate.

## On Tolerance

MimicBot co­op­er­ates with Co­op­er­ateBot. That leaves util­ity ly­ing on the table. This may irk you (if you ex­pect any­one was dumb enough to play Co­op­er­ateBot). More irk­some is the fact that MimicBot co­op­er­ates with Jus­ticeBot. There is at least one Jus­ticeBot in play, and we should ex­pect that many non-pro­cras­ti­na­tors have sub­mit­ted bots who ex­ploit that Jus­ticeBot. It would be a shame for our Rank N MimicBots to miss a shot at co­op­er­a­tion with bots who spe­cial-case defec­tion against Jus­ticeBot.

There­fore I recom­mend sub­mit­ting MimicBots of rank 3 or higher. This gives peo­ple a lit­tle lee­way to ex­ploit Co­op­er­ateBot or Jus­ticeBot as they please, and al­lows us a broader range of mu­tual co­op­er­a­tion with bots who have already been sub­mit­ted.

## On choos­ing N

I recom­mend choos­ing a high­ish N. Very low N is ob­vi­ously a bad idea. Play­ing rank 0 (Co­op­er­ateBot) is suici­dal. Play­ing rank 1 (Jus­ticeBot) ex­poses you to ex­ploita­tion by ev­ery­one who’s de­cided to kick wub­ble’s Jus­ticeBot. Beyond rank 1 all the Ranks are the­o­ret­i­cally similar, but you’ve got to con­sider what other bots will do. It’s con­ceiv­able that some bots won’t be­lieve they’re play­ing a MimicBot un­til they hit a deep enough re­cur­sion depth. It’s pos­si­ble that some bots will (in­cor­rectly) think that you’re ex­ploitable if you re­turn too quickly. There­fore I recom­mend a high N.

If you’re par­tic­u­larly para­noid, con­sider ran­dom­iz­ing N. Hu­mans are no­to­ri­ously bad at pick­ing ran­dom num­bers, and the MimicBot clique could in the­ory be ex­ploited by a bot who ex­ploits the ranks that hu­mans are more likely to pick. Set­ting your counter to (+ 3 (ran­dom 1000)) is a quick way to counter that.

You can also shake things up a bit by us­ing non-stan­dard coun­ters. For ex­am­ple, you could write a Timed MimicBot which passes down the ini­tial time (in­stead of a counter) to its quines and puts forth a Co­op­er­ateBot af­ter a cer­tain amount of time has passed (in­stead of when the counter hits zero).

## How do I write one of these?

You write one of these with a de­grad­ing quine. Each level should eval­u­ate the op­po­nent against a slightly weeker ver­sion of it­self. You can do this eas­ily by putting a counter in your bot. So long as the counter is pos­i­tive, the quin­ing func­tion should re­turn a quine with the counter decre­mented. Once the counter hits zero, the quin­ing func­tion should re­turn a Co­op­er­ateBot.

## Sounds nice, but how do I write a ‘de­grad­ing quine’?

It’s ac­tu­ally pretty easy in scheme.

1. Define (quine (lambda (code) ‘place­holder)) and (tem­plate ’place­holder). Write the rest of your logic, pre­tend­ing that (quine tem­plate) gen­er­ates a child quine.

2. Man­u­ally copy all of your code and paste it over the tem­plate place­holder. In the pasted code, find all of the “holes” (parts that can vary: the counter, the tem­plate, your co­op­er­a­tion pred­i­cate, etc.) and re­place them with odd nega­tive num­bers.

3. Write the quine func­tion to walk through the code. If it sees a nega­tive in­te­ger, figure out which hole it is by check­ing (+ code 1). This al­lows you to re­place the −3 hole with­out hav­ing any −3s any­where else in the code and so on.

4. Copy the real quine func­tion into the tem­plate’s quine func­tion.

If you make any more code changes af­ter this pro­cess, re­mem­ber to mir­ror them in your tem­plate. It will end up hav­ing this form:

(le­trec [(counter (+ 3 (ran­dom 100)))
(tem­plate ‘(le­trec [(counter −1)
(tem­plate −3)
…
(quine (lambda (code)
(cond
; The −1 hole is for the counter.
((and (in­te­ger? code) (nega­tive? code) (= 0 (+ 1 code)))
(- counter 1))
; The −3 hole is for the tem­plate.
((and (in­te­ger? code) (nega­tive? code) (= −2 (+ 1 code)))
’,tem­plate)
…


In or­der to make your MimicBot bot­tom out, you’ll want to define a pred­i­cate that is similarly flex­ible. It should be along the lines of ((eval them) (quine tem­plate)) so long as the counter is pos­i­tive and #t when the counter gets to zero. The body of the le­trec can then be as sim­ple as (if pred­i­cate ‘C ’D).

You’ll want some ex­tra code to spawn the simu­la­tion in a thread and kill it if it goes over­time and so on.

If we all write bots like this, we can form a very pow­er­ful group ca­pa­ble of a wide range of mu­tual co­op­er­a­tion and very difficult to ex­ploit.

Join me. With our com­bined strength, we can end this de­struc­tive con­flict and bring or­der to the galaxy.

• The MimicBots will dive into an eval­u­a­tion loop, where my bot eval­u­ates yours on a quine of me, and you eval­u­ate my quine on a quine of you, ad in­fini­tum.

MimicBot, as I de­scribed it, breaks out of a simu­la­tion prob­a­bil­is­ti­cally; it will al­most surely not fall into an in­finite depth of simu­la­tions. MimicBot co­op­er­ates with prob­a­bil­ity ε, and has an ex­pected simu­la­tion depth of 1/​ε when played against it­self. As long as ε<.5, the best ac­tion against MimicBot is to co­op­er­ate, so the ex­pected simu­la­tion depth can be as low as 2.

• MimicBot co­op­er­ates with prob­a­bil­ity ε

You mean defects with prob­a­bil­ity ε. It co­op­er­ates with prob­a­bil­ity 1-ε.

• MimicBot co­op­er­ates with prob­a­bil­ity ε

You mean defects with prob­a­bil­ity ε.

No, I did not mean this, but I left out an im­por­tant word: I should have said MimicBot co­op­er­ates un­con­di­tion­ally with prob­a­bil­ity ε.

MimicBot will al­most perfectly mir­ror the strat­egy of its op­po­nent. Most of the time (prob­a­bil­ity 1-ε), MimicBot re­turns the re­sult of a simu­la­tion of its op­po­nent against MimicBot. If you’re fight­ing MimicBot, you should ex­pect it to think and act al­most ex­actly the way you think and act. If you de­cide to always co­op­er­ate with MimicBot, MimicBot will de­cide to co­op­er­ate with you. If you de­cide to always defect against MimicBot, MimicBot will (al­most always) de­cide to defect against you. If you play a mixed strat­egy against MimicBot, MimicBot will play an al­most iden­ti­cal mixed strat­egy against you.

The slight im­perfec­tion in the strat­egy mir­ror (co­op­er­at­ing un­con­di­tion­ally with prob­a­bil­ity ε) is nec­es­sary to avoid in­finite recursion

is enlightened

• In­cor­rect. His MimicBot co­op­er­ates with prob­a­bil­ity ε and mimics the op­po­nent with prob­a­bil­ity 1-ε (which may re­sult in co­op­er­a­tion or defec­tion, de­pend­ing upon the op­po­nent.)

• Heh, only if ran­dom() re­turns a ran­dom num­ber in [0, 1], which isn’t speci­fied in the pseu­docode.

• I’d still recom­mend you re­frain from act­ing as Rank 0 or 1 (from co­op­er­at­ing im­me­di­ately or from simu­lat­ing the op­po­nent on a MimicBot who co­op­er­ates), as it’s likely that there are bots in play that prey on Co­op­er­ateBots and Jus­ticeBots (as de­ter­mined by check­ing if you co­op­er­ate Defec­tBot). Also, I imag­ine there will prob­a­bly be a fair num­ber of Defec­tBots on the field, your MimicBot is ex­ploited by a frac­tion of them.

I strongly recom­mend writ­ing a MimicBot that goes through two iter­a­tions be­fore al­low­ing it­self to exit with a fixed prob­a­bil­ity. Given that tweak I agree com­pletely that your MimicBot is quite pow­er­ful.

• I strongly recom­mend writ­ing a MimicBot that goes through two iter­a­tions be­fore al­low­ing it­self to exit with a fixed prob­a­bil­ity.

You can deal with those spe­cial cases that way. I was go­ing to use a flat­ter, less quiny ap­proach.

def spe­cial_cas­ing_bot(op­po­nent)
if acts_like_co­op­er­ate_bot(op­po­nent):
re­turn DEFECT
elif act_like_other_ex­ploitable_bot(op­po­nent):
re­turn DEFECT
elif spe­cial_case_3(op­po­nent):
re­turn DEFECT
elif spe­cial_case_4(op­po­nent):
re­turn COOPERATE
elif spe­cial_case_5(op­po­nent):
re­turn DEFECT
# etc
else:
re­turn mimic_bot(op­po­nent)

• Depend­ing upon the im­ple­men­ta­tion of mimic_bot, this is a quiny ap­proach. mimic_bot ob­vi­ously can’t run the op­po­nent on an ex­act quine of your­self, be­cause then you won’t achieve mu­tual co­op­er­a­tion. (When one of the bots co­op­er­ates un­con­di­tion­ally, the other will see that it acts_like_co­op­er­ate_bot and defect.) So long as mimic_bot plays op­po­nents against a pure MimicBot in­stead of a perfect quine, this should work quite well.

On an un­re­lated note, woah, how’d you get whites­pace work­ing?

• Now that the con­test is over, I will ob­serve that con­trary to your first claim...

MimicBots fit all of these pa­ram­e­ters. They make it clear that their op­po­nent is choos­ing be­tween (C, C) and (D, D). Against a MimicBot, (C, D) is not an op­tion.

...it’s ac­tu­ally rather non­triv­ial to prove that your “MimicBot” does the same thing as you, since it doesn’t run your pro­gram against it­self, it runs your pro­gram against a differ­ent pro­gram. For ex­am­ple, Pru­den­tBot defects against any of your MimicBots, since it can prove that “Jus­ticeBot defects against me, since I defect against Co­op­er­ateBot” there­fore “Jus­ticeBot+1 defects against me, since I defect against Jus­tice­bot”, etc etc. In that sense, your mimicbot is not re­ally a mimicbot at all. You could call it “sim­ply a higher-or­der jus­tice­bot”.

In con­trast, with a mimicbot such as solip­sist’s one can just re­place an in­stance of op­po­nent(me) with ‘C and see that this mas­sively in­creases the chances of his bot co­op­er­at­ing, com­par­ing to re­plac­ing the same thing with ’D, hence you should co­op­er­ate.

• There’s two types of mimicbots run­ning around: fixed-rank, and ran­dom-re­li­ant.
Which mimicbot are you an­a­lyz­ing?

• Argh, this com­ment is ir­ri­tat­ing… even be­yond the sub­tle mis­takes (for ex­am­ple, “it’s es­sen­tial that out­siders don’t figure out what rank of MimicBot you’re run­ning”… if you use even a non-obfus­cated sim­ple counter, you’re pretty safe, be­cause they’ll see it as n on their turn and n-1 on your turn, and their re­sponse to this num­ber is the same in ei­ther case).

While I do some­what like the fact that my MimicBot now is marginally less likely to be ex­ploited by a pre­de­ter­mined-spe­cific-n ex­ploiter, it’s not worth how many of the key in­sights into the prob­lem have now been given away for free! Now we’ll get a more ho­moge­nous, less in­ter­est­ing field.

If it turns out you re­ally have some fiendishly clever way to ex­ploit the en­tire MimicBot clique you just en­sured, I’ll half-for­give you when you win the con­test, but that does not seem likely.

There are some in­ter­est­ing in­sights left you haven’t given away, but I beg any­one else who thinks of them to keep them to their self un­til af­ter the con­test.

• That’s a good point about obfus­ca­tion be­ing un­nec­es­sary—I’ve up­dated my com­ment to ad­dress it. I was stuck on the fact that you can ex­ploit any rank in ad­vance, and had a vague feel­ing that you could turn this into a bot that can ex­ploit ranks on the fly. This feel­ing didn’t hold up well un­der in­spec­tion. With re­gards to your other con­cerns:

You are en­couraged to dis­cuss strate­gies for achiev­ing mu­tual co­op­er­a­tion in the com­ments thread.

- the judge

I con­sid­ered hold­ing back much of my own strat­egy, un­til I re­al­ized that there’s re­ally no such thing as an op­ti­mal bot in this con­test. For ex­am­ple, com­mon sense says that you should ex­ploit Co­op­er­ateBot for free util­ity. How­ever, if you ex­pect to face more Jus­ticeBots than Co­op­er­ateBots then you should pass up that util­ity for the op­por­tu­nity to ex­ploit the Jus­ticeBots.

This prob­lem is a so­cial en­g­ineer­ing prob­lem by con­struc­tion. The game won’t go to the bot with the clever­est code, it will go to the bot that best guesses the com­po­si­tion of the play­ing field.

• This prob­lem is a so­cial en­g­ineer­ing prob­lem by con­struc­tion. The game won’t go to the bot with the clever­est code, it will go to the bot that best guesses the com­po­si­tion of the play­ing field.

True. I just think things are more in­ter­est­ing if you let that com­po­si­tion grow out of ev­ery­one’s iso­lated ideas in­stead of gar­den­ing it ahead of time. Bear­ing your judge quote in mind, I should have said some­thing ear­lier.

• Is there a rea­son this isn’t in Main?

• My ner­vous­ness about post­ing in Main. Feel free to move it there if you think it would be ap­pro­pri­ate.

• There is a fair pos­si­bil­ity that a num­ber of pro­grams whose dom­i­nant strat­egy is “co­op­er­ate with other dom­i­nant co­op­er­a­tors, do some­thing else oth­er­wise” (what ThrustVec­tor­ing calls Ab­solutismBot “wrap­per”) will end up shar­ing the top spot with the same num­ber of points. I won­der if it makes sense to dis­cour­age this ap­proach in some way.

• That de­pends on whether play­ers get util­ity from PD pay­offs or from win­ning the tour­na­ment. A clique that wants a high to­tal pay­off will use mu­tual co­op­er­a­tion. A clique that wants to grab the #1 spot will have all play­ers sac­ri­fice to one des­ig­nated player.

• Why isn’t this just a fair re­sult?

• No per­son may con­tribute to more than one en­try.

I’m pretty sure that in­cor­po­rat­ing code writ­ten by some­one else into your en­try qual­ifies. I think the high­est sin­gle en­try might be to cheat and make ev­ery­one think you are in that tribe but defect any­way, or it might be dom­i­nant to defect against any pro­gram dis­play­ing tribal af­fili­a­tions (other than this new tribe, of course). The dom­i­nant tribe is the tribe with the most mem­bers and best tribal iden­ti­fi­ca­tion, not the tribe with the best way of judg­ing an op­po­nents in­ten­tions.

• There is a differ­ence be­tween a “tribe sys­tem” as men­tioned by your­self and one per­son win­ning by sub­mit­ting 1000 en­tries. The goal as I un­der­stand it is sim­ply to max­i­mize your score by what­ever means pos­si­ble, not ac­cu­rately guess your op­po­nents in­ten­tions.

• It is cer­tainly fair, but in a situ­a­tion where, say, there is only one es­cape pod available, you still have to fight it out with oth­ers. What I had in mind is to make this more ad­ver­sar­ial if the ob­vi­ous and bor­ing ap­proach is full or nearly full co­op­er­a­tion.

• I don’t think any­one dis­putes that in zero-sum games you play the Nash equil­ibrium move.

• Sorry, I do not fol­low. Are you say­ing that there is a guaran­teed best strat­egy in the win­ner-takes-all case? And that “co­op­er­ate with clones, defect against ev­ery­one else” is it?

• In win­ner takes all, you can’t co­op­er­ate and win, be­cause “co­op­er­ate” im­plies you didn’t take it all.

• You can still do bet­ter than your al­most-clones if the differ­ence is in how you play de­tected non-clones or non-clones mis­de­tected as al­most-clones.

• “Win­ner takes all” doesn’t have “do bet­ter”. You ei­ther take it all or you have noth­ing at all. If you didn’t win you didn’t do any bet­ter than a painted rock.

I think we don’t mean the same thing when we say “win­ner takes all”. (BTW, I only just no­ticed you used that phrase in re­ply to a com­ment where Eliezer uses “zero-sum games”, which is not at all the same thing. In a ZSG, me and a buddy can band to­gether, steal your money, and split be­tween our­selves. In WTA I have to fight my ally, I can’t share with him.)

• I prob­a­bly mi­s­un­der­stand what you mean. Peo­ple tem­porar­ily co­op­er­ate all the time even if all of them know that in the end “there can be only one”. It may be ad­van­ta­geous to band with your near-clones and hope that you are bet­ter than them when deal­ing with oth­ers, thus gain­ing a de­ci­sive edge.

• OK, I got it. (It helps if you look at the sin­gle thread, and click on “Show more com­ments above.” to see the en­tire chain.)

This con­test is sup­posed to rank strate­gies for iter­ated PD. Your one-es­cape-pod com­ment sug­gests you are in­ter­ested in a con­test that finds the win­ner stat­egy when there can be a sin­gle win­ner. That would be a very differ­ent kind of con­test, though sup­perfi­cially similar.

Ob­vi­ously, a rank­ing con­test can be trans­formed into a win­ner-pick­ing con­test, by declar­ing the first-ranked can­di­date the win­ner, if there is a mechanism for en­forc­ing the lack of ties for first place. This mechanism is not nec­es­sar­ily easy to add un­less the con­test is de­signed for it from the be­gin­ning, and since there are lots of pos­si­ble rank­ing meth­ods it is prob­a­bly of­ten im­pos­si­ble.

But due to the su­perfi­cial similar­ity (and per­haps an abuse of tech­ni­cal lan­guage), I think the dis­tinc­tion be­tween what this con­test does and what you wanted it to do was lost some­where in your ini­tial ex­change with Eliezer.

Of course, what you say about tem­po­rary co­op­er­a­tion is perfectly cor­rect. I think our ter­minol­ogy mi­s­un­der­stand­ing stems from the per­spec­tive, I was more fo­cused on the en­tire con­test (which is more ad­ver­sar­ial, as the goal is the be ranked above oth­ers), and I be­lieve you were fo­cus­ing on the in­di­vi­d­ual en­coun­ters (which are slightly more co­op­er­a­tive, since mu­tual de­struc­tion is very bad). We prob­a­bly don’t dis­agree on what the words mean, we were just think­ing of differ­ent is­sues.

(This is a con­stant dan­ger when dis­cussing games com­posed of other games (e.g. the iter­ated tour­na­ment), so we should prob­a­bly have been more ex­plicit. In ret­ro­spec­tive Eliezer us­ing the word “fair” should have been a warn­ing ev­ery­one is talk­ing about differ­ent things...)

• Will these lambda func­tions have ac­cess to ex­ter­nal re­sources, no­tably a ran­dom-num­ber gen­er­a­tor (or a source of en­tropy to im­ple­ment it in­ter­nally)?

• More gen­er­ally, the set of le­gal pro­grams doesn’t seem clearly defined. If it were me, I would be tempted to only ac­cept ex­ter­nally pure func­tions, and to pre­cisely define what parts of the stan­dard library are al­lowed. Then I would en­force this rule by mod­ify­ing the global en­vi­ron­ment such that any dis­al­lowed be­havi­our would re­sult in an ex­cep­tion be­ing thrown, re­sult­ing in an “other” re­sult.

But it’s not me. So, what ex­actly will be al­lowed?

• If you’d rather run with a very small and well-defined Scheme di­alect meant just for this prob­lem, see my re­ply to Eliezer propos­ing this kind of tour­na­ment. I made up a re­stricted lan­guage since Racket’s zillion fea­tures would get in the way of in­ter­est­ing source-code analy­ses. Maybe they’ll make the game more in­ter­est­ing in other ways?

• More gen­er­ally, the set of le­gal pro­grams doesn’t seem clearly defined.

I haven’t for­bade use of any library func­tions ex­cept for file IO. I’m not con­fi­dent this is op­ti­mal, but how is it un­der­speci­fied?

• I don’t know if this is pos­si­ble in the pro­gram­ming lan­guage in ques­tion, but you prob­a­bly don’t want any pro­grams that use some kind of trick­ery to retroac­tively change what they chose in pre­vi­ous rounds. Or im­ple­ment an­thropic com­put­ing with the Scheme equiv­a­lents of fork() and exit(). (Be­fore do­ing any­thing, call fork() to make a copy of the uni­verse. If you don’t like the re­sult, call exit() to de­stroy it.)

• Okay, it’s not. But I’m sure there’s a way to cir­cum­vent the spirit of your rule, while still abid­ing the let­ter. What about net­work I/​O, for in­stance? As in, down­load some code from some re­mote lo­ca­tion, and ex­e­cute that? Or even worse, run your code in the re­mote lo­ca­tion, where you can en­joy su­pe­rior com­put­ing power?

• Good point. File IO was too spe­cific.

• Yes, but then you’re in a very bad po­si­tion if the test is run with­out net­work ac­cess. (I.e., you’re al­lowed to use the net­work, but there’s no net­work adapter.)

• Well, I just though about it for 2 sec­onds. I tend to be a purist: if it were me, I would start from pure call-by-need λ-calcu­lus, and limit the num­ber of β-re­duc­tions, in­stead of the num­ber of sec­onds. Co­op­er­a­tion and defec­tion would be rep­re­sented by Church booleans. From there, I could ex­tend the lan­guage (ex­plicit bind­ings, fast ar­ith­metic…), provide a stan­dard library, in­clud­ing some func­tions spe­cific to this con­test.

Or, I would start from the small­est pos­si­ble sub­set of Scheme that can im­ple­ment a meta-cir­cu­lar eval­u­a­tor. It may be eas­ier to ex­am­ine an S-ex­pres­sion in Scheme than a λ-ex­pres­sion in λ-calcu­lus.

Or, I would start from lambda calcu­lus with de-Bru­jin in­dices, so we don’t have to worry about α-con­ver­sions. In this case, I would provide a com­piler from reg­u­lar λ-calcu­lus. (I sus­pect how­ever that this one doesn’t change a thing.)

Or, I would start from a Forth di­alect (prob­a­bly with an im­plicit re­turn stack).

Or, I would start from BrainFuck (only half jok­ing: the lan­guage is re­ally dead sim­ple, and fast in­ter­preters already ex­ist).

But it’s not me, and I don’t have the courage right now. If I ever im­ple­ment the nec­es­sary tools (which I may: I’m study­ing pro­gram­ming lan­guages in my spare time), then I will sub­mit them here, and pos­si­bly run a con­test my­self.

• I’m sorry, how is that rele­vant to my no-net­work-adapter com­ment? (I mean this liter­ally, not rhetor­i­cally. I don’t see the con­nec­tion. Did you mean to re­ply to a differ­ent com­ment?)

• It’s the whole thread. I was not sure where to place my com­ment. The con­nec­tion is, the net­work may not be the only source of “cheat­ing”. My solu­tions plug them all in one fell swoop.

• Oh, OK. In that case, what you are try­ing to achieve is (the­o­ret­i­cally) box­ing a (po­ten­tial) AGI, with­out a gate­keeper. Which is kind of overkill in this case, and wouldn’t be solved with a choice of lan­guage any­way :)

• I think it is pos­si­ble to prove that a given box­ing works, if it’s suffi­ciently sim­ple. Choos­ing the lan­guage isn’t enough, but choos­ing the in­ter­preter should be.

Take Brain­fuck for in­stance: re­place the dot (‘.’), which prints a char­ac­ter, by two other state­ments: one that prints “yes” and ex­its, and one that prints “no” and ex­its. If the in­ter­preter has no bug, a pro­gram can only:

• Print “yes” and kill it­self.

• Print “no” and kill it­self.

• Do noth­ing un­til we kill it, or oth­er­wise fail.

As­sum­ing the AI doesn’t con­trol the ex­ter­nal world by heat­ing the host In­tel pro­ces­sor in a smart way, we should be able to prove that we’re oth­er­wise safe.

• What will it be seeded with? I sug­gest us­ing some­thing that would be prac­ti­cally im­pos­si­ble for any­body to guess any bet­ter than at max­i­mum en­tropy be­fore sub­mit­ting their pro­gram, but could eas­ily serve as a noth­ing-up-my-sleeve num­ber af­ter the tour­na­ment, e.g. the hash of (say) the first Bit­coin block gen­er­ated af­ter the sub­mis­sion dead­line.

• This seems good in the­ory, but how much fiddly hack­ing would be re­quired in prac­tice to iden­tify and ex­tract this Bit­coin block for hash­ing?

• Another pos­si­bil­ity would be the hash of re­sults of a few pub­lic lot­ter­ies, foot­ball matches, weather data etc. on the evening of 5 July, taken from pre­vi­ously agreed sources and writ­ten in a pre­vi­ously agreed for­mat.

• I have two bots I’d like to sub­mit—one of which will likely win, and one of which is in­ter­est­ing.

Any chance we can al­low two sub­mis­sions per pro­gram­mer?

• I won’t let one per­son sub­mit two pro­grams to com­pete in the tour­na­ment, but I came up with a pos­si­ble com­pro­mise. If you want, you can send me two pro­grams and tell me which one you want to com­pete. I’ll run the non-com­peti­tor against the other pro­grams as well, and tell you the re­sults, but the non-com­peti­tor will not be ranked, and rounds in­volv­ing the non-com­peti­tor will not be counted to­wards the scores of other sub­mis­sions.

• Okay, fair enough. I don’t see why the scor­ing changes for N=1 vs N=2 on the num­ber of bots per player (edit: if any­thing it’s more in­ter­est­ing since then you have an in­cen­tive to make your bots co­op­er­ate), but I’ll just hold back the other de­sign for now—on the off chance we do a tour­na­ment like this again :p

• I’m cu­ri­ous what the ra­tio­nale for this is, could you share it with us?

Edit: If it’s got to do with the scor­ing, I got it.

• Yes, scor­ing.

• On fur­ther re­flec­tion this seems to be a prob­lem with the scor­ing. (As­sum­ing that the point of the ex­er­cise is to in­ves­ti­gate PD pro­grams, not to grant medals to LW users.) But I don’t have a bet­ter pro­posal, so take this more like mus­ings rather than cri­tique.

• I think that sort of effect is in­evitable.

• one of which will likely win,

Care to put a prob­a­bil­ity on that pre­dic­tion?

• Sure. The SHA1 of my odds in the fol­low­ing for­mat:

“xx% to win +-y% spread {pass­word}”

(mean­ing I’d sell you me to win for more than xx% + y%, and buy me to win from you at xx% - y%).

is:

897a40baca33e2a53a49bd­ddc00abede82713a7c (sha1-on­line.com)

Give me your odds and de­sired size. If there’s over­lap, let’s split the over­lap differ­ence bet :)

• Re­ply­ing to pre­vent the edit mark (given the sha1) -- last sen­tence was sup­posed to be “split the over­lap differ­ence and bet :)”. Cur­rency needs to be USD or BTC, let’s have a rep­utable Berkeley area LWer do­ing the es­crow.

• I guess it’s Pri­soner’s Dilemma Week here on LessWrong!

• In­ter­est­ing.

Is a pro­gram which uses your proof ap­proach (try to prove the other pro­gram will co-op­er­ate with me, and if so co-op­er­ate with them) any­where close to fea­si­ble in this con­test, or is it just go­ing to time-out at the 10s limit? I don’t know how effi­cient au­to­mated provers are these days.

Failing a proof, are there sim­ple ways of get­ting ev­i­dence the other pro­gram is likely to co-op­er­ate? Just try­ing to simu­late it is also go­ing to lead to time-outs.

My sus­pi­cion is that this con­test is go­ing to be won by peo­ple who pre-agree to sub­mit es­sen­tially the same CliqueBot. Some­one will post a plau­si­ble de­sign, and other en­trants will copy with tweaks...

• See­ing as the pro­gram has to in­clude a the­o­rem prover, a large piece of soft­ware that will need to prove the­o­rems about it­self and the other guy’s prover… I’d say Löbian co­op­er­a­tion is not your best bet in this tour­na­ment :-)

But maybe Pa­trick could hold his own tour­na­ment. Mihaly and Mar­cello wrote a work­ing simu­la­tor which can han­dle “modal agents” like Pa­trick’s Pru­den­tBot.

• I wish to sec­ond this pro­posal.

• sin­gle Scheme lambda

What scaf­fold­ing are you go­ing to use for the tests? (For ex­am­ple: #!racket seems to be im­plied. I’d like to be sure of all of your de­tails.)

• Ten­ta­tively: I’ll paste “(YourCode TheirCode)” into the in­ter­preter with DrRacket, with #lang scheme.

Edit: Oops, that doesn’t en­force the time limit. Just a sec while I figure this out.

Edit2: I tried this:

(define (kill-when-done thd) (sleep 10) (kill-thread thd) (print ’time-is-up))

but un­for­tu­nately threads are not guaran­teed to start back up again as soon as sleep al­lows them to; it took about 18 sec­onds to ter­mi­nate when I ran the sec­ond line with “your-code” be­ing an in­finite loop. I’ll figure out how to do this prop­erly to­mor­row.

Edit3: A mar­velously im­proper but cor­rect way to do it:

(be­gin (define x (cur­rent-mil­lisec­onds)) (print (your-code their-code) (- (cur­rent-mil­lisec­onds) x))

Allow to run more than 10 sec­onds by look­ing at clock at stop­ping man­u­ally. Throw out re­sult if it says it took more than 10 sec­onds.

• Keep in mind that a lot of user-sub­mit­ted pro­grams will try to do the same (be­cause writ­ing a step-by-step in­ter­preter is hard), so they would keep eval­u­at­ing each other spawn­ing watch­dog threads ev­ery time, so, um, your watch­dog thread would be badly out­num­bered.

The easy fix for you would be to run your watch­dog in a sep­a­rate pro­cess, but play­ers wouldn’t have this abil­ity, which might make things ei­ther more in­ter­est­ing or bor­ing (rul­ing out all strate­gies us­ing eval). Maybe a spe­cially de­signed re­stricted sub­set of Scheme with time-re­stricted eval would be a bet­ter choice?

By the way, af­ter look­ing at your pay­out ma­trix to see what should I do if I see the op­po­nent us­ing eval, looks like you have in fact cre­ated a ver­sion of PD with three choices. Not in­cen­tiviz­ing the third choice doesn’t re­ally help be­cause a pro­gram still has to con­sider the pos­si­bil­ity that the other pro­gram chooses “other” due to a bug, in which case it should choose Defect.

I sug­gest you im­ple­ment the stan­dard PD by declar­ing that any­thing that is not Co­op­er­ate is Defect. The only down­side would be that you’ll see some­what more pro­grams us­ing all their al­lot­ted 10s, but you’ll see a lot of those ei­ther way. At least you’ll be able to say that this com­pe­ti­tion was about the ac­tual PD.

• Ten­ta­tively: I’ll paste “(YourCode TheirCode)” into the in­ter­preter with DrRacket, with #lang scheme.

The most el­e­gant, it seems to me, would be to re­quire en­tires to be sum­it­ted as a .rkt file that defines a mod­ule pro­vid­ing a sin­gle vari­able (pos­si­bly with a re­quired name, eg “bot”) that must be a quoted ex­pres­sion.

This makes it sim­ple to call pro­grams (eval the ex­pres­sions), pass op­po­nent’s source code, or even provide pro­grams with their own source code; and could be used to au­to­mate static check­ing of var­i­ous con­straints, such as the ban on file I/​O.

It also al­lows more flex­i­bil­ity for sub­mit­ters in con­struct­ing their en­try, for in­stance I was able to con­sid­er­ably stream­line the source code to the CliqueBot tem­plate:

(define matcher
‘(let pat­tern-match ((pat­tern tem­plate) (value op­po­nent))
(cond
((pair? pat­tern) (and (pair? value)
(pat­tern-match (car pat­tern) (car value))
(pat­tern-match (cdr pat­tern) (cdr value))))
((num­ber? pat­tern) (case (+ pat­tern 1)
((1) #t)
((3) (equal? value tem­plate))
(else (and (num­ber? value)
(= pat­tern value)))))
(#t (eq? pat­tern value)))))


(this is the “workhorse” part, pro­vid­ing the pro­gram as a quoted ex­pres­sion ob­vi­ates the need for re­peat­ing it twice, as we can then lev­er­age the quasiquote and un­quote lan­guage fea­tures)

(define clique-templ
(quasiquote
(let ((tem­plate ’2))
(lambda (op­po­nent)
(if (un­quote matcher)
’C
0)))))


(this is the “tem­plate”, it will be ex­panded to in­clude the pro­gram-match­ing por­tion)

(define clique-bot
(quasiquote
(let ((tem­plate (quote (un­quote clique-templ))))
(lambda (op­po­nent)
(if (un­quote matcher)
‘C
’D)))))


(this is the ac­tual pro­gram, it both quotes (via in­clu­sion of the tem­plate) and uses the pro­gram-match­ing part).

The top of the file would con­sist of the fol­low­ing dec­la­ra­tion:

#lang scheme
(provide clique-bot)


...or if you’re re­quiring that all en­tries use the same name “bot” (which could make it eas­ier to ad­minis­trate the ac­tual run­ning of the con­test):

#lang scheme
(provide (re­name-out (clique-bot bot)))

• You should be able to use this which I just worked out, to run some­thing with a time­out. Seems to be work­ing by my test­ing. It might be overkill to run the whole thing in a sub­thread but it makes cer­tain that noth­ing in­terferes with the use of send and re­ceive here. You would nor­mally use it with a lambda tak­ing no ar­gu­ments, for ex­am­ple (us­ing my ue­val)

(time­out-exec 10 (lambda () ((ue­val x) y)) ’time­out)


which is what I’ve been us­ing to test can­di­dates against each other lo­cally.

• He said he didn’t want to use sleep, since the ar­gu­ment is only a lower bound on the amount of time it takes.

• Ah, right, I see it now. I guess you can check the cur­rent-mil­lisec­onds af­ter the fact, and force the de­fault value in that case. But looks like this is go­ing to be a prob­lem if I try to safely simu­late other agents… Ac­tu­ally, I sup­pose it’s pos­si­ble for the tar­get thread to similarly not get re­turned to for a long time, caus­ing any watch­dog to over­es­ti­mate the time it used. cur­rent-pro­cess-mil­lisec­onds might help with that, but I’m not sure if it can deal with nested threads use­fully.

• As writ­ten, this doesn’t work; print only takes one printee ar­gu­ment, with other op­tional ar­gu­ments.

• Oops.

(be­gin (define x (cur­rent-mil­lisec­onds)) (print (your-code their-code)) (print (- (cur­rent-mil­lisec­onds) x)))

• Note for the con­fused and/​or in­ter­na­tional, like me: 12:00 July 5 PDT is equiv­a­lent to 19:00 July 5 UTC.

• I’d like to ex­plore the differ­ences to the spirit of the com­pe­ti­tion if, in­stead of ac­cess to the source code of the op­po­nent, play­ers only had an opaque, non-mod­ifi­able, callable en­tity. Feel free to add other ad­di­tional in­ter­est­ing con­se­quences.

• Clique­bots—Opaque source code would make the cre­ation of cliques far more in­ter­est­ing, re­quiring col­lud­ers to con­struct far more difficult pro­to­cols to de­tect other CliqueBots, and to prove their own CliqueBot sta­tus. In­suffi­cient con­sid­er­a­tion be­fore the com­pe­ti­tion would al­low for CliqueBot cheaters to take ad­van­tage of weak crite­ria. Net effect is re­duc­ing the like­li­hood of a CliqueBot sweep.

• Search­a­bil­ity of Spe­cial Cases Trans­par­ent source code makes the search space for spe­cial cases eas­ier and in some cases, even pos­si­ble at all. A pro­gram that ex­plic­itly writes. “always defect on pro­gram XYZ,” where the bit­string spec­i­fy­ing XYZ is ar­bi­trar­ily large, can be dis­cov­ered to be so by in­spec­tion. With opaque code, the caller could never dis­cover XYZ within 10 sec­onds for large enough strings.

• Re­cur­sion Guards/​Sleeps—Mod­ifi­able code al­lows one to triv­ially re­move, mod­ify, or add re­cur­sion guards and sleeps. With opaque code, a naively writ­ten “Call op­po­nent with my source code” gives the op­po­nent a time ad­van­tage on re­turn­ing a re­sult, caus­ing you to run the risk of not-halt­ing if you try to wait for the re­sult, and then the op­po­nent defects on you. Mod­ifi­able code forces op­po­nents to be clever about how theysleep, but in an an­noy­ingly hard­ware and lan­guage-de­pen­dent man­ner (cy­cles per sec­ond). Other­wise you can just re­move their sleep func­tions and spoof their timing, and so dis­cover their re­sult with­out the time ad­van­tage.

• Stack Count­ing and VMs—Inac­cessible source code makes eas­ier the cre­ation of pro­grams which de­ter­mine the na­ture of their caller. Ac­cess to source code al­lows for triv­ially re­mov­ing stack count­ing from a pro­gram be­fore call­ing it. Without ac­cess, one must write a VM.

• Two ques­tions:

1. Will you be us­ing #lang racket or #lang scheme? There are some differ­ences be­tween the two and #lang racket is gen­er­ally recom­mended with #lang scheme sup­ported only for back­ward com­pat­i­bil­ity. (You stated Racket in the OP but #lang scheme in a com­ment.)

2. When co­op­er­at­ing, should pro­grams re­turn the sym­bol C (i.e., ‘C or (quote C)), or should they re­turn the string “C”? (Your ex­am­ple pro­gram re­turns ’C but the OP also men­tions “C” and “D” in dou­ble-quotes.) I would’ve thought that the string is safer, but I haven’t touched Racket in over a year so maybe I’m miss­ing some idiom.

• I said #lang scheme for back­wards com­pat­i­bil­ity be­cause I’m un­der the im­pres­sion that more peo­ple already know scheme. Do you think that’s a bad idea?

Pro­grams should re­turn the sym­bol ’C. I was us­ing dou­ble quotes for their English-lan­guage mean­ing. Sorry about the con­fu­sion.

1. I found it a bit con­fus­ing be­cause #lang scheme refers to the old PLT Scheme (now Racket), not the stan­dard RnRS im­ple­men­ta­tion. For peo­ple un­fa­mil­iar with Racket—like me—it may be un­clear which of the many Scheme im­ple­men­ta­tions #lang scheme refers to. But I think ei­ther is fine as long as it is clear to ev­ery­one which one will be used.

2. Okay.

I just wanted to clar­ify these so that schem­ing minds can’t stir up a racket af­ter the tour­na­ment, es­pe­cially since I’m offer­ing BTCs to the win­ner. Thanks for or­ga­niz­ing this!

• What would be even more fas­ci­nat­ing is a round of this con­test where the rules al­low for self-mod­ifi­ca­tion.

As I’m think­ing about what to sub­mit, I find my­self think­ing “it would be use­ful to know what pro­grams are go­ing to be en­tered into the con­text”. For in­stance, if you knew that a num­ber of CliqueBot en­tires were go­ing to be pre­sent, you’d have a strong mo­ti­va­tion to make yours a copy.

On re­flec­tion, it feels strange to think that way—it feels like I am wish­ing for in­for­ma­tion that my pro­gram is go­ing to have any­way. “If I know how I want my pro­gram to be­have if there are copies of pro­gram X in the con­test, then I can just pro­gram it to rec­og­nize the pro­gram passed to it as X and be­have ac­cord­ingly.”

But there’s the rub—your pro­gram can de­cide how to play, but it can­not (once sub­mit­ted) de­cide what to be.

As the other post men­tions, “there’s one par­tic­u­larly irk­some is­sue with CliqueBot, and that’s the frag­ility of its co­op­er­a­tion”—you can’t be sure for in­stance how many en­tries adopt this tem­plate ex­actly, ver­sus one that dis­penses with the un­nec­es­sary call to get the cur­rent time (as I pro­posed).

(That’s what I meant by my ear­lier re­mark about “so­cial en­g­ineer­ing”—if I can con­vince peo­ple that it’s bet­ter to adopt my var­i­ant of the CliqueBot, then that’s the var­i­ant that will win more points from mu­tual co­op­er­a­tion.)

What if your pro­gram could self-mod­ify, though? What if the con­test rules let your pro­gram see the source code of all pro­grams you are com­pet­ing with, ahead of time, not just the one you’re go­ing up against in each in­di­vi­d­ual bat­tle? Then you could for in­stance no­tice that one par­tic­u­lar copy of CliqueBot is par­tic­u­larly nu­mer­ous, and self-mod­ify to adopt that tem­plate.

Of course, to be fair, the con­test should then in­form all other pro­grams that you have so self-mod­ified. Pro­grams whose strat­egy in­volves simu­lat­ing other pro­grams might also want to simu­late the other pro­grams’ re­ac­tions to it self-mod­ify­ing in a par­tic­u­lar way, be­fore ac­tu­ally self-mod­ify­ing.

• Us­ing ThrustVec­tor­ing’s idea, I have a tem­plate other peo­ple can use to make a clique-team. The fol­low­ing code

(let ((time (cur­rent-in­ex­act-mil­lisec­onds))
(tem­plate ‘(let ((time (cur­rent-in­ex­act-mil­lisec­onds))
(tem­plate ’2))
(lambda (op­po­nent)
(if (let pat­tern-match ((pat­tern tem­plate)
(value op­po­nent))
(cond
((pair? pat­tern) (and (pair? value)
(pat­tern-match (car pat­tern) (car value))
(pat­tern-match (cdr pat­tern) (cdr value))))
((num­ber? pat­tern) (case (+ pat­tern 1)
((1) #t)
((3) (equal? value tem­plate))
(else (and (num­ber? value)
(= pat­tern value)))))
(#t (eq? pat­tern value))))
’C
0)))))
(lambda (op­po­nent)
(if (let pat­tern-match ((pat­tern tem­plate)
(value op­po­nent))
(cond
((pair? pat­tern) (and (pair? value)
(pat­tern-match (car pat­tern) (car value))
(pat­tern-match (cdr pat­tern) (cdr value))))
((num­ber? pat­tern) (case (+ pat­tern 1)
((1) #t)
((3) (equal? value tem­plate))
(else (and (num­ber? value)
(= pat­tern value)))))
(#t (eq? pat­tern value))))
’C
(if (= (* 369 271) 99999)
’D
’C))))


Tests weather the op­po­nent code is based on the same pat­tern, and if so, co­op­er­ates. Other­wise it ver­ifies an ar­ith­meti­cal fact and then defects.

ETA: Sorry, I was writ­ing this in a hurry. To clar­ify, the idea was to cre­ate a Schel­ling point for a spe­cific Ab­soutismBot wrap­per which will help all of the sub­mit­ters to co­op­er­ate. Un­for­tu­nately, this cur­rent pro­gram doesn’t work (and the in­den­ta­tion doesn’t work ei­ther). I will try to mod­ify it in the fu­ture.

ETA: Boy, the origi­nal is so bug-rid­den it’s funny. Any­ways, I tested this and it should work. Use this for Schel­ling point needs.

ETA: Bah, I give up in try­ing to get it in­dented prop­erly.

ETA: AlexMen­nen has clar­ified the rules so that my meta-strat­egy is ille­gal.

• Copy­ing code writ­ten by oth­ers would be con­sid­ered a vi­o­la­tion of the one sub­mis­sion per per­son rule.

• Is there a way to have this pat­tern match with­out hav­ing the be­hav­ior match, or to in­cor­po­rate two such pat­terns into one pro­gram?

• The timing clause in the ini­tial let is su­perflu­ous. Time doesn’t en­ter into the match­ing, which is re­ally all that the tem­plate needs; if you need timings to deal with non-CliqueBot play­ers differ­ently, you can get them in the “do some­thing else” sec­tion.

• The idea be­hind the timing clause is so that the pro­gram will know pre­cisely when it started run­ning, rather than when the pat­tern-match­ing was com­pleted. Now, I did this test:

> (define func (eval code0))
> (time (func code0))
cpu time: 0 real time: 1 gc time: 0
’C


which re­veals that the differ­ence isn’t sig­nifi­cant. How­ever, now that the code is out in the open, I’m not chang­ing it.

• Is each par­ti­ci­pant limited to sub­mit­ting a sin­gle pro­gram? Have you con­sid­ered “team mode”, where the re­sults of pro­grams from a sin­gle team are summed up?

• Is each par­ti­ci­pant limited to sub­mit­ting a sin­gle pro­gram?

Yes.

Have you con­sid­ered “team mode”, where the re­sults of pro­grams from a sin­gle team are summed up?

No. A team can work to­gether to sub­mit one pro­gram. But I don’t see the point of adding up the scores of sep­a­rate pro­grams like that.

• It oc­curs to me that a con­test doesn’t have the right in­cen­tives for the pris­on­ers dilemma, and this seems un­fix­able. For ex­am­ple, when four bots play each other you can have the fol­low­ing situ­a­tion: A mu­tu­ally co­op­er­ates with B and mu­tu­ally defects against C,D; and B,C,D mu­tu­ally defect with each other. In that case A “wins” the con­test (tied for first place with B).

But if A mu­tu­ally co­op­er­ates with B and C and mu­tu­ally defects against D; and B,C,D mu­tu­ally co­op­er­ate with each other, then A loses to B and C (who tie for first place). Even though, prop­erly, this is a bet­ter out­come for A than the pre­vi­ous one, since A won more util­ity.

• The two ex­am­ples differ in games be­tween B, C, and D, which don’t in­volve A at all. So I don’t see a prob­lem: even though A has done bet­ter in the sec­ond ex­am­ple, B and C have im­proved more, so they win in­stead. You can con­struct some­thing similar in a round-robin chess tour­na­ment, where you can win one tour­na­ment with a lower score than an­other tour­na­ment you lost.

Can you con­struct an ex­am­ple where only games in­volv­ing A are changed?

• This is a highly ar­tifi­cial ex­am­ple, but con­sider the fol­low­ing...

Case 1: A mu­tu­ally defects against B, and mu­tu­ally defects against N other bots. B mu­tu­ally defects with those bots too. As­sume the N bots defect among them­selves. The con­test is a draw.

Case 2: A co­op­er­ates with B, who defects. A mu­tu­ally co­op­er­ates with the N other bots. And of course, B still mu­tu­ally defects against those N bots.

For suffi­ciently large util­ities of the (D, C) out­come, in this case B would win the con­test, even though A got an im­proved score by co­op­er­at­ing with those other N bots (minus a small penalty for get­ting (C, D) in­stead of (D, D) against B).

• Oh, I think I see. In a con­ven­tional tour­na­ment set­ting, the same idea can ap­ply: if B beats C and A can beat ex­actly one of them, it’s bet­ter to win against B and lose against C rather than vice versa.

This doesn’t usu­ally come up in con­ven­tional tour­na­ments be­cause win­ning is be­lieved to be tran­si­tive: mak­ing your­self a bet­ter player makes you more likely to beat both B and C. Here, on the other hand, there may be trade-offs. For ex­am­ple, it’s prob­a­bly worth­while to use a trick that achieves (D,C) against Real­lyS­martBot if it re­quires mu­tu­ally co­op­er­at­ing with Co­op­er­ateBot, rather than co­op­er­at­ing with the former and ex­ploit­ing the lat­ter: Real­lyS­martBot is a dan­ger­ous com­peti­tor and Co­op­er­ateBot prob­a­bly isn’t.

I don’t ac­tu­ally see a way to take this into ac­count in the cur­rent metagame set­ting. But this could be an in­ter­est­ing thing to think about af­ter we know the re­sults.

• I guess it both­ers me that in the cur­rent metagame all the con­tes­tants are aiming to win, rather than to max­i­mize their score (ex­cept for the ones who are just aiming to troll the rest of us by sub­mit­ting hu­morous bots). In that sense the situ­a­tion is not re­ally a true pris­oner’s dilemma.

• I won­der if there’s a chance of the pro­gram that always col­lab­o­rates win­ning/​tieing.

If all the other pro­grams are ex­tremely well-writ­ten, they will all col­lab­o­rate with the pro­gram that always col­lab­o­rates (or else they aren’t ex­tremely well-writ­ten, or they are vi­o­lat­ing the rules by at­tempt­ing to trick other pro­grams).

• What libraries, if any, are we al­lowed to im­port? Can I im­port racket/​ libraries, since we’re run­ning in dr­racket?

• You can­not im­port libraries.

• Sorry for the petty com­plaint, but I’m dis­in­clined to par­ti­ci­pate be­cause I don’t like Lisp syn­tax (and don’t want to have at learn a new pro­gram­ming lan­guage just for this any­way).

• Same here.

If it were in Java or Python, we could get a whole lot more par­ti­ci­pants. And more read­ers who un­der­stand the source of the sub­mis­sions.

• If it were in Java or Python, we could get a whole lot more par­ti­ci­pants.

I very much doubt this. At best we’d get a lot more peo­ple who are ex­cited to par­ti­ci­pate but loose in­ter­est once they re­al­ize they have no idea how to write a pro­gram to parse the op­pos­ing bot.

• Agreed—need some­thing with easy pars­ing for this.

Hon­estly, I feel like pars­ing Scheme is still maybe a bit too hard—es­pe­cially since we have things like mul­ti­thread­ing (ugh) available. Maybe a sub­set of Scheme would have been bet­ter? Oh well. For­tu­nately we have peo­ple like So8ers mak­ing things eas­ier!

• Hon­estly, I feel like pars­ing Scheme is still maybe a bit too hard—es­pe­cially since we have things like mul­ti­thread­ing (ugh) available. Maybe a sub­set of Scheme would have been bet­ter? Oh well. For­tu­nately we have peo­ple like So8ers mak­ing things eas­ier!

For­tu­nately that as­pect of pars­ing (de­tect­ing the use of dis­al­lowed fea­tures) is the eas­iest to man­age. Choose a sub­set your­self, an­nounce here what you con­sider ac­cept­able and de­clare that any use of any­thing not in that sub­set will be treated as a defec­tion.

• Yes, I was think­ing of that; the prob­lem is I’m do­ing this quite late, and am not even cur­rently cer­tain just what I’ll count as a defec­tion, if I can even get this to work...

(Worse yet, I’m find­ing that my cur­rent pro­gram may have to rely on fea­tures I want to dis­al­low! Although quite pos­si­bly in a way that won’t get them de­tected as dis­al­lowed...)

• Agreed that even Scheme is a bit much for static anal­y­sis. We could ask AlexMen­nen to dis­al­low mul­ti­thread­ing and timing. Would there be ob­jec­tions to that?

Just cu­ri­ous: AlexMen­nen how many sub­mis­sions have you re­ceived? Do any use mul­ti­thread­ing or timing libraries?

• I’d ten­ta­tively ob­ject. The 10-sec­ond time­out rule be­comes too much of a prob­lem if your pro­gram can’t guard against it, and chang­ing the rules so that failure to play a real move in time = defect would also make static anal­y­sis a ma­jor pain.

• Yes, I was think­ing of that; the prob­lem is I’m do­ing this quite late, and am not even cur­rently cer­tain just what I’ll count as a defec­tion, if I can even get this to work...

The ob­vi­ous op­tion is to start by count­ing ev­ery­thing as a defec­tion and use what­ever time you have re­main­ing (that you are will­ing to spend) adding new things that your code is smart enough to ver­ify as con­di­tional-co­op­er­a­tion. The smarter your code is the more co­op­er­a­tive it can be!

(Worse yet, I’m find­ing that my cur­rent pro­gram may have to rely on fea­tures I want to dis­al­low! Although quite pos­si­bly in a way that won’t get them de­tected as dis­al­lowed...)

There is no strong rea­son the “I have to count this as defec­tion list” you use must be the same as what you ac­tu­ally use in your own code (un­less you think your out-of-game sig­nal­ling is par­tic­u­larly pow­er­ful). (As­sum­ing vaguely sane but bounded com­peti­tors) you want your code to be both as smart as pos­si­ble (al­low­ing more po­ten­tial mu­tual co­op­er­a­tion op­por­tu­ni­ties) and as sim­ple as pos­si­ble (al­low­ing agents as dumb as pos­si­ble to see that they can co­op­er­ate with you). Ideally your agent would be far, far sim­pler to un­der­stand than what it is able to co­op­er­ate with but if you can’t man­age that it isn’t a strict deal breaker.

• There is no strong rea­son the “I have to count this as defec­tion list” you use must be the same as what you ac­tu­ally use in your own code

Cer­tainly true—if it made it so that my pro­gram wouldn’t co­op­er­ate with it­self, that would be a prob­lem, but here the “for­bid­den” stuff is hid­den away an area my pro­gram would never ac­tu­ally an­a­lyze.

The prob­lem is just that, if I need to use this, that sug­gests so will other peo­ple; and that means I’m prob­a­bly go­ing to be defect­ing in a lot of cases where I shouldn’t.

(As­sum­ing vaguely sane but bounded com­peti­tors) you want your code to be both as smart as pos­si­ble (al­low­ing more po­ten­tial mu­tual co­op­er­a­tion op­por­tu­ni­ties) and as sim­ple as pos­si­ble (al­low­ing agents as dumb as pos­si­ble to see that they can co­op­er­ate with you). Ideally your agent would be far, far sim­pler to un­der­stand than what it is able to co­op­er­ate with but if you can’t man­age that it isn’t a strict deal breaker.

Argh, I didn’t even think about the “it should be sim­ple to be visi­bly co­op­er­at­ing” bit. Try­ing to do static anal­y­sis on Scheme (with its conds and cases and quasiquotes) is enough to make my pro­gram com­pli­cated. Maybe I should just sub­mit a MimicBot in­stead.

Btw, just in case any­one was think­ing of do­ing such a thing af­ter the dec­la­ra­tions so far: If you do any re­defin­ing of syn­tax, I’m defect­ing on you.

• Argh, I didn’t even think about the “it should be sim­ple to be visi­bly co­op­er­at­ing” bit. Try­ing to do static anal­y­sis on Scheme (with its conds and cases and quasiquotes) is enough to make my pro­gram com­pli­cated. Maybe I should just sub­mit a MimicBot in­stead.

Does MimicBot run the op­po­nent and then re­turn the same re­sult? If so then I sug­gest a time­out/​defect fea­ture. Any bot that fails against it­self has prob­lems!

Are there go­ing to be Co­op­er­ateBots in the com­pe­ti­tion? If that is a pos­si­bil­ity then con­sider a sim­ple tweak “If my op­po­nent does not make use of its op­po­nent vari­able then defect else Mimic”. (Failing against un­obfus­cated Co­op­er­ateBot is at least as em­bar­rass­ing as failing against self.)

Btw, just in case any­one was think­ing of do­ing such a thing af­ter the dec­la­ra­tions so far: If you do any re­defin­ing of syn­tax, I’m defect­ing on you.

Any­one who tries that must and doesn’t ex­pect defec­tion must se­ri­ously hold the op­po­si­tion in con­tempt!

• Does MimicBot run the op­po­nent and then re­turn the same re­sult? If so then I sug­gest a time­out/​defect fea­ture. Any bot that fails against it­self has prob­lems!

By MimicBot I mean the strat­egy given in this com­ment. It’s not great, but it’s bet­ter than not ever finish­ing.

Are there go­ing to be Co­op­er­ateBots in the com­pe­ti­tion? If that is a pos­si­bil­ity then con­sider a sim­ple tweak “If my op­po­nent does not make use of its op­po­nent vari­able then defect else Mimic”. (Failing against un­obfus­cated Co­op­er­ateBot is at least as em­bar­rass­ing as failing against self.)

Hm, that’s not a bad idea. My cur­rent pro­gram is es­sen­tially try­ing to mea­sure to what ex­tent their out­put de­pends on mine, but if I can’t finish that in time, that’s a sim­ple tweak that can be added...

• At best we’d get a lot more peo­ple who are ex­cited to par­ti­ci­pate but loose in­ter­est once they re­al­ize they have no idea how to write a pro­gram to parse the op­pos­ing bot.

At worst that is what would hap­pen.

Java and Python are many or­ders of mag­ni­tude more pop­u­lar than Scheme and if only 10% of the peo­ple who get ex­cited about par­ti­ci­pat­ing ac­tu­ally know how to parse than we would still have much greater num­bers.

• Java and Python are many or­ders of mag­ni­tude more pop­u­lar than Scheme and if only 10% of the peo­ple who get ex­cited about par­ti­ci­pat­ing ac­tu­ally know how to parse than we would still have much greater num­bers.

Java and Python are much harder to parse than scheme, and the per­centage of Java or Python pro­gramers who could write a parser for those lan­guages is less than 10%. Fur­ther­more, the ones who could write a parser likely also know Scheme or at least some other lisp di­alect.

• Do you have any ev­i­dence of this? It seems highly un­likely that the pro­por­tion of users of any given pro­gram­ming lan­guage that could write a parser would differ by an amount that there are more Scheme users who can write a parser than Java/​Python users, given the vastly larger num­ber of users of Java and Python.

For what its worth, python in­cludes its own parser in the stan­dard library, so it would prob­a­bly be bet­ter than scheme in that re­gard, though I might have to agree with you re­gard­ing pars­ing with Java, though there may very well be a mod­ule out there for pars­ing Java as well.

• From ex­pe­rience, I can tell you that it is eas­ier to lean Scheme and write a Scheme in­ter­preter than it is to just un­der­stand how Python works on the in­side. If you know how Python worked well enough to do static anal­y­sis on it, you can learn Scheme in an hour.

• Java’s re­flec­tion API could con­ceiv­ably be used for this pur­pose. (Pro­vided the pro­grams are given ac­cess to each oth­ers com­piled code rather than to the hu­man-read­able java source code.)

• I’m sorry but re­ally no. The re­flec­tion API al­lows you to see what vari­ables and meth­ods are con­tained, it can al­low you to ex­e­cute meth­ods, but that’s your only op­tion, it doesn’t re­ally al­low you to parse the in­sides of a method.

• I’m try­ing to work out how scheme in gen­eral and racket in par­tic­u­lar work.

I in­stalled racket and started it up, and I get a two-pane win­dow. I’ve no­ticed that us­ing eval in the REPL thing in the bot­tom pane works, but if I try us­ing it in the top half I get a cryp­tic er­ror mes­sage:

#lang racket

(eval ’(+ 1 1))

gives:

+: un­bound iden­ti­fier;

also, no #%app syn­tax trans­former is bound in: +

Is there some ad­di­tional magic, or is this some­thing I don’t need to worry about? How will a pro­gram be run in the com­pe­ti­tion?

• Racket ap­par­ently has ter­rible names­pace con­ven­tions. I had the same prob­lem when I tried to use it to in­ter­pret a file, but when I type a pro­gram di­rectly into the in­ter­preter, eval works fine. I’ll ei­ther figure out how to get eval to work when I run a file, or I’ll just paste peo­ple’s pro­grams into the in­ter­preter. Either way, as­sume eval will work prop­erly.

• The Guide says:

#lang racket

(define ns (make-base-names­pace))
(eval ’(cons 1 2) ns) ; works


and in­deed it works.

• In both #lang racket and #lang scheme, (eval form) by it­self ap­par­ently runs form in a ba­si­cally empty names­pace. I’ve per­son­ally been us­ing the fol­low­ing in­can­ta­tion for eval­u­at­ing any­thing in the “stan­dard” names­pace:

(define (ue­val x)
(define ns (make-base-names­pace))
(eval x ns))


Then (ue­val source-code) eval­u­ates source-code in a clean names­pace with all the nor­mal built­ins, in­clud­ing eval.

• Hm. Five min­utes of thought sug­gest to me that the fol­low­ing pseu­docode would be op­ti­mal:

while(passed.time < 9.5sec­onds) {
out­come = eval(op­po­nent.code(self.code))
if (out­come == “C”) { c.count++ }
if (out­come == “D”) { d.count++ }
}

if (c.count > d.count) {
out­put(“C”)
} else {
out­put(“D”)
}
`

Deter­minis­tic pro­grams would always out­put the same thing so you know be­fore­hand whether they’ll co­op­er­ate or defect. For RNG-based pro­grams you just bet on what’s most likely.

I wel­come big guns blow­ing large holes in this con­struc­tion :-)

P.S. How do I in­dent in pre­for­mat­ted code?

• Based on the fact that you re­tracted this, you prob­a­bly already re­al­ized the prob­lem, but I’ll say it any­way for any­one else who hasn’t no­ticed.

If two pro­grams both use this strat­egy, they’ll go into an in­finite loop where pro­gram A simu­lates pro­gram B simu­lat­ing pro­gram A etc. Since it only checks to see how much time is left af­ter it finishes the simu­la­tion, it won’t even man­age to give up and defect. It will just run out of time.

Also, passed.time would im­ply that there’s a class called “passed” with a vari­able called “time”, which is silly. It should be passed_time or passedTime de­pend­ing on how you do multi-word vari­ables.

• Another prob­lem is that you co­op­er­ate agains Co­op­er­ateBot.

• I didn’t look at the de­tails too much. You can fix that prob­lem just by hav­ing it calcu­late the scores for co­op­er­ate and defect, and go with the one with the higher score.

• I’m not sure what you mean. Do you mean the scores given that you choose to co­op­er­ate and defect? There’s a lot of com­plex­ity hid­ing in ‘given that’, and we don’t un­der­stand a lot of it. This is definitely not a triv­ial fix to Lu­mifer’s pro­gram.

• I messed up. I didn’t re­al­ize that un­til af­ter I posted, and I didn’t feel like go­ing back to fix it.

• If two pro­grams both use this strat­egy, they’ll go into an in­finite loop where pro­gram A simu­lates pro­gram B simu­lat­ing pro­gram A etc.

Not if you do it right. (Not sure if I should say any more. The con­test is in­ter­est­ing but shar­ing too much in the com­ments risks turn­ing it into a so­cial en­g­ineer­ing effort rather than a pro­gram­ming or math­e­mat­i­cal effort.)

• I’m not that fa­mil­iar with Scheme, but is there some way to see how much stack space you have left and stop be­fore an overflow?

• I’m not fa­mil­iar with it ei­ther, but if noth­ing else, you can pass a counter that shows the num­ber of iter­a­tions you’ve gone through and stop when it gets too high.

What do you plan on do­ing when you run out of stack space? You can’t just re­place your code with a co­op­er­ate bot or a defect bot af­ter a cer­tain level, since they’ll defect if they think you’ll do the same thing ei­ther way, and you’ll defect if you think they’ll defect etc.

What you need is some­thing along the lines of co­op­er­at­ing if their code is equiv­a­lent to yours, and defect­ing if it’s differ­ent. You have to do this in un­der ten sec­onds, and you have to get it to co­op­er­ate even if they have a slightly differ­ent idea of what’s equiv­a­lent.

• Scheme re­quires tail-call op­ti­mi­sa­tion, so if you use tail-re­cur­sion then you’ll never overflow.

• Does that in­clude tail-calls in mu­tual re­cur­sion? Even if you were go­ing to re­turn what­ever your op­po­nent’s code did, you prob­a­bly couldn’t count on them do­ing the same.

• Yes: all tail-calls are guarantied not to ex­haust re­sources. The pre­cise defi­ni­tion of what is tail is in the spec.

• Well, that was the be­gin­ning of a host of prob­lems :-)

I don’t know about the thread (or ex­cep­tion han­dling) ca­pa­bil­ities of Scheme in Racket, but it shouldn’t be too hard to make sure you don’t run out of time if some chunk of the code is stuck in a re­cur­sive loop. That’s not re­ally the is­sue. The is­sue is ac­tu­ally the re­cur­sion it­self and the con­se­quent need to deal with op­po­nents that eval your code and ex­pect an an­swer.

As a side note, passed.time is perfectly valid vari­able name in some lan­guages where it doesn’t im­ply a class.method struc­ture.

• Can you track to­tal passed time, so that if your op­po­nent simu­lates you with 9 sec­onds passed, you co­op­er­ate in­stead of simu­lat­ing your op­po­nent? They simu­late you simu­lat­ing them… un­til you say “I’m run­ning out of time, it must be the case that we’re in a re­cur­sive simu­la­tion, so It’s prob­a­bly true that I should co­op­er­ate!” The lat­est simu­la­tion of your op­po­nent says “He co­op­er­ated, so I should!” the next to last simu­la­tion of your pro­gram says the same thing, and so forth.

Un­til your origi­nal pro­gram says “My simu­la­tion of my op­po­nent says he will co­op­er­ate, but I know for sure that I started on the first cy­cle of the eval­u­a­tion, so I’m not a simu­la­tion. I defect.”

• ...so that if your op­po­nent simu­lates you

Your code doesn’t know if it’s be­ing eval­u­ated by the op­po­nent or it’s run­ning “for real”. If it knew, the pro­gram would be re­mark­ably easy: if (simu­lated) { say “I co­op­er­ate!” } else { defect }.

It’s not hard to rec­og­nize that you’re stuck in a re­cur­sive trap and need to get out of there af­ter nine sec­onds, but you still don’t know what to out­put.

• If I know early enough the ten-sec­ond time limit ex­actly how long into the ten sec­onds I am, I can be sure enough that I’m the origi­nal.

Another pos­si­bil­ity would be to in­clude a ran­dom chance on the or­der of .1% that I will set a global flag that will fol­low down lev­els of simu­la­tion, and then simu­late my op­po­nent simu­lat­ing me. If I get the flag, I have high prob­a­bil­ity that my op­po­nent is simu­lat­ing me on him­self; I don’t want to set that flag with high prob­a­bil­ity, be­cause I want my op­po­nents’ (me co­op­er­ate­bot) and (me defect­bot) eval­u­a­tions to re­turn cor­rectly.

But now I’m fo­cus­ing on this par­tic­u­lar im­ple­men­ta­tion of the com­pe­ti­tion, not try­ing to provide use­ful in­for­ma­tion to the ques­tion that the com­pe­ti­tion is about.

Pro­posed meta-rule: Pro­grams should not make at­tempts to de­ter­mine if they are be­ing scored or not in or­der to change be­hav­ior; at­tempts to iden­tify and break out of re­cur­sive loops are en­couraged.

• The in­den­ta­tion thing is a bug.

• I just re­al­ized...

Ac­tu­ally run­ning your op­po­nent, and pass­ing it a fake ver­sion of its own source code is a bad idea-if it no­tices, it can call ei­ther fork­bomb() or loopfor­ever() as needed.

In­ci­den­tially, are you sure you want to run the whole thing as win­ner-take-all? My fish­eries class a while back ran a de­layed-effect pris­on­ers dilemma over the mat­ter of fish sus­tain­abil­ity… ex­cept that it was ac­tu­ally a zero-sum game be­cause there was only ONE ex­tra credit point, with win­ner take all.

Halfway through, af­ter the fish were ex­tinct and my team-which had the most boats-was los­ing due to boat-up­keep costs, I sin­gle­hand­edly turned the whole thing into a game about mar­ket ma­nipu­la­tion by run­ning a pyra­mid scheme in­volv­ing the boats. (I still lost be­cause I for­got to keep ex­po­nen­tially buy­ing more boats with loans from the bank, but oh well...)

Alas, I won’t be com­pet­ing. (and if I do, I’ll just throw out a mimicbot...) Keep it in mind, though, if you set an­other one up.

• Can you ex­plain the game in more de­tail? I don’t un­der­stand what to make of your strat­egy.

• ok.

There were 5 or so teams. there were 10 or 15 “years” (rounds). Each round, you can buy brand new boats in pro­por­tion to your cur­rent fleet, go fish­ing with n boats, buy boats from other play­ers, sell boats to other play­ers...and pay a main­ta­nence fee on your cur­rent fleet size. Each boat that goes fish­ing costs a lit­tle bit more to run than let­ting it sit in one place. The fish re­pro­duce, but slowly.

At the very end of the game, fleet value and bank ac­count are added up to de­ter­mine the win­ner.

But the value of boats is player-driven...and there was no limit on how much debt you could go into. mean­while, the cost for brand-new boats...was con­stant.

The in­struc­tor in­tended to illus­trate how the pris­on­ers dilemma plays a role in overfish­ing, which I don’t think is what we ac­tu­ally wound up learn­ing, since our metagame was win­ner-take-all and not a pris­oner’s dilemma.

...

Again: you might learn some­thing in­ter­est­ing with win­ner-take-all, and may even need to use that as the setup to pre­vent co­op­er­ate-bot spam due to the types of the peo­ple on this site...but it IS a differ­ent metagame. Keep that in mind when in­ter­pret­ing the re­sults.

edit2: By differ­ent metagame, I mean that this tour­na­ment posesses a nash equil­ibrium of pro­grams for which each pro­gram is con­sid­ered a move. Find­ing it is pos­si­ble in finite steps(It does not run afoul of the halt­ing prob­lem due to the 10 sec­ond limi­ta­tion, as mea­sured by the hard­ware of a spe­cific ma­chine and en­vi­ron­ment), but a brute-force ap­proach would prob­a­bly take more mem­ory than our uni­verse ap­pears to pos­sess.