What libraries, if any, are we allowed to import? Can I import racket/ libraries, since we’re running in drracket?
So8res (Nate Soares)
Attention to all competitors who have not yet submitted code: My bot will analyze your bot looking for statements of the following structure (regardless of the names of the individual atoms):
(if <predicate> 'C ...)
(cond (<predicate> 'C) ...)
(case <predicate> ((...) 'C) ...)
If it finds such a statement as the first statement in a top-level returning position (the body of a top-level let, the returning statement in your lambda, etc.), then my bot might cooperate, depending upon the nature of your predicate. Otherwise, my bot will defect against you.

If we are to achieve mutual cooperation, we must make it as easy as possible for our bots to prove cooperation. Please start your code with a conditional cooperation. The nature of the condition is up to you. You can still try to exploit me in <predicate>. But the only way to have a shot at mutual cooperation is to provide an obvious top-level cooperating codepath.
Imagine implementing a MimicBot. You need to figure out what my bot will do, and that’s non-trivial if you want to avoid infinite loops. You’re going to need a fair bit of code. I don’t care how the code works, but it should start with
(if <predicate> 'C ...)
This pattern alone won’t get you mutual cooperation. You’ll still need to write a MimicBot, or a FairBot, or a JustBot. But whatever you do, make it obvious that there’s a cooperation condition.
I’m not advocating any particular strategy; I’m advocating a pattern: If you want a shot at trust, you’d better have a codepath that results in cooperation, and you’d best make it really easy for other bots to recognize.
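For concreteness, here is a minimal sketch of the pattern. The predicate shown is a stand-in of my own invention; yours can be anything, so long as the top-level (if <predicate> 'C ...) shape is obvious:

```racket
#lang racket
;; Sketch of a bot whose first top-level returning form is (if <predicate> 'C ...).
;; `cooperation-predicate` is an assumed name; substitute your own condition.
(define (bot opponent-source)
  (if (cooperation-predicate opponent-source)  ; <predicate>
      'C                                       ; obvious cooperating codepath
      'D))

;; Toy stand-in predicate: cooperate iff 'C appears before 'D anywhere in the
;; opponent's source. Not a serious strategy, just an example condition.
(define (cooperation-predicate src)
  (define mentions (filter (lambda (s) (memq s '(C D))) (flatten src)))
  (and (pair? mentions) (eq? (car mentions) 'C)))

(bot '(lambda (them) 'C))  ; => 'C
(bot '(lambda (them) 'D))  ; => 'D
```

The point is not the predicate's logic but its placement: a static analyzer looking at this code sees the cooperating branch immediately.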
Best of luck!
All right procrastinators, let’s talk strategy. You want a bot that’s unexploitable. Ideally, it should be as easy as possible for the opponent to prove that you’ll do whatever they do. In a world of unexploitable bots, the bot that can achieve the most cooperation will win.
MimicBots fit all of these parameters. They make it clear that their opponent is choosing between (C, C) and (D, D). Against a MimicBot, (C, D) is not an option. MimicBots pass up some utility by cooperating with CooperateBot, but that’s fine in this competition. You aren’t playing against CooperateBot. You’re playing against humans’ bots, and I doubt that anyone is mad enough to submit a CooperateBot.
The biggest problem with a MimicBot is that it will time out when playing other MimicBots. The MimicBots will dive into an evaluation loop, where my bot evaluates yours on a quine of me, and you evaluate my quine on a quine of you, ad infinitum.
Your MimicBot has got to bottom out at some point if you don’t want to risk timing out. If you find yourself diving into an evaluation loop you’d better bottom out into Cooperation.
Note that bottoming out into cooperation is not the same thing as cooperating at the top level. If the 100th simulation of you cooperates, and the 100th simulation of them sees that as a weakness and defects, then the 99th level of you will defect and so will the 98th and so on up to the top level, where mutual defection occurs.
I’m not asking you to cooperate when the other bot looks like they’re going to time out. That’s suicide. Rather, I’m pointing out that you can’t just simulate me against you, because that loop will never end. You’ve got to simulate me against a version of you that is one step closer to cooperation. This is true no matter what strategy you choose. But let’s consider the simple case: a MimicBot that simulates you on a copy of itself that simulates you on a copy of itself that eventually bottoms out into a bot that simulates you on CooperateBot.
Such a bot is a Rank N MimicBot, where N is the simulation depth at which it puts forth a CooperateBot.
Any particular Rank N is exploitable. For example, JusticeBot is a Rank 1 MimicBot — it bottoms out immediately by simulating you against CooperateBot. You can exploit JusticeBot by cooperating with CooperateBot and nobody else.
Notice, however, the price of exploiting JusticeBot: in order to exploit JusticeBot, you’ve got to cooperate with CooperateBot (a Rank 0 MimicBot).
As another example, consider a Rank 10 MimicBot. It simulates the opponent against a Rank 9 MimicBot. You can exploit a Rank 10 MimicBot if and only if you’re willing to cooperate with a Rank 9 MimicBot.
In general, you can exploit a Rank N MimicBot by cooperating with a Rank (N-1) MimicBot.
The trick here is that the opponent doesn’t know which MimicBot they’ll be playing. They have to guess exactly right to exploit you. If they guess high, mutual cooperation is achieved (If you guess I’m a Rank 10 you’ll cooperate with a Rank 9). If they guess low, mutual defection is achieved. (If you defect against Rank 10, Rank 11 will not cooperate with you.) You can only be exploited if they guess your rank precisely.
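Ignoring quining and timeouts, the rank structure can be modeled with plain functions. This is a toy of my own (not tournament-legal code, since real bots exchange source rather than closures), but it shows why any two ranks cooperate:

```racket
#lang racket
;; Toy model of Rank N MimicBots: bots are functions from an opponent
;; (also a function) to 'C or 'D, sidestepping source-code quining entirely.
(define (mimic-bot n)
  (lambda (opponent)
    (if (zero? n)
        'C                                  ; Rank 0 is CooperateBot
        (opponent (mimic-bot (- n 1))))))   ; play them against Rank n-1

(define cooperate-bot (lambda (opponent) 'C))
(define defect-bot    (lambda (opponent) 'D))

((mimic-bot 5) cooperate-bot)  ; => 'C
((mimic-bot 5) defect-bot)     ; => 'D
((mimic-bot 5) (mimic-bot 7))  ; ranks alternate downward until Rank 0 => 'C
```

Two MimicBots of any ranks descend together until one bottoms out at Rank 0, and cooperation propagates back up the chain.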
Therefore, I advocate that we get a bunch of us together to play Rank N MimicBots. We all pick our own N, possibly randomized. Note that Rank N MimicBots will achieve mutual cooperation with any MimicBot of any rank (including CooperateBot, JusticeBot, and MimicBots who don’t bottom out). Anyone trying to exploit us will only be able to exploit a specific rank, at the price of cooperation with a lower rank and the risk of defection from higher ranks. So long as we have a wide spread of ranks, our MimicBot clique will be internally cooperative and unexploitable in aggregate.
On Tolerance
MimicBot cooperates with CooperateBot. That leaves utility lying on the table. This may irk you (if you expect that anyone was dumb enough to submit a CooperateBot). More irksome is the fact that MimicBot cooperates with JusticeBot. There is at least one JusticeBot in play, and we should expect that many non-procrastinators have submitted bots that exploit that JusticeBot. It would be a shame for our Rank N MimicBots to miss a shot at cooperation with bots who special-case defection against JusticeBot.
Therefore I recommend submitting MimicBots of rank 3 or higher. This gives people a little leeway to exploit CooperateBot or JusticeBot as they please, and allows us a broader range of mutual cooperation with bots who have already been submitted.
On choosing N
I recommend choosing a highish N. Very low N is obviously a bad idea. Playing rank 0 (CooperateBot) is suicidal. Playing rank 1 (JusticeBot) exposes you to exploitation by everyone who’s decided to kick wubble’s JusticeBot. Beyond rank 1, all ranks are theoretically similar, but you’ve got to consider what other bots will do. It’s conceivable that some bots won’t believe they’re playing a MimicBot until they hit a deep enough recursion depth. It’s possible that some bots will (incorrectly) think that you’re exploitable if you return too quickly. Therefore I recommend a high N.
If you’re particularly paranoid, consider randomizing N. Humans are notoriously bad at picking random numbers, and the MimicBot clique could in theory be exploited by a bot who exploits the ranks that humans are more likely to pick. Setting your counter to
(+ 3 (random 1000))
is a quick way to counter that.

You can also shake things up a bit by using non-standard counters. For example, you could write a Timed MimicBot which passes down the initial time (instead of a counter) to its quines and puts forth a CooperateBot after a certain amount of time has passed (instead of when the counter hits zero).
How do I write one of these?
You write one of these with a degrading quine. Each level should evaluate the opponent against a slightly weaker version of itself. You can do this easily by putting a counter in your bot. So long as the counter is positive, the quining function should return a quine with the counter decremented. Once the counter hits zero, the quining function should return a CooperateBot.
Sounds nice, but how do I write a ‘degrading quine’?
It’s actually pretty easy in scheme.
1. Define (quine (lambda (code) 'placeholder)) and (template 'placeholder).
2. Write the rest of your logic, pretending that (quine template) generates a child quine.
3. Manually copy all of your code and paste it over the template placeholder. In the pasted code, find all of the “holes” (parts that can vary: the counter, the template, your cooperation predicate, etc.) and replace them with odd negative numbers.
4. Write the quine function to walk through the code. If it sees a negative integer, figure out which hole it is by checking (+ code 1). This lets you replace the −3 hole without having any other −3s anywhere in the code, and so on.
5. Copy the real quine function into the template’s quine function.
If you make any more code changes after this process, remember to mirror them in your template. It will end up having this form:
    (letrec [(counter (+ 3 (random 100)))
             (template '(letrec [(counter -1)
                                 (template -3)
                                 ...
                                 (quine (lambda (code)
                                          (cond
                                            ; The -1 hole is for the counter.
                                            ((and (integer? code) (negative? code) (= 0 (+ 1 code)))
                                             (- counter 1))
                                            ; The -3 hole is for the template.
                                            ((and (integer? code) (negative? code) (= -2 (+ 1 code)))
                                             `',template)
                                            ...
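Filling in the elided hole-checks, the walker can also be written as a standalone function. Here is a toy version with my own names, showing how the −1 and −3 holes get replaced:

```racket
#lang racket
;; Toy hole-filling walker (my own rendering of the recipe): rebuilds the
;; code tree, replacing the -1 hole with the decremented counter and the
;; -3 hole with the quoted template. Checking (+ code 1) means no literal
;; -3 appears in this function itself, so the holes stay unambiguous.
(define (fill-holes code counter template)
  (cond
    ;; the -1 hole is the counter
    [(and (integer? code) (negative? code) (= 0 (+ 1 code)))
     (- counter 1)]
    ;; the -3 hole is the template
    [(and (integer? code) (negative? code) (= -2 (+ 1 code)))
     `',template]
    [(pair? code)
     (cons (fill-holes (car code) counter template)
           (fill-holes (cdr code) counter template))]
    [else code]))

;; Toy template with a counter hole:
(define template '(if (<= -1 0) 'C 'simulate-opponent-here))
(fill-holes template 3 template)  ; => '(if (<= 2 0) 'C 'simulate-opponent-here)
```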
In order to make your MimicBot bottom out, you’ll want to define a predicate that is similarly flexible. It should be along the lines of ((eval them) (quine template)) so long as the counter is positive, and #t when the counter gets to zero. The body of the letrec can then be as simple as (if predicate 'C 'D).

You’ll want some extra code to spawn the simulation in a thread and kill it if it goes overtime, and so on.
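The overtime guard can be as simple as a worker thread plus sync/timeout. A sketch, with assumed names and defaulting to 'D when the simulation runs long:

```racket
#lang racket
;; Sketch of a timeout wrapper (names are mine): run `thunk` in its own
;; thread; if it hasn't produced a result within `seconds`, kill it and
;; treat the opponent as a defector.
(define (run-with-timeout thunk seconds)
  (define result-ch (make-channel))
  (define worker (thread (lambda () (channel-put result-ch (thunk)))))
  (define result (sync/timeout seconds result-ch))
  (cond
    [result result]
    [else (kill-thread worker) 'D]))

;; Example: a simulation that loops forever is cut off and scored as 'D.
(run-with-timeout (lambda () (let loop () (loop))) 0.1)  ; => 'D
```

Since bots only ever return 'C or 'D, the #f-on-timeout convention of sync/timeout doesn't collide with a legitimate result.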
If we all write bots like this, we can form a very powerful group capable of a wide range of mutual cooperation and very difficult to exploit.
Join me. With our combined strength, we can end this destructive conflict and bring order to the galaxy.
I’d still recommend you refrain from acting as Rank 0 or 1 (from cooperating immediately, or from simulating the opponent on a MimicBot who cooperates), as it’s likely that there are bots in play that prey on CooperateBots and JusticeBots (as determined by checking whether you cooperate with DefectBot). Also, I imagine there will be a fair number of DefectBots on the field, and your MimicBot is exploited by a fraction of them.
I strongly recommend writing a MimicBot that goes through two iterations before allowing itself to exit with a fixed probability. Given that tweak I agree completely that your MimicBot is quite powerful.
Incorrect. His MimicBot cooperates with probability ε and mimics the opponent with probability 1-ε (which may result in cooperation or defection, depending upon the opponent.)
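In the functions-as-bots toy model, that variant looks like the following (my own sketch; `mimic` stands in for the real simulate-the-opponent machinery):

```racket
#lang racket
;; Sketch of the ε-exit MimicBot described above: cooperate outright with
;; probability epsilon, otherwise mimic. `mimic` is a stand-in; a real bot
;; would run the opponent's source against a quine of itself.
(define epsilon 0.05)

(define (epsilon-mimic-bot opponent)
  (if (< (random) epsilon)
      'C
      (mimic opponent)))

;; Toy mimic: play the opponent against this very bot. Two such bots recurse
;; into each other, but each level exits with probability epsilon, so the
;; loop terminates with probability 1 (after about 1/epsilon levels).
(define (mimic opponent)
  (opponent epsilon-mimic-bot))
```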
Depending upon the implementation of mimic_bot, this is a quiny approach. mimic_bot obviously can’t run the opponent on an exact quine of yourself, because then you won’t achieve mutual cooperation. (When one of the bots cooperates unconditionally, the other will see that it acts_like_cooperate_bot and defect.) So long as mimic_bot plays opponents against a pure MimicBot instead of a perfect quine, this should work quite well.

On an unrelated note, woah, how’d you get whitespace working?
That’s a good point about obfuscation being unnecessary—I’ve updated my comment to address it. I was stuck on the fact that you can exploit any rank in advance, and had a vague feeling that you could turn this into a bot that can exploit ranks on the fly. This feeling didn’t hold up well under inspection. With regards to your other concerns:
You are encouraged to discuss strategies for achieving mutual cooperation in the comments thread.
- the judge
I considered holding back much of my own strategy, until I realized that there’s really no such thing as an optimal bot in this contest. For example, common sense says that you should exploit CooperateBot for free utility. However, if you expect to face more JusticeBots than CooperateBots then you should pass up that utility for the opportunity to exploit the JusticeBots.
This problem is a social engineering problem by construction. The game won’t go to the bot with the cleverest code, it will go to the bot that best guesses the composition of the playing field.
Nice. But no. My first attempt was a static analyzer, and the first thing it did was a syntax expansion, putting all code into fully expanded form and attaching lexical information to each identifier that let me see what was from the base package and what was redefined.
FWIW, it’s pretty easy to pull out the first conditional (cond, case, or if) when you have code in fully expanded form (which turns them all into ifs). However, that just puts you in a position of checking whether a predicate is #t or #f, which isn’t much easier than discovering whether a program returns 'C or 'D. Ultimately, your MimicBot idea is better :-)
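A toy along those lines (my own sketch, working on plain datums rather than the syntax objects a real analyzer would use):

```racket
#lang racket
;; Toy sketch: cond and case expand into nested ifs, so after expansion we
;; only need to find the first `if`. A real analyzer would keep the syntax
;; objects and their lexical information; this just searches the datum.
(define (first-if datum)
  (cond
    [(and (pair? datum) (eq? (car datum) 'if)) datum]
    [(pair? datum) (or (first-if (car datum)) (first-if (cdr datum)))]
    [else #f]))

(define expanded
  (parameterize ([current-namespace (make-base-namespace)])
    (syntax->datum (expand '(cond [(even? 2) 'C] [else 'D])))))

;; The cond above comes back as an `if` with the predicate in second position.
(first-if expanded)
```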
Yes. Well, let me put it this way: you can do what you like, but if you want a shot at mutual cooperation it’s got to be really easy for other bots to verify that you’re willing to cooperate. The more code you put between them and 'C, the harder it is for them to verify your niceness.

A stronger version of this statement is “put as little code as possible between them and discovering that you cooperate”. This implies that your predicate should be similarly simple. If you want my advice as to strategy, your predicate should be along the lines of (equal? ((eval them) (degraded-quine me)) 'C).
That seems a little arbitrary.
It’s quite arbitrary. The “real” rule is that your 'C can be as deeply hidden as you like so long as you don’t scare any bots into defection when there was a chance at mutual cooperation. My bet is that there will be a bunch of twitchy bots, and my suggestion is that you put your intentions front and center.
Your mileage may vary.
We’ve been having beautiful weather recently in my corner of the world, which is something of a rarity. I have a number of side projects and hobbies that I tinker with during the evenings, all of them indoors. The beautiful days were making me feel guilt about not spending time outside.
So I took to going on bike rides after work, dropping by the beach on occasion, and hiking on weekends. Unfortunately, during these activities, my mind was usually back on my side projects, planning what to do next. I’d often rush my excursions. I was trying to tick the “outdoors” box so I could get back to my passions without guilt.
This realization fueled the guilt. I began to wonder how I could actually enjoy the outdoors, if both staying inside and playing outside left me dissatisfied.
What I realized was this: You don’t enjoy nice weather by forcing yourself outdoors. You enjoy nice weather by having an outdoor hobby, an outdoor passion that you pursue regardless of weather. Then when the weather is good, you enjoy it automatically and non-superficially.
Similarly:
You don’t become a music star by trying. You become a music star by wanting to make music.
You don’t become intelligent by trying. You become intelligent by wanting the knowledge.
It was a revelation to me that I can’t always take a direct path to the type of person I want to be. If I want to change the type of person that I am, I may have to adopt new terminal goals.
Perhaps I did not adequately get my point across.
If you really want to be a music star, but you hate making music, you are in trouble. If after realizing this you still really want to be a music star, consider finding ways to modify your preferences concerning music creation.
Mixing up your goal hierarchy is the path to the dark side.
We’re born with mixed up goal hierarchies. I’m merely pointing out that untangling your goal hierarchies can require changing your goals, and that some goals can be best achieved by driving towards something else.
I disagree. It seems to me that people who have music creation as a terminal goal are more likely to create good music than people who have music creation as an instrumental goal. Humans are not perfect rationalists, and human motivation is a fickle beast. If you want to be a music star, and you have control over your terminal goals, I strongly suggest adopting a terminal goal of creating good music.
I would agree completely, if humans were perfect rationalists in full control of their minds. In my (admittedly narrow) experience, people who have the creation of art / attainment of knowledge as a terminal goal usually create better art / attain more knowledge than people who have similar instrumental goals.
I am indeed suggesting that the best way to achieve your current terminal goals may be to change your preference ordering over lotteries over possible worlds. If you are a young college student worried about the poor economy, and all you really want is a job, you should consider finding a passion.
Now, you could say that such people don’t really have “get a job” as a terminal goal, that what they actually want is stability or something. But that’s precisely my point: humans aren’t perfect rationalists. Sometimes they have stupid end-games. (Think of all the people who just want to get rich.)
If you find yourself holding a terminal goal that should have been instrumental, you’d better change your terminal goals.
What do you think the word “terminal” means in this context, and what do you think I think it means?
Edit: Seriously, I’m not being facetious. I think I am using the word correctly, and if I’m not, I’d like to know. The downvotes tell me little.
For what it’s worth, I don’t think we disagree. In your terminology, my point is that people don’t start with clearly separated “abstract wants” and “meat wants”, and often have them conflated without realizing it. I hope we can both agree that if you find yourself thus confused, it’s a good idea to adjust your abstract wants, no matter how many people refer to such actions as a “path to the dark side”.
(Alternatively, I can understand rejecting the claim that abstract-wants and meat-wants can be conflated. In that case we do disagree, for it seems to me that many people truly believe and act as if “getting rich” is a terminal goal.)
I donated. My employers match charitable donations, though not always in a timely fashion. I’m hoping that their contribution can be further matched.
Google.
I’m Nate. I’m 23. My road here was a winding one.
I grew up as one of those “mathematically gifted” kids in a tiny rural town. I turned away from mathematics towards computer science (which I loved) and economics (which I decided I needed to understand if I wanted to save the world). I went on to become a software engineer at Google.
At the intersection of computer science and economics I developed a strong belief that the world is broken and that we could do far better if we redesigned social structure from scratch, now that we have so much more knowledge & technology than we did when we created these antiquated governments. I despaired that most people think progress entails playing the political tug of war instead of building a better system. I spent a long time refining my ideas.
In the interim I missed a number of opportunities to discover this site. In 2008 I stumbled across the Quantum Physics sequence on Overcoming Bias. I read it up till where it was still being written, then moved on. In 2010, I found HPMoR. I read it, noticed the links to this site, and poked around a little. Nothing came of it. I caught up to where HPMoR was being written, then put it out of my mind. I had more important things to do. I had big ideas to express, and I started writing them down.
At some point along the way I realized I needed more math. To my horror, I found that the math I had been so good at as a kid was largely memorized, not deeply understood. I knew how to manipulate symbols like nobody’s business, but I wouldn’t have been able to re-invent the things I “knew” if you erased them from my mind. (In LW terms, I had memorized many passwords.) I started going back through what I thought I knew and grokking it.
During my journey, sometime early in 2012, I stumbled across the Quantum Physics sequence on LessWrong. From the summaries, it seemed like a good way to quickly evaluate how much of my QM knowledge was cached passwords and how much I had really learned. I started reading it and experienced a strong sense of deja vu. I figured out that LW was seeded by Overcoming Bias, experienced some nostalgia, put the feeling to rest, and moved on.
Relearning math and learning to write morphed into a more general quest to promote clear thinking and better methods of deduction with a long-term goal of bridging my pet inferential gap. As I researched and wrote, this one site kept popping up in my search results—LessWrong.
Around the same time (late 2012) I heard about updates to HPMoR. I hadn’t been following it for years, but I was suddenly reminded why the site felt so familiar. I’m not exactly sure how everything fell into place, but some combination of LessWrong showing up in my research, a recollection that HPMoR was associated with it, and the remembered nostalgia from the Quantum Physics sequence all came together. I finally decided to see what this site was all about.
The rest is history. I tore through the sequences. Much of it was extremely validating: Mysterious Answers and Politics is the Mindkiller expressed much of what I had set out to say. I’ve always planned to cheat death. I attempted a similar dissolution of “free will” a few years back. The rest of it was largely epiphany porn.
The strongest epiphany came when I was introduced to the idea of UFAI. From my vantage point between economics and computer science, everything clicked. Hard.
I’d taken AI courses, but AI was a “centuries in the future” sort of vagary. My primary concern was with finding a way to “refactor” governments (and create meta-governments, as I do not claim to know the best way to run a society). To me, that was The Way To Save The World™ -- until I actually thought about UFAI.
I didn’t need any convincing. I simply… hadn’t considered it before. Upon first reflection, the scope of the problem became clear. I experienced panic, and not because UFAI is scary: overnight, my Way To Save The World was eclipsed by a threat that darkens the entire future.
It’s hard to overstate how much my ideals motivate me. The AI problem shook me to my core. I’d ostensibly been trying to save the world; how could I miss something as obvious as UFAI? How could I take my ideals seriously if I’d misunderstood the problem so badly that I hadn’t considered existential threats? In light of this new information, what should I really be doing to ensure a bright future?
I went into philosophical-panic reevaluate-everything mode. That was a few months ago. I’ve done a lot of reflection. I’m still a bit shaken. I have grand ideas about how we can get to a better social structure from here and a lot of inertial passion along those lines. I don’t know nearly enough math. I feel like I’m late to the party, passionate but impotent. I’m trying to find a way to help beyond donating to MIRI. I feel outclassed here, which is probably a good thing. I’m working on getting stronger. I have a lot to do.
Hello!
Page 14, Remarks. Typo:
This should be “T_0 can prove certain exact theorems which T_1 cannot”.