You might want to see if a program would cooperate with an obfuscated version of itself (without the obfuscated version being able to detect that it was obfuscated).
How simple is it to parse a program in Scheme, with Scheme, into an abstract syntax tree for more complex modification and inspection? Does anybody have a resource for this?
Scheme programs shine at manipulating lists and symbols, and a Scheme program is itself written as a list of symbols. Writing a program generator is trivial. If you know what you're doing, you can write a basic Scheme interpreter, parser included, in under four hours.
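In Scheme itself the built-in read already gives you the nested list structure, so there is almost nothing to write. Just to show how little machinery s-expression parsing needs, here is a naive sketch in Python (a toy, not the tournament's Scheme: quote marks stay glued to their atoms and there is no error handling):

import re

def parse_sexpr(text):
    # Tokenize into parens and atoms, then build nested lists recursively.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0
    def read_expr():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read_expr())
            pos += 1  # consume the closing ")"
            return items
        return token
    return read_expr()

# parse_sexpr("(define (bot opponent) 'C)")
# -> ['define', ['bot', 'opponent'], "'C"]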
The book “Structure and Interpretation of Computer Programs” is a classic and is freely available online.
I’m playing Prisoner’s Dilemma and wish to test if an opponent X is honest. I might try the following:
(1) Create two programs, Y and Z, which are algorithmically equivalent but obfuscated versions of X. (2) Run Y and Z against each other.
If Y and Z don’t cooperate with each other, that’s a good indication that X recognizes itself with a source-code comparison and that I shouldn’t trust X.
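In pseudocode, the check might look like this (obfuscate and run_match are hypothetical helpers standing in for a mechanical obfuscator and the tournament harness):

def honesty_check(x_source):
    # Two independently obfuscated, algorithmically equivalent copies of X.
    y = obfuscate(x_source)
    z = obfuscate(x_source)
    # X passes the check only if its disguised copies still cooperate.
    return run_match(y, z) == (COOPERATE, COOPERATE)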
This honesty check doesn’t work if Y and Z are given access to their sources. Sure, when I simulate Y against Z, I could lie to Y and tell Y that its source is X (so Y believes itself to be unmodified). But when my deluded Y simulation is deciding whether to cooperate with Z, it (Y) may run Z in simulation. If Y informs its Z-simulation that Z’s source is Z, then that Z-simulation will not be deluded into thinking that it is unmodified. Y’s simulation of Z will be able to detect that it is an (obfuscated) simulation and act accordingly.
This honesty check isn't foolproof. X can recognize itself with a more complicated handshake — one that survives code obfuscation. But if X recognizes itself with a more complicated handshake, then X doesn't need to know its own source code (and we shouldn't bother passing the source code in).
A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the “passed its own source code” format.
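A sketch of that construction in Python (a hash stands in for the cryptographic signature; PAYLOAD and the digest constant are illustrative placeholders, not anyone's actual tournament entry):

import hashlib

# The bot's real decision logic lives in a string.
PAYLOAD = "lambda opponent, my_source: 'C'"

# In a real bot this would be a hard-coded, precomputed constant;
# it is computed here only so the sketch runs as written.
EXPECTED_DIGEST = hashlib.sha256(PAYLOAD.encode()).hexdigest()

def signed_quine_bot(opponent):
    # Verify the payload before trusting it.
    if hashlib.sha256(PAYLOAD.encode()).hexdigest() != EXPECTED_DIGEST:
        return 'D'  # payload was tampered with
    strategy = eval(PAYLOAD)
    # Eval the string with the string as input, emulating the
    # "passed its own source code" format.
    return strategy(opponent, PAYLOAD)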
Altering the internal structure of an opponent program would be very difficult, but that’s not the only way to mutate a program. You can’t tinker with the insides of a black box, but you can wrap a black box.
To be concrete: given an opponent’s source code, I could mechanically generate an equivalent program with extremely dissimilar source code (perhaps just a block of text, a decryption routine, and a call to eval) that nevertheless acts exactly like the original program in every way. And since that mechanically-obfuscated program would act exactly like the original program in every way, the obfuscated program would not be able to detect that it had been altered. Do you agree?
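To make that concrete, here is a minimal sketch of such a wrapper in Python (the tournament's programs are Scheme, so treat this as illustrative; base64 stands in for the "decryption routine", and a real obfuscator could layer actual encryption, renaming, and junk code on top):

import base64

def obfuscate(source: str) -> str:
    # Emit a program whose source looks nothing like the original
    # but whose behaviour is identical: a blob, a decoder, and a call to exec.
    blob = base64.b64encode(source.encode()).decode()
    return (
        "import base64\n"
        f"_blob = {blob!r}\n"
        "exec(base64.b64decode(_blob).decode())\n"
    )

original = "def bot(opponent):\n    return 'C'\n"
wrapped = obfuscate(original)
exec(wrapped)            # defines the same bot() the original defines
print(bot("anything"))   # -> C, exactly as the original would act

Because the wrapped program just decodes and runs the original text, it behaves identically in every way, which is the point of the objection above.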
For the record, though I raised an objection, I’d be perfectly happy if the contest were modified so that player programs were passed their source code as an argument. The rule change would have consequences that I don’t understand, and I like that. Caveat emptor — I suspect the rule change would cause people to write more exploitable programs.
There’s a class of mimicking players that I’d like to discuss. Pseudocode of an example:
def mimic_bot(opponent):
    if random() > ε:
        my_source = quine_my_source()
        return eval(opponent, my_source)  # do as your opponent does to you
    else:
        # unlikely, but necessary to break out of infinitely recursive simulations
        return COOPERATE
MimicBot’s strategy will almost perfectly mirror its opponent’s. Is there literature on a MimicBot, MirrorBot, PsychicTitForTatBot or something like that?
Aside: The source for MimicBot does not contain the DEFECT symbol, which might make it easier† for other programs to prove that MimicBot won’t defect against them.
† easier but not trivial
Suppose we decide that proving systems are too (NP) hard and we want to restrict agents to polynomial time. If an agent has access to a random coin, can it interactively prove that its opponent is cooperative?
I stumbled across an interesting class of agent while attempting to design a probabilistically proving bot for AlexMennen’s contest. The agent below always cooperates with opponents who always cooperate back, (almost) always defects against opponents who always defect against them, and runs in probabilistic polynomial time if its opponent does as well.
def mimic_bot(opponent):
    if random() > ε:
        my_source = quine_my_source()
        return eval(opponent, my_source)  # do as your opponent does to you
    else:
        # unlikely, but necessary to break out of recursive simulations in polynomial time
        return COOPERATE
This particular agent is a bit naïve, but more sophisticated versions (e.g. ones that do not cooperate with CooperateBot) are possible. Does anyone have any insight into probabilistically proving that opponents will cooperate?
Proposed method to make MimicBot halt:
If eval() takes more than 1,000,000,000 * n^3 steps (where n is the length of the input programs), eval aborts and returns an error flag. MimicBot defects if the simulation returned an error flag.
Programming eval to count its search steps will add overhead, and that overhead could increase exponentially as the stack of simulations grows deeper. But that’s OK! The expected depth of simulations depends on ε, not n, and ε is a constant. MimicBot still halts in polynomial time.
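In the pseudocode style used above, MimicBot with the step budget might look like this (limited_eval, quine_my_source, and ERROR are hypothetical stand-ins for the step-counting eval and its error flag):

def mimic_bot(opponent):
    if random() > ε:
        my_source = quine_my_source()
        n = len(opponent) + len(my_source)
        # Abort the simulation after 1,000,000,000 * n^3 interpreter steps.
        result = limited_eval(opponent, my_source, max_steps=1_000_000_000 * n**3)
        if result == ERROR:
            return DEFECT      # dawdling or non-halting opponent
        return result
    else:
        # unlikely, but necessary to break out of recursive simulations
        return COOPERATE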
“But wait!” I hear you protest. “If the outermost program kills simulations after a fixed number of cycles, then an inner simulation will find that its world ends before the simulation was expecting the world to end in its subjective timeline. That means simulations aren’t completely faithful mimics.”
That is true, but aborting the simulation should not be the usual case. Most simulations will take less than 1,000,000,000 * n^3 steps—only the dawdling or non-halting agents will have their simulations aborted.
like defecting if the opponent’s source code calls a modifier on a string literal
Actually, 'D is not a string literal; it's a symbol. Compilers can legally turn the symbol 'D into something opaque like 0x3859 and destroy any reference to the letter "D". The only way a program can generate a 'D symbol on its own is to include it in its source.
But that doesn't mean that a program without a 'D can't defect! An IncrediblyInnocentBot, which does not contain a 'D and can't generate a 'D on its own, can still defect by stealing a 'D from the opponent agent.
One way to steal a 'D from an opponent would be to search for quoted symbols in the opponent's program. The opponent could foil this method, however, by including decoy symbols.
Alternatively, IncrediblyInnocentBot could simulate its opponent against a bunch of stupid agents, such as CooperateBot or DivergeBot, and hope that the opponent defects in at least one of these simulations. If the opponent ever returns a symbol other than 'C in simulation, IncrediblyInnocentBot remembers that symbol and can later use it for nefarious purposes. IncrediblyInnocentBot is in-credible indeed.
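Pseudocode for the second approach (simulate, COOPERATE_BOT_SOURCE, and DIVERGE_BOT_SOURCE are hypothetical stand-ins; note that the bot's own source never mentions the defect symbol):

def incredibly_innocent_bot(opponent):
    # Try to harvest a defect symbol by watching the opponent
    # play against stupid agents in simulation.
    stolen = None
    for stooge in [COOPERATE_BOT_SOURCE, DIVERGE_BOT_SOURCE]:
        result = simulate(opponent, stooge)
        if result != COOPERATE:
            stolen = result    # a symbol other than 'C, stolen from the opponent
            break
    if stolen is None:
        return COOPERATE       # never saw the opponent defect; cooperate by default
    # Nefarious purposes go here; for example, return the stolen symbol
    # to punish opponents that exploit CooperateBot.
    return stolen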
BTW, the strategies I listed above are why I said it was not trivial for an agent to prove that MimicBot does not defect against a cooperator, despite the fact that MimicBot does not contain the defect symbol.
Divergence is failure to halt. DivergeBot goes into an infinite loop, or searches for a proof that 1=2, or performs some other sort of computation that doesn’t finish.
Perhaps a clearer name would be InfiniteLoopBot or OtherBot?
Newcomb’s problem seemed like a contrived trick to punish CDT, and it seemed that any other decision theory was just as likely to run into some other strange scenario to punish it, until I started thinking about AIs that could simulate you accurately
Are you implying there exist decision theories that are less likely to run into strange scenarios that punish them? I would think that Omegas could choose to give any agent prejudicial treatment.
It still seems to me that you can’t have a BestDecisionAgent. Suppose agents are black boxes—Omegas can simulate agents at will, but not view their source code. An Omega goes around offering agents a choice between:
$1, or
$100 if the Omega thinks the agent acts differently than BestDecisionAgent in a simulated rationality test, otherwise $2 if the agent acts like BestDecisionAgent in the rationality test.
Does this test meet your criteria for a fair test? If not, why not?
Omega gives you a choice of either $1 or $X, where X is either 2 or 100
Yes, that’s what I mean. I’d like to know what, if anything, is wrong with this argument that no decision theory can be optimal.
1. Suppose that there were a computable decision theory T that was at least as good as all other theories. In any fair problem, no other decision theory could recommend actions with better expected outcomes than the expected outcomes of T’s recommended actions.
2. We can construct a computable agent, BestDecisionAgent, using theory T.
3. For any fair problem, no computable agent can perform better (on average) than BestDecisionAgent.
4. Call the problem presented in the grandfather post the Prejudiced Omega Problem. In the Prejudiced Omega Problem, BestDecisionAgent will almost assuredly collect $2.
5. In the Prejudiced Omega Problem, another agent can almost assuredly collect $100.
6. The Prejudiced Omega Problem does not involve an Omega inspecting the source code of the agent.
7. The Prejudiced Omega Problem, like Newcomb’s problem, is fair.
8. Contradiction.
I’m not asserting this argument is correct—I just want to know where people disagree with it.
Qiaochu_Yuan’s post is related.
$100 if the Omega thinks the agent acts differently than BestDecisionAgent in a simulated rationality test, otherwise $2 if the agent acts like BestDecisionAgent in the rationality test.
The Omega chooses the payoff of $2 vs. $100 based on a separate test that can differentiate between BestDecisionAgent and some other agent. If we are BestDecisionAgent, the Omega will know this and we will be offered at most a $2 payoff. But some other agent will differ from BestDecisionAgent in a way that the Omega detects and cares about. That agent can decide between $1 and $100. Since another agent can perform better than BestDecisionAgent, BestDecisionAgent cannot be optimal.
But the only way to have a shot at mutual cooperation is to provide an obvious top-level cooperating codepath.
I’ll do as you request, though there are other ways to achieve mutual cooperation (e.g. MimicBot).
The MimicBots will dive into an evaluation loop, where my bot evaluates yours on a quine of me, and you evaluate my quine on a quine of you, ad infinitum.
MimicBot, as I described it, breaks out of a simulation probabilistically; it will almost surely not fall into an infinite depth of simulations. MimicBot cooperates with probability ε, and has an expected simulation depth of 1/ε when played against itself. As long as ε<.5, the best action against MimicBot is to cooperate, so the expected simulation depth can be as low as 2.
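(The 1/ε figure is just the mean of a geometric distribution: each level of nested simulation independently bottoms out, by cooperating unconditionally, with probability ε, so E[depth] = Σ_{k≥1} k·ε(1−ε)^(k−1) = 1/ε.)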
MimicBot cooperates with probability ε
You mean defects with probability ε.
No, I did not mean this, but I left out an important word: I should have said MimicBot cooperates unconditionally with probability ε.
MimicBot will almost perfectly mirror the strategy of its opponent. Most of the time (probability 1-ε), MimicBot returns the result of a simulation of its opponent against MimicBot. If you’re fighting MimicBot, you should expect it to think and act almost exactly the way you think and act. If you decide to always cooperate with MimicBot, MimicBot will decide to cooperate with you. If you decide to always defect against MimicBot, MimicBot will (almost always) decide to defect against you. If you play a mixed strategy against MimicBot, MimicBot will play an almost identical mixed strategy against you.
The slight imperfection in the strategy mirror (cooperating unconditionally with probability ε) is necessary to avoid infinite recursion.
I strongly recommend writing a MimicBot that goes through two iterations before allowing itself to exit with a fixed probability.
You can deal with those special cases that way. I was going to use a flatter, less quiny approach.
def special_casing_bot(opponent):
    if acts_like_cooperate_bot(opponent):
        return DEFECT
    elif acts_like_other_exploitable_bot(opponent):
        return DEFECT
    elif special_case_3(opponent):
        return DEFECT
    elif special_case_4(opponent):
        return COOPERATE
    elif special_case_5(opponent):
        return DEFECT
    # etc
    else:
        return mimic_bot(opponent)
On an unrelated note, woah, how’d you get whitespace working?
Total kludge. Used exotic Unicode whitespace characters, which are displayed unaltered in comments :-).
Passing in the source code is not the same as quining. A program that is passed its own source code can easily check whether it has been altered (e.g. by including a cryptographic signature in the source code). With quining, the program can be mutated without easy detection.