Karl

Karma: 155

Karl 19 Dec 2012 17:10 UTC
11 points
in reply to: Tenek’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 17, chapter 86

For Harry had only loaned his Cloak, not given it

That seems to answer your question: his invisible copies aren’t borrowing the cloak from him because they are him.

Karl 24 Oct 2014 2:38 UTC
6 points
on: Introducing Corrigibility (an FAI research subfield)
Why not make it so that the agent, in selecting A1, acts as a UN-agent that believes it will continue to optimize according to UN even in the event of the button being pressed, rather than as a UN-agent that believes the button will never be pressed? That is, pick U such that

U(a1,o,a2) = UN(a1,o,a2) if o is in Press,
U(a1,o,a2) = US(a1,o,a2) + f(a1,o) - g(a1,o) if o is not in Press,

where f(a1,o) is the maximum value of UN(a1,o,b) for b in A2 and g(a1,o) is the maximum value of US(a1,o,b) for b in A2.

This would avoid the perverse manipulation incentives problem detailed in section 4.2 of the paper.
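
A minimal sketch of this definition, for one fixed first action a1 and with the case split copied exactly as written above. The observation labels, action labels and utility tables are made-up toy values, not anything from the paper. The point it illustrates is that, whichever branch applies, the best value of U available at the second step equals the best value of UN.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Table = Dict[Tuple[str, str], float]  # (observation, second action) -> value


def combined_utility(u_n: Table, u_s: Table, press: Set[str],
                     actions2: Iterable[str]) -> Callable[[str, str], float]:
    """Build U(o, a2) for one fixed first action a1 (left implicit)."""
    acts = list(actions2)

    def f(o: str) -> float:
        # f(a1, o): best value of UN over second actions
        return max(u_n[(o, b)] for b in acts)

    def g(o: str) -> float:
        # g(a1, o): best value of US over second actions
        return max(u_s[(o, b)] for b in acts)

    def u(o: str, a2: str) -> float:
        # Case split copied as written in the comment above.
        if o in press:
            return u_n[(o, a2)]
        return u_s[(o, a2)] + f(o) - g(o)

    return u


# Hypothetical toy tables, chosen only for illustration.
A2 = ["b1", "b2"]
OBS = ["o1", "o2"]
PRESS = {"o1"}
UN = {("o1", "b1"): 3.0, ("o1", "b2"): 1.0, ("o2", "b1"): 4.0, ("o2", "b2"): 0.0}
US = {("o1", "b1"): 0.0, ("o1", "b2"): 5.0, ("o2", "b1"): -1.0, ("o2", "b2"): 2.0}

u = combined_utility(UN, US, PRESS, A2)
best_u = {o: max(u(o, b) for b in A2) for o in OBS}
best_un = {o: max(UN[(o, b)] for b in A2) for o in OBS}
print(best_u, best_un)
```

In the corrected branch the agent still picks the second action that maximizes US (the correction term does not depend on a2), but the value it receives is the maximum of UN, which is why the first action is evaluated as if UN would keep being optimized.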

Karl 16 Jul 2013 15:02 UTC
6 points
on: An Attempt at Preference Uncertainty Using VNM
You can’t simply average the km’s. Suppose you estimate .5 probability that k2 should be twice k1 and .5 probability that k1 should be twice k2. Then if you normalize k1 to 1, k2 will average to 1.25, while similarly if you normalize k2 to 1, k1 will average to 1.25.

In general, each choice of the km’s corresponds to a utility function, and the utility function we should use will be a linear combination of those utility functions, with renormalization parameters k’m. If we accept the argument given in your post, those k’m ought to be just as dependent on your preferences, so you’re probably also uncertain about the values those parameters should take, which gives you k″m’s, and so on ad infinitum. You end up with an infinite tower of uncertain parameters, and it isn’t obvious how to obtain a utility function out of this mess.
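
The two-scenario example from the first paragraph of this comment can be checked with a few lines of exact arithmetic. This is only a sketch; the function name and the way scenarios are encoded are my own choices for illustration.

```python
from fractions import Fraction

# Each scenario: (probability, k1, k2), known only up to a common scale.
SCENARIOS = [
    (Fraction(1, 2), Fraction(1), Fraction(2)),  # k2 should be twice k1
    (Fraction(1, 2), Fraction(2), Fraction(1)),  # k1 should be twice k2
]


def average_other(scenarios, fixed):
    """Normalize the `fixed` parameter to 1 in every scenario, then
    return the probability-weighted average of the other parameter."""
    total = Fraction(0)
    for p, k1, k2 in scenarios:
        total += p * (k2 / k1 if fixed == "k1" else k1 / k2)
    return total


avg_k2 = average_other(SCENARIOS, fixed="k1")  # k1 = 1, averaged k2
avg_k1 = average_other(SCENARIOS, fixed="k2")  # k2 = 1, averaged k1

print(avg_k2, avg_k1)                # both come out as 5/4
print(avg_k2 / 1, 1 / avg_k1)        # implied k2/k1: 5/4 versus 4/5
```

The two normalizations imply different ratios k2/k1 (5/4 versus 4/5), so the averaged utility function depends on which parameter was arbitrarily set to 1.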

Karl 27 Oct 2010 21:19 UTC
6 points
in reply to: rabidchicken’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 4
A) is very hard to test given the restriction on using magic around muggles. As for B), powerful spells are mostly restricted by the Interdict of Merlin. C) is, as you pointed out, extremely difficult to research effectively. I’m more surprised that Harry never bothered to ask how new charms are discovered. After all, how are you supposed to figure out that you are supposed to say “Wingardium Leviosa” and then move your wand in a certain way? And he has been told that new charms are discovered every year, so we know it’s possible.

Karl 24 Oct 2014 21:12 UTC
5 points
in reply to: lackofcheese’s comment on: Introducing Corrigibility (an FAI research subfield)
Firstly, the important part of my modification to the indifference formalism is not the conditioning on the actual o but the fact that, in evaluating the expectation of UN, it takes the action in A2 (for a given pair (a1,o)) which maximizes UN instead of the action which maximizes U (note that U is equal to US in the case that o is not in Press).

Secondly, an agent which chooses a1 by simply maximizing E[UN | NotPress; a1] + E[US | Press; a1] does exhibit pathological behaviors. In particular, there will still be incentives to manage the news, but from both sides now: there is an incentive to cause the button to be pressed in the event of information which is bad news from the point of view of UN, and an incentive to cause the button not to be pressed in the event of information which is bad news from the point of view of US.

Karl 15 Jul 2013 3:13 UTC
5 points
in reply to: AlexMennen’s comment on: Robust Cooperation in the Prisoner’s Dilemma
Proof without using Kripke semantics: Let X be a modal agent and Phi(...) its associated fully modalized formula. If PA were inconsistent, Phi(...) would reduce to a truth value independent of X’s opponent, so X would play the same move against both FairBot and UnfairBot (and this is provable in PA). But PA cannot prove its own consistency, so PA cannot prove both X(FairBot) = C and X(UnfairBot) = D, and so we can’t have both FairBot(X) = C and UnfairBot(X) = C. QED

Karl 8 Mar 2013 14:56 UTC
5 points
in reply to: Luke_A_Somers’s comment on: A problem with “playing chicken with the universe” as an approach to UDT
To quote step 2 of the original algorithm:

For every possible action a, find some utility value u such that S proves that A()=a ⇒ U()=u. If such a proof cannot be found for some a, break down and cry because the universe is unfair.

Karl 24 Jul 2013 23:39 UTC
4 points
in reply to: cousin_it’s comment on: An argument against indirect normativity
By that term I simply mean Eliezer’s idea that the correct decision theory ought to use a maximization vantage point with a no-blackmail equilibrium.

Karl 16 Jul 2013 16:48 UTC
4 points
in reply to: [deleted]’s comment on: An Attempt at Preference Uncertainty Using VNM

Hmm. I’ll have to take a closer look at that. You mean that the uncertainties are correlated, right?

No. To quote your own post:

A similar process allows us to arbitrarily set exactly one of the km.

I meant that the utility function resulting from averaging over your uncertainty over the km’s will depend on which km you choose to arbitrarily set in this way. I gave an example of this phenomenon in my original comment.

Karl 14 Jul 2013 0:18 UTC
4 points
in reply to: AlexMennen’s comment on: Robust Cooperation in the Prisoner’s Dilemma
UnfairBot defects against PrudentBot.

Proof: For UnfairBot to cooperate with PrudentBot, PA would have to prove that PrudentBot defects against UnfairBot, which would require PA to prove that “PA does not prove that UnfairBot cooperates with PrudentBot or PA+1 does not prove that UnfairBot defects against DefectBot”. But that would require PA to prove its own consistency, which it cannot do. QED

Karl 10 Jul 2013 22:08 UTC
4 points
on: Robust Cooperation in the Prisoner’s Dilemma
Here is another obstacle to an optimality result: define UnfairBot as the agent which cooperates with X if and only if PA proves that X defects against UnfairBot. Then no modal agent can get both FairBot and UnfairBot to cooperate with it.

Karl 11 Dec 2012 3:52 UTC
4 points
in reply to: Eliezer Yudkowsky’s comment on: By Which It May Be Judged

Also I totally think there was a respectable hard problem

So you do have a solution to the problem?

Karl 24 Oct 2014 4:00 UTC
3 points
in reply to: interstice’s comment on: Introducing Corrigibility (an FAI research subfield)
In the indifference formalism, the agent, in selecting A1, acts like a UN-agent that believes that the shutdown button will not be pressed, which creates perverse incentives to “manage the news”: if the agent can cause its shutdown button to be pressed in the event of bad news, it will.

My formulation avoids this pathological behavior by instead making the agent select A1 as if it were a UN-agent which believed that it would continue to optimize according to UN even in the event of the button being pressed. This removes the perverse incentives to “manage the news”, while still giving the agent no incentive to prevent the button from being pressed, because the agent will act as if it believes that pressing the button will not cause it to initiate a shutdown.
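
A toy comparison of the two ways of scoring a first action, as a rough sketch only. The world below is entirely hypothetical: a fair coin delivers good news (best achievable UN is 10) or bad news (best achievable UN is 0), the action "plain" leaves the button alone, and the action "rig" arranges for the button to be pressed exactly when the news is bad.

```python
# Rows are (probability, button_pressed, best achievable UN) for each first action.
# All names and numbers here are illustrative assumptions.
WORLD = {
    "plain": [(0.5, False, 10.0), (0.5, False, 0.0)],
    "rig":   [(0.5, False, 10.0), (0.5, True, 0.0)],
}


def score_believing_no_press(a1: str) -> float:
    """Score a1 like a UN-agent that conditions on the button not being pressed."""
    rows = [(p, v) for p, pressed, v in WORLD[a1] if not pressed]
    mass = sum(p for p, _ in rows)
    return sum(p * v for p, v in rows) / mass


def score_keep_optimizing(a1: str) -> float:
    """Score a1 like a UN-agent that expects to keep optimizing UN,
    whether or not the button is pressed."""
    return sum(p * v for p, _, v in WORLD[a1])


for a1 in WORLD:
    print(a1, score_believing_no_press(a1), score_keep_optimizing(a1))
```

Under the first scoring rule, "rig" looks better than "plain" because the bad-news branch is conditioned away; under the second rule both actions score the same, so there is nothing to gain from routing bad news into a button press.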

Karl 15 Jul 2013 3:24 UTC
3 points
in reply to: fractalman’s comment on: Robust Cooperation in the Prisoner’s Dilemma
What do you even mean by “is a possible outcome” here? Do you mean that there is no proof in PA of the negation of the proposition?

The formula of a modal agent must be fully modalized, which means that all propositions containing references to actions of agents within the formula must be within the scope of a provability operator.

Karl 23 Dec 2012 21:44 UTC
3 points
in reply to: Eliezer Yudkowsky’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 18, chapter 87
Timeless physics predates Barbour.

Karl 29 Aug 2011 17:22 UTC
3 points
in reply to: JoshuaZ’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 8
I don’t think the part about summoning Death is a reference to anything. After all, we already know what the incarnations of Death are in MOR. And it looks like the counterspell to dismiss Death is lost no more thanks to Harry...

Karl 11 Nov 2013 19:23 UTC
2 points
on: Reduced impact AI: no back channels
Apart from the obvious problems with this approach (the AI can do a lot with the output channel other than what you wanted it to do, choosing an appropriate value for λ, etc.), I don’t see why this approach would be any easier to implement than CEV.

Once you know what a bounded approximation of an ideal algorithm is supposed to look like, how the bounded version is supposed to reason about its idealised version, and how to refer to arbitrary physical data, as the algorithm defined in your post assumes, then implementing CEV really doesn’t seem to be that hard of a problem.

So could you explain why you believe that implementing CEV would be so much harder than what you propose in your post?

Karl 14 Jul 2013 13:18 UTC
2 points
in reply to: AlexMennen’s comment on: Robust Cooperation in the Prisoner’s Dilemma
Proof: Let X be a modal agent, Phi(...) its associated fully modalized formula, (K, R) a GL Kripke model and w minimal in K. Then, for every statement of the form ◻(...), we have w |- ◻(...), so Phi(...) reduces in w to a truth value which is independent of X’s opponent. As a result, we can’t have both w |- X(FairBot) = C and w |- X(UnfairBot) = D, so we can’t have both ◻(X(FairBot) = C) and ◻(X(UnfairBot) = D), and so we can’t have both FairBot(X) = C and UnfairBot(X) = C. QED
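
The argument above can be illustrated by brute force on a small, restricted family of agents. This is a sketch under loud assumptions, not a proof: it only covers agents whose move is a Boolean function of two facts, "it is provable that the opponent cooperates with me" and "it is provable that the opponent defects against me", and it reads off actual behavior from a deep world of a finite linear Kripke frame, where a boxed statement holds iff the statement held at every earlier world. The frame depth and all function names are hypothetical choices.

```python
from itertools import product

DEPTH = 8  # number of worlds in the linear frame; assumed ample for these agents


def fair_bot(box_c: bool, box_d: bool) -> bool:
    return box_c  # cooperate iff provable that the opponent cooperates


def unfair_bot(box_c: bool, box_d: bool) -> bool:
    return box_d  # cooperate iff provable that the opponent defects


def play(agent, opponent, depth: int = DEPTH):
    """Return (agent cooperates, opponent cooperates) at the deepest world."""
    a_hist, o_hist = [], []
    for _ in range(depth):
        # At the minimal world both histories are empty, so every box is
        # vacuously true and the move cannot depend on the opponent.
        a_move = agent(all(o_hist), not any(o_hist))
        o_move = opponent(all(a_hist), not any(a_hist))
        a_hist.append(a_move)
        o_hist.append(o_move)
    return a_hist[-1], o_hist[-1]


INPUTS = list(product([True, False], repeat=2))


def make_agent(table):
    """Turn a 4-entry truth table over (box_c, box_d) into an agent."""
    lookup = dict(zip(INPUTS, table))
    return lambda box_c, box_d: lookup[(box_c, box_d)]


winners = []
for table in product([True, False], repeat=4):
    x = make_agent(table)
    _, fair_cooperates = play(x, fair_bot)
    _, unfair_cooperates = play(x, unfair_bot)
    if fair_cooperates and unfair_cooperates:
        winners.append(table)

print(winners)  # no agent in this family gets both to cooperate
```

The search comes up empty for the same reason as in the proof: FairBot needs X to cooperate at the minimal world, UnfairBot needs X to defect there, and X plays the same move against both at that world.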

Karl 14 Jul 2013 2:53 UTC
2 points
in reply to: fractalman’s comment on: Robust Cooperation in the Prisoner’s Dilemma

no modal agent can get both FairBot and UnfairBot to cooperate with it.

TrollDetector is not a modal agent.

Karl 11 Jun 2013 2:09 UTC
2 points
in reply to: Will_Sawin’s comment on: Robust Cooperation in the Prisoner’s Dilemma
Both of those agents are modal agents of rank 0, so the fact that they defect against CooperateBot implies that FairBot defects against them, by theorem 4.1.