nshepperd

Karma: 2,454

nshepperd 5 Nov 2010 15:54 UTC
3 points
in reply to: mtreder’s comment on: What I would like the SIAI to publish
So what? An agent with a terminal value (building paperclips) is not going to give it up, not for anything. That’s what “terminal value” means. So the AI can reason about human goals and the history of AGI research. That doesn’t mean it has to care. It cares about paperclips.

nshepperd 6 Nov 2010 2:00 UTC
2 points
in reply to: XiXiDu’s comment on: What I would like the SIAI to publish
Exactly? I think we agree about this.

It won’t care unless it’s been programmed to care (for example by adding “spatiotemporal scope boundaries” to its goal system). It’s not going to override a terminal goal, unless it conflicts with a different terminal goal. In the context of an AI that’s been instructed to “build paperclips”, it has no incentive to care about humans, no matter how much “introspection” it does.

If you do program it to care about humans then obviously it will care. It’s my understanding that that is the hard part.

nshepperd 6 Nov 2010 15:05 UTC
1 point
in reply to: humpolec’s comment on: Why should you vote?
To my mind this is the most compelling reason to vote. If you’re rational and you want more rational people to vote, then you should vote, because then they will too (assuming a reasonable number of rational people have similar relevant information to your own).

nshepperd 8 Nov 2010 7:57 UTC
3 points
in reply to: AlexMennen’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 5

and spray transfigured engine exhaust all over the countryside

Presumably since the fuel/engine was transfigured out of water, it should turn into water vapor when the transfiguration runs out. And I wouldn’t expect inhaled engine exhaust to go anywhere water shouldn’t, so that shouldn’t be much of a problem.

nshepperd 8 Nov 2010 10:34 UTC
2 points
in reply to: Eugine_Nier’s comment on: Harry Potter and the Methods of Rationality discussion thread, part 5
Only if how much value people like Harry and Quirrel would assign to her life was a significant factor in her decisionmaking. Chances are that at the time the idea that someone might want to save her from azkaban never even crossed her mind.

nshepperd 10 Nov 2010 3:14 UTC
1 point
in reply to: Perplexed’s comment on: Rationality Quotes: November 2010
“Totally free” sounds like too free. You’re not free to actually decide at time T to “decide X at time T+1″ and then actually decide Y at time T+1, since that is against the laws of physics.

It’s my understanding that what goes through your head when you actually decide X at time T+1 is (approximately) what we call TDT. Or you can stick to CDT and not be able to make decisions for your future self.

nshepperd 10 Nov 2010 20:51 UTC
0 points
in reply to: Thomas’s comment on: A note on the description complexity of physical theories
No, it always splits into two everett branches. It’s just that if you do in fact wake up in the distant future, that version of you that wakes up will be a successor of the you that is awake now, as is the version of you that never went to sleep in the next microsecond (or whatever). And you should anticipate either’s experiences equally.

Or at least that’s how I think it works (this assumes timeless physics, which I think is what Jonii assumed).

nshepperd 17 Nov 2010 3:40 UTC
3 points
in reply to: Perplexed’s comment on: Criticisms of CEV (request for links)

The above is a caricature of ‘coherence’ as presented in the May 2004 document. If someone else can provide a better interpretation, that would be welcome.

That doesn’t sound like how I interpreted ‘coherent’. I assumed it meant a volition the vast majority of humanity agrees with / a measure of how much humanity’s volition agrees. If humanity really didn’t care about death, then that would be a coherent volition. So something like ‘collective’ indeed.

As for extrapolation, it’s not intended to literally look into the future. I thought the example of the diamond in the box was fairly enlightening. The human says ‘I want box 1’, thinking box 1 contains a diamond. The AI knows the diamond is in box 2, and can extrapolate (as humans do) that the human actually wants the diamond and would ask for box 2 if they knew where the diamond was. The smart AI therefore opens box 2, and the human is happy because they have a diamond. A dumb AI would just give the human box 1 “because they asked for it”, even if that’s what they didn’t really want.

When a lot of humans then say “the conquest of death is not a high priority” the AI extrapolates that if we knew more or had basic rationality training we would say conquest of death is a high priority. And therefore goes about solving death.

At least that’s how I understood it.

nshepperd 17 Nov 2010 5:39 UTC
3 points
in reply to: Perplexed’s comment on: Criticisms of CEV (request for links)
It doesn’t seem that scary to me. I don’t see it as substituting “its own judgement” for ours. It doesn’t have a judgement of its own. Rather, it believes (trivially correctly) that if we were wiser, we would be wiser than we are now. And if it can reliably figure out what a wiser version of us would say, it substitutes that person’s judgement for ours.

I suppose I imagine that if told I shouldn’t try to solve death, I would direct the person to LessWrong, try to explain to them the techniques of rationality, refer them to a rationalist dojo, etc. until they’re a good enough rationalist they can avoid reproducing memes they don’t really believe in—then ask them again.

The AI with massively greater resources can of course simulate all this instead, saving a lot of time. And the benefit of the AI’s method is that when the “simulation” says “I wish the AI had started preventing death right away instead of waiting for me to become a rationalist”, The AI can grant this wish!

The AI doesn’t inherently know what’s good or bad. It doesn’t even know what it should be surprised by (only transhumanists seem to realise that “let’s not prevent death” shouldn’t make sense). It can only find out by asking us, and of course the right answer is more likely to be given by a “wise” person. So the best way for the AI to find out what is right or wrong is to make everyone as wise as possible, then ask them (or predict what would happen if it did).

nshepperd 17 Nov 2010 8:08 UTC
3 points
in reply to: red75’s comment on: Criticisms of CEV (request for links)
That’s true.

According to the 2004 paper, Eliezer thinks (or thought, anyway) “what we would decide if we knew more, thought faster, were more the people we wished we were, had grown up farther together...” would do the trick. Presumably that’s the part to be hard-coded in. Or you could extrapolate (using the above) what people would say “wisdom” amounts to and use that instead.

Actually, I can’t imagine someone who knew and understood both the methods of rationality (having been directed to LessWrong) and all the teachings of the church (having been directed to church) would then direct a person to church. Maybe the FAI can let a person take both directions to become wiser.

ETA: Of course, in FAI ‘maybe’ isn’t good enough...

nshepperd 18 Nov 2010 9:43 UTC
1 point
in reply to: wedrifid’s comment on: Luminosity (Twilight fanfic) Part 2 Discussion Thread

My values, transposed into the position of a vampire in Luminosiverse, yes. I would be a predator, choosing among prey of two species that are not my own.

Wait, why should your values change just because you’re suddenly immortal? Or is it because of the magical value of wolves in the luminosiverse? This doesn’t make sense.

nshepperd 18 Nov 2010 12:51 UTC
2 points
in reply to: shokwave’s comment on: Luminosity (Twilight fanfic) Part 2 Discussion Thread
That doesn’t sound too discreet. I suspect the Volturi think humans might notice if they started bombing the forest with napalm. And hiding the vampire world from humans is the main reason the Volturi exist (or why they claim to, anyway).

nshepperd 19 Nov 2010 0:41 UTC
1 point
in reply to: wedrifid’s comment on: Luminosity (Twilight fanfic) Part 2 Discussion Thread
I didn’t think any change would be enough. Isn’t morality subjunctively objective? What doesn’t make sense is that you look like you’re saying wedrifid_vampire’s values are good according to wedrifid_now’s values. If I were expecting to be “turned” I would do everything I can to maintain my current values after the event, because to do otherwise would be against my current values.

And my current values say that if it’s a human or an endangered wolf, I’d save the human.

nshepperd 20 Nov 2010 11:43 UTC
2 points
in reply to: cousin_it’s comment on: Moral Error and Moral Disagreement
A CEV optimizer is less likely to do horrific things while its ability to extrapolate volition is “weak”. If it can’t extrapolate far from the unwise preferences people have now with the resources it has, it will notice that the EV varies a lot among the population, and take no action. Or if the extrapolation system has a bug in it, this will hopefully show up as well. So coherence is a kind of “sanity test”.

That’s one reason that leaps to mind anyway.

Of course the other is that there is no evidence any single human is Friendly anyway, so cooperation would be impossible among EV maximizing AI researchers. As such, an AI that maximizes EV is out of the question already. CEV is the next best thing.

nshepperd 20 Nov 2010 13:14 UTC
2 points
in reply to: Justcallmecynical’s comment on: How to Convince Me That 2 + 2 = 3
One might indeed “believe” all that. But a belief has no use if it isn’t true.

nshepperd 22 Nov 2010 1:50 UTC
0 points
in reply to: Vaniver’s comment on: What I’ve learned from Less Wrong
I don’t think the intention was to offer these as evidence for MWI. The evidence for MWI is that it has one less postulate (and therefore is “simpler”). They’re just showing what MWI rules out. That these predictions are different correctly justifies saying “MWI is not just an interpretation”.

nshepperd 22 Nov 2010 12:19 UTC
0 points
in reply to: Manfred’s comment on: What I’ve learned from Less Wrong
I’m confused as to what you mean by this. The link discusses dissolving the question; isn’t that what Eliezer’s solution did? It feels like the question has been dissolved, anyway.

nshepperd 22 Nov 2010 23:42 UTC
1 point
in reply to: Psy-Kosh’s comment on: Bayesians vs. Barbarians
How about this upper limit: when the outcome of (everyone) following orders would be worse than everyone doing their own thing, disobey.

nshepperd 24 Nov 2010 23:10 UTC
5 points
in reply to: handoflixue’s comment on: Inherited Improbabilities: Transferring the Burden of Proof
In probability, “correlations” are always bidirectional. Bayes theorem:

$P (A | B$ =\frac{P(B|A)P(A)}{P(B)})

If P(B|A) > P(B), then P(A|B) > P(A). By the same factor even:

$\frac{P(A|B\$
}{P(A)}=\frac{P(B|A)}{P(B)})

nshepperd 25 Nov 2010 1:33 UTC
9 points
in reply to: cousin_it’s comment on: Inherited Improbabilities: Transferring the Burden of Proof
Pretty much, I think.

If the prior P(guilty) is 1:1000000 and P(guilty|staged) is really high, a consistent prior requires that P(staged) is around 1:1000000 as well. Therefore 1000:1 evidence isn’t enough.