How so? The AI lives in a universe where people are planning to fuse AIs in the way described here. Given this website, and the knowledge that one believes the red wire to be magic, there is a high probability that the red wire is fake, and only some very small probability that the wire is real. But it is also known for certain that the wire is, in fact, real. There is not even a contradiction here.
Giving a wrong prior is not the same as walking up to the AI and telling it a lie (which should never raise the probability to 1).
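To make the distinction concrete, here is a minimal Bayesian sketch with made-up numbers: hearing the assertion "the wire is real" is merely evidence, so as long as the prior is below 1 and a liar could also make the claim, the posterior stays strictly below 1. All numbers and names here are illustrative assumptions, not anything from the original discussion.

```python
# Hypothetical illustration: a verbal claim updates a belief but
# cannot push it to certainty. All probabilities are made up.

def posterior(prior, p_claim_if_true, p_claim_if_false):
    """Bayes update on hearing the claim 'the wire is real'."""
    num = p_claim_if_true * prior
    return num / (num + p_claim_if_false * (1 - prior))

# A wrong (very low) prior that the wire is real, plus a human asserting it:
p = posterior(prior=0.05, p_claim_if_true=0.9, p_claim_if_false=0.3)
print(p)  # ~0.136 -- updated upward, but still far from 1
```

A wrong prior, by contrast, is baked in before any evidence arrives; no single assertion afterwards can turn it into certainty.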
I meant "certainly" as in "I have an argument for it, so I am certain."
Claim: Describing that some part of space "contains a human", and describing its destruction, is never harder than describing a goal which ensures that every part of space which "contains a human" is treated in manner X, for a non-trivial X (where X will usually be "morally correct", whatever that means). (X is non-trivial if some known action A of the AI exists which does not treat a space volume in manner X.)
The assumption that the action A is known is reasonable for the problem of Friendly AI, since for every moral system we might wish to build into the AI, a sufficiently torturous killing can be constructed which that system labels immoral.
Proof: Describing the destruction of every agent in a certain part of space is easy: remove all mass and all energy within that part of space. We still need a way to select those parts of space which "contain a human". However, we have (via the assumption) that our goal function goes to negative infinity when evaluating a plan which treats a volume of space "containing a human" in violation of manner X. Assume for now that we have found some way !X to violate manner X for a given space volume. By pushing every space volume in existence through the goal evaluation, together with a plan to do !X to it, we detect at least those space volumes which "contain a human".
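The detection step can be sketched as a loop over candidate volumes. This is only a toy model of the argument, assuming the goal function and the !X plan are given as black boxes; the names (goal_value, plan_doing_not_X, contains_human) are mine, not part of any real system.

```python
# Toy sketch of the proof's detection step: every name here is
# illustrative. A volume "contains a human" iff the goal function
# assigns -inf to a plan that does !X to it.

NEG_INF = float("-inf")

def plan_doing_not_X(volume):
    """Stand-in for a plan that treats `volume` in violation of manner X."""
    return ("do_not_X", volume)

def goal_value(plan):
    """Stand-in goal function: -inf iff the plan mistreats a volume
    that contains a human (modeled here as a flag on the volume)."""
    _, volume = plan
    return NEG_INF if volume["contains_human"] else 0.0

def volumes_containing_humans(space_volumes):
    # Push each volume through the goal evaluation with a !X plan;
    # the ones scoring -inf are (at least) those containing a human.
    return [v for v in space_volumes
            if goal_value(plan_doing_not_X(v)) == NEG_INF]

space_volumes = [{"id": 1, "contains_human": True},
                 {"id": 2, "contains_human": False}]
print([v["id"] for v in volumes_containing_humans(space_volumes)])  # -> [1]
```

The point of the sketch is only that the hard part (recognizing humans) is done entirely by the already-given goal function; the detection loop itself adds nothing.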
This leaves us with the problem of defining !X. The assumption as it stands already provides some action A which can be used as !X.