Xia: It should be relatively easy to give AIXI(tl) evidence that its selected actions are useless when its motor is dead. If nothing else, AIXI(tl) should be able to learn that it’s bad to let its body be destroyed, because then its motor will be destroyed, which experience tells it causes its actions to have less of an impact on its reward inputs.
Rob B: [...] Even if we get AIXI(tl) to value continuing to affect the world, it’s not clear that it would preserve itself. It might well believe that it can continue to have a causal impact on our world (or on some afterlife world) by a different route after its body is destroyed. Perhaps it will be able to lift heavier objects telepathically, since its clumsy robot body is no longer getting in the way of its output sequence.
Compare human immortalists who think that partial brain damage impairs mental functioning, but complete brain damage allows the mind to escape to a better place. Humans don’t find it inconceivable that there’s a light at the end of the low-reward tunnel, and we have death in our hypothesis space!
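(For concreteness, here is a sketch of AIXI’s action rule, following Hutter’s definition: U is a universal Turing machine, ℓ(q) the length of environment program q, and m the horizon. Every hypothesis q in the mixture is a program that keeps producing percepts in response to actions, so “the percept stream simply ends” is not itself a hypothesis; within the formalism, death can at most be modeled as some continued stream of inputs.)

```latex
% AIXI's expectimax action rule (Hutter), sketched: at cycle k,
% pick the action maximizing expected total reward to horizon m,
% under a 2^{-length} prior over all environment programs q
% consistent with the interaction history so far.
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \left( r_k + \cdots + r_m \right)
       \sum_{q \,:\, U(q, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```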
I’d like to see this rebuttal spelled out in more detail. Let’s assume for the sake of argument that “we get AIXI(tl) to value continuing to affect the world”. Why would it then be so hard to convince AIXI(tl) that it will be better able to affect the world if no anvils fall on a certain head? (I mean, hard compared to any reasonably-hoped-for alternative to AIXI(tl)?)
If the AIXI(tl) robot has been kept from killing itself for long enough for it to observe the basics of how the world works, why wouldn’t AIXI(tl) have noticed that it is better able to affect the world when a certain brain and body are in good working order and free of obstructions? We can’t give AIXI(tl) the experience of dying, but can’t we give AIXI(tl) experiences supporting the hypothesis that damage to a particular body causes AIXI(tl) to be unable to affect the world as well as it would like?
I can see that AIXI(tl) would entertain hypotheses like, “Maybe dropping this anvil on this brain will make things better.” But AIXI(tl) would also entertain the contrary hypothesis, that dropping the anvil will make things worse, not because it will turn the perceptual stream into an unending sequence of NULLs, but rather because smashing the brain might make it harder for AIXI(tl) to steer the future.
Humans did invent hypotheses like, “complete brain damage allows the mind to escape to a better place”, but there seems to be a strong case for the claim that humans are far more confident in such hypotheses than they should be, given the evidence. Shouldn’t a Solomonoff inductor do a much better job at weighing this evidence than humans do? Why wouldn’t AIXI(tl)’s enthusiasm for the “better place” hypothesis be outweighed by a fear of becoming a disembodied Cartesian spirit cut off from all influence over the only world that it cares about influencing?
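(To make the “weighing evidence” point concrete, here is a minimal toy sketch of the posterior arithmetic; it is not AIXI, and every name and number in it is invented for illustration. Even if the “better place” hypothesis is given a shorter description, and hence a higher complexity prior, a modest likelihood advantage per observation lets the mundane “damage reduces my influence” hypothesis dominate after a handful of episodes.)

```python
# Toy Bayesian weighing of two rival hypotheses under a
# complexity prior. Not AIXI itself; just the posterior
# arithmetic a Solomonoff-style mixture would perform.
# All description lengths and likelihoods are invented.

# Hypothetical description lengths in bits; suppose the "better
# place" story is even simpler, so it starts with the HIGHER
# prior weight 2^-length.
LEN_DAMAGE_HURTS = 20   # "damage to this body reduces my influence"
LEN_BETTER_PLACE = 15   # "destroying this body frees me to act better"

prior = {
    "damage_hurts": 2.0 ** -LEN_DAMAGE_HURTS,
    "better_place": 2.0 ** -LEN_BETTER_PLACE,
}

# Hypothetical per-episode likelihoods of the observation
# "partial body damage coincided with reduced influence".
likelihood = {"damage_hurts": 0.9, "better_place": 0.3}

n_episodes = 20
posterior = {h: prior[h] * likelihood[h] ** n_episodes for h in prior}
total = sum(posterior.values())
for h, weight in posterior.items():
    print(h, weight / total)

# Despite a 32x prior disadvantage, "damage_hurts" ends up with
# posterior ~0.99999999: the evidence swamps the simplicity bonus.
```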
Humans did invent hypotheses like, “complete brain damage allows the mind to escape to a better place”, but there seems to be a strong case for the claim that humans are far more confident in such hypotheses than they should be, given the evidence.
It can also be argued that even humans who claim to believe in immortal souls don’t actually use this belief instrumentally: religious people don’t drop anvils on their heads to “allow the mind to escape to a better place”, unless they are insane. Even religious suicide terrorists generally have political or personal motives (e.g. increasing the status of their family members); they don’t really blow themselves up or fly planes into buildings for the 72 virgins.
You are mentioning some factors that discourage or motivate suicide. That is the whole point. Suicide is a thinkable option. It just doesn’t happen often because, unsurprisingly, it is heavily selected against. There are lots of physical, psychological, and social feedback mechanisms in place that ensure it happens seldom. But that is no different from providing comparable training to AIXI.
And it appears that despite all these checks it is still possible to navigate people out of them (which is not much different from AIXI deriving solutions that evade checks) and get them to commit suicide. For example, I remember a news story (disclaimer!) where a cult fraudster convinced unhappy people to gift their wealth to some other person and commit suicide, with the cult-embellished promise that they would awake in the body of the other person at another place. Now, that wouldn’t convince me, but could it convince AIXI? (“questions ending with a ‘?’ mean no”)
You are mentioning some factors that discourage or motivate suicide. That is the whole point. Suicide is a thinkable option.
Yes, but people generally know what it entails. We don’t want an AI agent to be completely incapable of destroying itself. We don’t want it to destroy itself without a good cause.
Crashing its spaceship into an incoming asteroid to deflect it away from Earth would be a good cause, for instance.
a cult fraudster convinced unhappy people to gift their wealth to some other person and commit suicide, with the cult-embellished promise that they would awake in the body of the other person at another place. Now, that wouldn’t convince me, but could it convince AIXI?
If AIXI had a sufficient amount of experience of the world, I think it couldn’t.
religious people don’t drop anvils on their heads to “allow the mind to escape to a better place”
In most religions with a concept of an afterlife and heaven, there is a very explicit prohibition on suicide. Dropping an anvil on your head is promised to lead to your mind being locked in a “worse place”.
Religious people also tend to wear helmets when they are in places where heavy objects can accidentally fall on their heads, they go to the hospital when they are sick, and they are generally willing to invest a large amount of money and effort in staying alive. Unless you define suicide to include failing to do everything in your power (within moral and legal constraints) to prevent your death for as long as possible, the willingness of religious people to stay alive can’t be explained just as complying with the ban on suicide.
On the other hand, the religious ban on suicide can be easily explained as a way to reconcile the explicitly stated belief that death “allows the mind to escape to a better place”, with the implicit but effective belief that death actually sucks.