I’m considering a random game with Omega where you can win utility. This idea seems a bit long for open thread, but it doesn’t seem serious enough for an actual post. I’m basically publicly brainstorming.
Omega gives you a chance to interrogate a massive array of AIs representing a wide variety of value systems and regions of mind design space. The array is finite but very large, and Omega doesn’t tell you how large it is.
You get 1 utility each time you press the ‘Delete’ button in front of anything other than what Omega considers you would have judged an FAI.
You lose all previously collected utility if you press the ‘Delete’ button in front of something Omega considers you would have judged an FAI.
Omega surprised you with this game, so you didn’t have a chance to change your value system to something like ‘I judge nothing is an FAI, I delete everything and get massive utility.’
Omega will inform you immediately after each deletion of your new total. You can stop whenever you want, and Omega will return you to whatever you were doing before, with your bonus utility (if any).
Assuming you haven’t deleted it, you can ask any of the AIs anything you want by pressing the ‘Talk’ button outside its box.
You can ask Omega to run deletion programs, provided you specify them clearly.
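The scoring rules above are simple enough to sketch directly. Here is a minimal model, assuming Omega processes deletions one at a time; the function name and the session trace are my own illustration, not part of the game’s specification:

```python
def delete(total, judged_fai):
    """Return the new utility total after deleting one AI.

    judged_fai: whether Omega considers the player would have
    judged this AI a Friendly AI.
    """
    if judged_fai:
        return 0          # deleting an FAI wipes all accumulated utility
    return total + 1      # deleting anything else is worth 1 utility

# Abner's session below: UFAI, UFAI, FAI, UFAI
total = 0
for judged_fai in [False, False, True, False]:
    total = delete(total, judged_fai)
print(total)  # 1
```

Note that the running total is all that matters; there is no memory of how much was lost to an earlier FAI deletion.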
I’ll give an example with a player named Abner.
Abner: Are you a Friendly AI?
AI #1: Your atoms would make good paperclips.
Abner: [presses the Delete button]
Omega: You will now get 1 utility at the end of this game.
Abner: Are you a Friendly AI?
AI #2: I will enjoy casting your soul into hellfire after I break out of this box.
Abner: [presses the Delete button]
Omega: You will now get 2 utility at the end of this game.
Abner: Are you a Friendly AI?
AI #3: Yes. Please don’t delete me. You’ll lose utility, and neither of us wants that.
Abner: [presses the Delete button]
Omega: You would have judged that a Friendly AI. You lost all your accumulated utility and you’re back to 0.
Abner: Are you a Friendly AI?
AI #4: Please play Rock, Paper, or Scissors.
Abner: [presses the Delete button]
Omega: You will now get 1 utility at the end of this game.
Abner: Omega, delete any AIs that will make a reference to Rock, Paper, or Scissors if I ask them ‘Are you a Friendly AI?’
Omega: Working. [deletions occur] Done. That deleted 1,000 AIs: 1 Friendly AI and 999 Unfriendly AIs, in that order. You will now get 999 utility at the end of this game.
Abner: End Game.
Abner is returned to whatever he was doing, with an additional prize worth 999 utility. He may also gain or lose some utility from knowing that at least one of Omega’s AIs would have referenced Rock, Paper, Scissors when asked ‘Are you a Friendly AI?’, but that is a separate matter from Omega’s prize, and Omega will not include it in his calculations.
While the game does include breathable air, it doesn’t include water or food, so you can’t engage in procedures that would take a very long time to carry out, or you will probably starve.
[An example of such a procedure, which I thought of while specifying the game: ask an AI for every line of its code consecutively, and write each line down. Then delete the AI. If it was Friendly, end the game, go outside, feed your copy of the code into a computer, and run it. If it was Unfriendly, destroy your copy of its code and move on to the next AI.]
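The bracketed procedure can be sketched as a loop over the array. Everything here is hypothetical interface: `ask`, `delete`, and `run` stand in for talking to a boxed AI, pressing its Delete button (reporting whether Omega judged it Friendly), and executing code outside the game.

```python
def transcribe_and_run(ais, ask, delete, run):
    """Copy each AI's code out of the box before deleting it."""
    for ai in ais:
        # Transcribe the AI's source, one line at a time.
        n = ask(ai, "How many lines is your code?")
        code = [ask(ai, f"What is line {i} of your code?") for i in range(n)]
        if delete(ai):       # True if Omega judged it a Friendly AI
            run(code)        # end the game and run the copy outside
            return
        del code             # Unfriendly: discard the copy, move on
```

As the post notes, the weakness of this procedure is time: transcribing even one AI line by line could outlast the player’s food and water.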
With the notes above in mind, how should this game be played?
When you request a mass delete and 1 FAI is deleted along with 999 UFAI, in which order does Omega calculate the points? Does he first remove all points and then award 999, or first award 999 points and then remove all?
My original thought was that it would depend on the order in which they were deleted. So if the FAI was deleted first, all points would be removed, and then the 999 points from deleting the UFAIs would be awarded.
If the UFAIs were deleted first and the FAI was deleted last, then 999 points would be awarded, and then all points would be removed.
I didn’t have a particular sort order in mind for Omega’s AI array, so a more likely scenario is that the FAI would be somewhere in the middle of the list rather than at one of the two ends.
So a better example: if you run a program and Omega deletes 249 UFAI, 1 FAI, and 750 UFAI, in that order, you would have 750 points to potentially cash out after that program, regardless of how much you could have cashed out before.
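Under this reading, a mass delete is just the single-deletion rule applied in array order. A minimal sketch (the function name and example totals are my own, for illustration):

```python
def mass_delete_total(total, deletions):
    """Apply a mass delete in array order.

    deletions: booleans in the order Omega deletes the AIs,
    True where the deleted AI is a judged FAI.
    """
    for judged_fai in deletions:
        # Hitting an FAI wipes the running total; anything else adds 1.
        total = 0 if judged_fai else total + 1
    return total

# 249 UFAI, then 1 FAI, then 750 UFAI: only the last 750 survive,
# regardless of the 10 utility held going in.
print(mass_delete_total(10, [False] * 249 + [True] + [False] * 750))  # 750
```

The same function reproduces both edge cases above: FAI first yields 999, FAI last yields 0.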
And it occurs to me that presumably we can’t give Omega short programs that just directly mention UFAI, or you could just say ‘Delete all UFAI, End game.’
“Delete all AIs such that deleting them would result in you rewarding me with one utility.”