FAWS comments on Contrived infinite-torture scenarios: July 2010

FAWS 25 Jul 2010 12:10 UTC
2 points
0
I said as many times, not as much as possible. The AI might value that particular kind and degree of annoyance uniquely, say as a failed FAI that was programmed to maximize rich, not strongly negative human experience according to some screwed up definition of rich experiences, and according to this definition your state of mind between reading and replying to that message scores best, so the AI spends as many computational resources as possible on simulating you reacting to that message.

Or perhaps it was supposed to value telling the truth to humans, there is a complicated formula for evaluating the value of each statement, due to human error it values telling the truth without being believed higher (the programmer thought non-obvious truths are more valuable), and simulating you reacting to that statement is the most efficient way to make a high scoring true statement that will not be believed.

Or it could value something else entirely that’s just not obvious to a human. There should be an infinite number of non-contradictory utility functions valuing doing what it supposedly did, even though the prior for most of them is pretty low (and only a small fraction of them should value still simulating you now, so by now you can be even more sure the original statement was wrong than you could be then for reasons unrelated to your deduction)