We Are Less Wrong than E. T. Jaynes on Loss Functions in Human Society

These paragraphs from E. T. Jaynes’s Probability Theory: The Logic of Science (in §13.12.2, “Loss functions in human society”) are fascinating from the perspective of a regular reader of this website:

We note the sharp contrast between the roles of prior probabilities and loss functions in human relations. People with similar prior probabilities get along well together, because they have about the same general view of the world and philosophy of life. People with radically different prior probabilities cannot get along—this has been the root cause of all the religious wars and most of the political repressions throughout history.

Loss functions operate in just the opposite way. People with similar loss functions are after the same thing, and are in contention with each other. People with different loss functions get along well because each is willing to give something the other wants. Amicable trade or business transactions, advantageous to all, are possible only between parties with very different loss functions. We illustrated this by the example of insurance above.

(Jaynes writes in terms of loss functions for which lower values are better, whereas we more often speak of utility functions for which higher values are better, but the choice of convention doesn’t matter—as long as you’re extremely sure which one you’re using.)
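
In symbols (mine, not Jaynes's): setting $L(x) = -U(x)$ makes the two conventions interchangeable, since $\arg\min_x L(x) = \arg\max_x U(x)$; they differ only by a sign.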

The passage is fascinating because the conclusion looks so self-evidently wrong from our perspective. Agents with the same goals are in contention with each other? Agents with different goals get along? What!?

The disagreement stems from a clash of implicit assumptions. On this website, our prototypical agent is the superintelligent paperclip maximizer, with a utility function about the universe—specifically, the number of paperclips in it—not about itself. It doesn’t care who makes the paperclips. It probably doesn’t even need to trade with anyone.

In contrast, although Probability Theory speaks of programming a robot to reason as a rhetorical device[1], this passage seems to suggest that Jaynes hadn't thought much about how ideal agents might differ from humans? Humans are built to be mostly selfish: we eat to satisfy our own hunger, not as part of some universe-spanning hunger-minimization scheme. Besides being favored by evolution, selfish goals do offer some conveniences of implementation: my own hunger can be computed as a much simpler function of my sense data than someone else's. If one assumes that all goals are like that, then one reaches Jaynes's conclusion: agents with similar goal specifications are in conflict, because the specified objective (for food, energy, status, whatever) binds to each agent's own state rather than to a shared model of the world.
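
To make the contrast concrete, here is a minimal sketch (mine, not anything Jaynes wrote; `WorldState`, `own_hunger_loss`, and `paperclip_utility` are invented names for illustration). The same “selfish” specification, instantiated by two agents, binds to different parts of the world and puts them in competition over the same scarce food, whereas a world-directed utility is literally the same function of the world for every agent that holds it.

```python
from dataclasses import dataclass, field

@dataclass
class WorldState:
    food: dict[str, float] = field(default_factory=dict)  # food held by each agent
    paperclips: int = 0                                    # a fact about the world at large

# "Selfish" goal: every agent gets the same specification, but it binds to that
# agent's own slot in the world state, so two agents running it compete for the
# same scarce food.
def own_hunger_loss(agent_id: str, world: WorldState) -> float:
    return -world.food.get(agent_id, 0.0)  # less food for *me* means higher loss

# World-directed goal: a function of the world alone; every agent holding it
# computes the same number and is glad of other agents who also hold it.
def paperclip_utility(world: WorldState) -> float:
    return float(world.paperclips)

world = WorldState(food={"alice": 1.0, "bob": 0.0}, paperclips=3)

print(own_hunger_loss("alice", world))  # -1.0: alice's loss is low; she is content
print(own_hunger_loss("bob", world))    # -0.0: bob's loss is higher; he wants alice's food
print(paperclip_utility(world))         # 3.0: the same value for every paperclip maximizer
```

Jaynes's conclusion is about the first kind of function; the superintelligent paperclip maximizer's utility function is the second kind.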

But … the assumption isn't true! Not even for humans, really—sometimes people have “similar loss functions” that point at goals outside of themselves, goals that are better served the more agents share them: two people who both want malaria eradicated are allies, not rivals. Jaynes is being silly here.

That said—and no offense—the people who read this website are not E. T. Jaynes; if we can get this one right where he failed, it’s because our subculture happened to inherit an improved prior in at least this one area, not because of our innate brilliance or good sense. Which prompts the question: what other misconceptions might we be harboring, due to insufficiently general implicit assumptions?


  1. ↩︎

    Starting from §1.4, “Introducing the Robot”:

    In order to direct attention to constructive things and away from controversial irrelevancies, we shall invent an imaginary being. Its brain is to be designed by us, so that it reasons according to certain definite rules. These rules will be deduced from simple desiderata which, it appears to us, would be desirable in human brains; i.e., we think that a rational person, on discovering that they were violating one of these desiderata, would wish to revise their thinking.