The Amanda Knox Test: How an Hour on the Internet Beats a Year in the Courtroom

Note: The quantitative elements of this post have now been revised significantly.

Followup to: You Be the Jury: Survey on a Current Event

All three of them clearly killed her. The jury clearly believed so as well which strengthens my argument. They spent months examining the case, so the idea that a few minutes of internet research makes [other commenters] certain they’re wrong seems laughable

- lordweiner27, commenting on my previous post

The short answer: it’s very much like how a few minutes of philosophical reflection trump a few millennia of human cultural tradition.

Wielding the Sword of Bayes—or for that matter the Razor of Occam—requires courage and a certain kind of ruthlessness. You have to be willing to cut your way through vast quantities of noise and focus in like a laser on the signal.

But the tools of rationality are extremely powerful if you know how to use them.

Rationality is not easy for humans. Our brains were optimized to arrive at correct conclusions about the world only insofar as that was a necessary byproduct of being optimized to pass the genetic material that made them on to the next generation. If you’ve been reading Less Wrong for any significant length of time, you probably know this by now. In fact, around here this is almost a banality—a cached thought. “We get it,” you may be tempted to say. “So stop signaling your tribal allegiance to this website and move on to some new, nontrivial meta-insight.”

But this is one of those things that truly do bear repeating, over and over again, almost at every opportunity. You really can’t hear it enough. It has consequences, you see. The most important of which is: if you only do what feels epistemically “natural” all the time, you’re going to be, well, wrong. And probably not just “sooner or later”, either. Chances are, you’re going to be wrong quite a lot.

To borrow a Yudkowskian turn of phrase: if you don’t ever—or indeed often—find yourself needing to zig when, not only other people, but all kinds of internal “voices” in your mind are loudly shouting for you to zag, then you’re either a native rationalist—a born Bayesian, who should perhaps be deducing general relativity from the fall of an apple any minute now—or else you’re simply not trying hard enough.

Oh, and another one of those consequences of humans’ not being instinctively rational?

Two intelligent young people with previously bright futures, named Amanda and Raffaele, are now seven days into spending the next quarter-century of their lives behind bars for a crime they almost certainly did not commit.

“Almost certainly” really doesn’t quite capture it. In my previous post I asked readers to assign probabilities to the following propositions:

1. Amanda Knox is guilty (of killing Meredith Kercher)
2. Raffaele Sollecito is guilty (of killing Meredith Kercher)
3. Rudy Guédé is guilty (of killing Meredith Kercher)

I also asked them to guess at how closely they thought their estimates would match mine.

Well, for comparison, here are mine (revised):

1. Small. Hardly different from the prior, which is dominated by the probability that someone in whatever reference class you would have put Amanda into on January 1, 2007 would commit murder within twelve months. Something on the order of 0.01 or 0.1 at most.
2. Ditto.
3. About as high as the other two numbers are low. 0.99 as a (probably weak) lower bound.

Yes, you read that correctly. In my opinion, there is for all intents and purposes zero Bayesian evidence that Amanda and Raffaele are guilty. Needless to say, this differs markedly from the consensus of the jury in Perugia, Italy.

How could this be?

Am I really suggesting that the estimates of eight jurors (two of them professional judges) who heard the case for a year, along with something like 60% of the Italian public and probably half the Internet (and a significantly larger fraction of the non-American Internet), could be off by such a large amount?

Well, dear reader, before getting too incredulous, consider this. How about averaging the probabilities all those folks would assign to the proposition that Jesus of Nazareth rose from the dead, and calling that number x. Meanwhile, let y be the correct rational probability that Jesus rose from the dead, given the information available to us.

How big do you suppose the ratio x/y is?

Anyone want to take a stab at guessing the logarithm of that number?

Compared to the probability that Jesus rose from the dead, my estimate of Amanda Knox’s culpability makes it look like I think she’s as guilty as sin itself.

And that, of course, is just the central one of many sub-claims of the hugely complex yet widely believed proposition that Christianity is true. There are any number of other equally unlikely assertions that Amanda would have heard at mass on the day after being found guilty of killing her new friend Meredith (source in Italian) -- assertions that are assigned non-negligible probability by no fewer than a couple billion of the Earth’s human inhabitants.

I say this by way of preamble: be very wary of trusting in the rationality of your fellow humans, when you have serious reasons to doubt their conclusions.

The Lawfulness of Murder: Inference Proceeds Backward, from Crime to Suspect

We live in a lawful universe. Every event that happens in this world—including human actions and thoughts—is ultimately governed by the laws of physics, which are exceptionless.

Murder may be highly illegal, but from the standpoint of physics, it’s as lawful as everything else. Every physical interaction, including a homicide, leaves traces—changes in the environment that constitute information about what took place.

Such information, however, is—crucially—local. The further away in space and time you move from the event, the less entanglement there is between your environment and that of the event, and thus the more difficult it is to make legitimate inferences about the event. The signal-to-noise ratio decreases dramatically as you move away in causal distance from the event. After all, the hypothesis space of possible causal chains of length n leading to the event increases exponentially in n.
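To make that exponential growth concrete, here is a minimal sketch. The branching factor of 3 is purely an illustrative assumption (nothing about the case fixes it), and `count_causal_chains` and `bits_to_pin_down_chain` are hypothetical helper names introduced only for this example.

```python
import math


def count_causal_chains(branching_factor: int, length: int) -> int:
    """Count distinct causal chains of `length` steps, assuming every step
    has the same number of possible predecessors (a simplification)."""
    if branching_factor < 1 or length < 0:
        raise ValueError("need branching_factor >= 1 and length >= 0")
    return branching_factor ** length


def bits_to_pin_down_chain(branching_factor: int, length: int) -> float:
    """Bits of evidence needed to single out one chain among equally likely ones."""
    return length * math.log2(branching_factor)


# Each extra step of causal distance multiplies the number of hypotheses.
for n in range(1, 6):
    chains = count_causal_chains(3, n)
    bits = bits_to_pin_down_chain(3, n)
    print(f"n={n}: {chains} possible chains, {bits:.1f} bits needed")
```

The point of the sketch is only the shape of the curve: the evidence required grows linearly in n while the number of candidate stories grows exponentially.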

By far the most important evidence in a murder investigation will therefore be the evidence that is the closest to the crime itself—evidence on and around the victim, as well as details stored in the brains of people who were present during the act. Less important will be evidence obtained from persons and objects a short distance away from the crime scene; and the importance decays rapidly from there as you move further out.

It follows that you cannot possibly expect to reliably arrive at the correct answer by starting a few steps removed in the causal chain, say with a person you find “suspicious” for some reason, and working forward to come up with a plausible scenario for how the crime was committed. That would be privileging the hypothesis. Instead, you have to start from the actual crime scene, or as close to it as you can get, and work backward, letting yourself be blown by the winds of evidence toward one or more possible suspects.

In the Meredith Kercher case, the winds of evidence blow with something like hurricane force in the direction of Rudy Guédé. After the murder, Kercher’s bedroom was filled with evidence of Guédé’s presence; his DNA was found not only on top of but actually inside her body. That’s about as close to the crime as it gets. At the same time, no remotely similarly incriminating genetic material was found from anyone else; in particular, there were no traces of the presence of either Amanda Knox or Raffaele Sollecito in the room (and no, the supposed Sollecito DNA on Meredith’s bra clasp just plain does not count; nor, while we’re at it, do the 100 picograms [roughly 15 human cells’ worth] of DNA from Meredith allegedly on the tip of a knife handled by Knox, found at Sollecito’s apartment after the two were already suspects; these two things constituting pretty much the entirety of the physical “evidence” against the couple).

If, up to this point, the police had reasons to be suspicious of Knox, Sollecito, and Guédé, they should have cleared Knox and Sollecito at once upon the discovery that Guédé -- who, by the way, was the only one to have fled the country after the crime—was the one whom the DNA matched. Unless, that is, Knox and Sollecito were specifically implicated by Guédé; after all, maybe Knox and Sollecito didn’t actually kill the victim, but instead maybe they paid Guédé to do so, or were otherwise involved in a conspiracy with him. But the prior probabilities of such scenarios are low, even in general—to say nothing of the case of Knox and Sollecito specifically, who, tabloid press to the contrary, are known to have had utterly benign dispositions prior to these events, and no reason to want Meredith Kercher dead.

If Amanda Knox and Raffaele Sollecito were to be in investigators’ thoughts at all, they had to get there via Guédé -- because otherwise the hypothesis (a priori unlikely) of their having had homicidal intent toward Kercher would be entirely superfluous in explaining the chain of events that led to her death. The trail of evidence had led to Guédé, and therefore necessarily had to proceed from him; to follow any other path would be to fatally sidetrack the investigation, and virtually to guarantee serious—very serious—error. Which is exactly what happened.

There was in fact no inferential path from Guédé to Knox or Sollecito. He never implicated either of the two until long after the event; around the time of his apprehension, he specifically denied that Knox had been in the room. Meanwhile, it remains entirely unclear that he and Sollecito had ever even met.

The hypotheses of Knox’s and Sollecito’s guilt are thus seen to be completely unnecessary, doing no explanatory work with respect to Kercher’s death. They are nothing but extremely burdensome details.

Epistemic Ruthlessness: Following the Strong Signal

All of the “evidence” you’ve heard against Knox and Sollecito—the changing stories, suspicious behavior, short phone calls, washing machine rumors, etc. -- is, quite literally, just noise.

But it sounds so suspicious, you say. Who places a three-second phone call?

As humans, we are programmed to think that the most important kinds of facts about the world are mental and social: facts about what humans are thinking and planning, particularly with regard to other humans. This explains why some people are capable of wondering whether the presence of (only) Rudy Guédé’s DNA in and on Meredith’s body should be balanced against the possibility that Meredith may have been annoyed at Amanda for bringing home boyfriends and occasionally forgetting to flush the toilet; that might have led to resentment on Amanda’s part, you see.

That’s an extreme example, of course—certainly no one here fell into that kind of trap. But at least one of the most thoughtful commenters was severely bothered by the length of Amanda’s phone calls to Meredith. As—I’ll confess—was I, for a minute or two.

I don’t know why Amanda wouldn’t have waited longer for Meredith to pick up. (For what it’s worth, I myself have sometimes, in a state of nervousness, dialed someone’s number, quickly changed my mind, then dialed again a short time later.) But—as counterintuitive as it may seem—it doesn’t matter. The error here is even asking a question about Amanda’s motivations when you haven’t established an evidentiary (and that means physical) trail leading from Meredith’s body to Amanda’s brain. (Or even more to the point, when you have established a trail that led decisively elsewhere.)

Maybe it’s “unlikely” that Amanda would have behaved this way if she were innocent. But is the degree of improbability here anything like the improbability of her having participated in a sex-orgy-killing without leaving a single piece of physical evidence behind? While someone else left all kinds of traces? When you had no reason to suspect her at all without looking a good distance outside Meredith’s room, far away from the important evidence?

It’s not even remotely comparable.
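The comparison can be sketched in odds form, where each piece of evidence multiplies the prior odds by its likelihood ratio. Every number below is a made-up placeholder chosen only to show the mechanics, not an estimate drawn from the case, and `posterior_probability` is a hypothetical helper name.

```python
from typing import Iterable


def posterior_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * product of
    likelihood ratios P(evidence | guilty) / P(evidence | innocent).
    Treats the pieces of evidence as independent, which is a simplification."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical placeholder values, NOT estimates from the case:
PRIOR = 0.001              # assumed prior probability of guilt
ODD_PHONE_CALL = 10.0      # assume the call is 10x likelier under guilt
NO_PHYSICAL_TRACES = 0.01  # assume a traceless scene is 100x likelier under innocence

call_only = posterior_probability(PRIOR, [ODD_PHONE_CALL])
both = posterior_probability(PRIOR, [ODD_PHONE_CALL, NO_PHYSICAL_TRACES])
print(f"after phone call only: {call_only:.5f}")
print(f"after phone call and missing traces: {both:.5f}")
```

Under these assumed ratios the "suspicious" behavior nudges the probability up, and the absence of physical traces more than cancels it. The real work, of course, is in justifying the ratios; the arithmetic is the easy part.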

Think about what you’re doing here: you are invoking the hypothesis that Amanda Knox is guilty of murder in order to explain the fact that she hung up the phone after three seconds. (Remember, the evidence against Guédé is such that the hypothesis of her guilt is superfluous—not needed—in explaining the death of Meredith Kercher!)

Maybe that’s not quite as bad as invoking a superintelligent deity in order to explain life on Earth; but it’s the same kind of mistake: explaining a strange thing by postulating a far, far stranger thing.

“But come on,” says a voice in your head. “Does this really sound like the behavior of an innocent person?”

You have to shut that voice out. Ruthlessly. Because it has no way of knowing. That voice is designed to assess the motivations of members of an ancestral hunter-gatherer band. At best, it may have the ability to pick out the correct murderer from among 2 to 100 possibilities -- 6 or 7 bits of inferential power on the absolute best of days. That may have worked in hunter-gatherer times, before more-closely-causally-linked physical evidence could hope to be evaluated. (Or maybe not; but at least it got the genes passed on.)

DNA analysis, in contrast, has in principle the ability to uniquely identify a single individual from among the entire human species (depending on how much of the genome is looked at; also ignoring identical twins, etc.) -- that’s more like 30-odd bits of inferential power. In terms of epistemic technology, we’re talking about something like the difference in locomotive efficacy between a horsedrawn carriage and the Starship Enterprise. Our ancestral environment just plain did not equip our knowledge-gathering intuitions with the ability to handle weapons this powerful.
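The bit counts above are just base-2 logarithms of the number of candidates being distinguished. A minimal sketch, assuming a round world population of 7 billion (my figure, not one stated in this post) and using a hypothetical helper name `bits_to_single_out`:

```python
import math


def bits_to_single_out(candidates: int) -> float:
    """Bits of evidence needed to pick one individual out of `candidates`
    equally likely possibilities."""
    if candidates < 1:
        raise ValueError("candidates must be at least 1")
    return math.log2(candidates)


BAND_SIZE = 100                   # upper end of the "2 to 100 possibilities" range
WORLD_POPULATION = 7_000_000_000  # assumed round figure for illustration

band_bits = bits_to_single_out(BAND_SIZE)
species_bits = bits_to_single_out(WORLD_POPULATION)
print(f"hunter-gatherer band: {band_bits:.1f} bits")
print(f"entire species: {species_bits:.1f} bits")
print(f"difference: {species_bits - band_bits:.1f} bits")
```

Each additional bit halves the field of candidates, so the gap between the two figures corresponds to a factor of tens of millions in discriminating power.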

We’re talking about the kind of power that allows us to reduce what was formerly a question of human social psychology—who made the decision to kill Meredith? -- to one of physics. (Or chemistry, at any rate.)

But our minds don’t naturally think in terms of physics and chemistry. From an intuitive point of view, the equations of those subjects are foreign; whereas “X did Y because he/she wanted Z” is familiar. This is why it’s so difficult for people to intuitively appreciate that all of the chatter about Amanda’s “suspicious behavior” with various convincing-sounding narratives put forth by the prosecution is totally and utterly drowned out to oblivion by the sheer strength of the DNA signal pointing to Guédé alone.

This rationalist skill of following the strong signal, mercilessly blocking out noise, might be considered an epistemic analog of the instrumental “shut up and multiply”: when much is at stake, you have to be willing to jettison your intuitive feelings in favor of cold, hard, abstract calculation.

In this case, that means, among other things, thinking in terms of how much explanatory work is done by the various hypotheses, rather than how suspicious Amanda and Raffaele seem.

Conclusion: The Amanda Knox Test

I chose the title of this post because the parallel structure made it sound nice. But actually, I think an hour is a pretty weak upper bound on the amount of time a skilled rationalist should need to arrive at the correct judgment in this case.

The fact is that what this comes down to is an utterly straightforward application of Occam’s Razor. The complexity penalty on the prosecution’s theory of the crime is enormous; the evidence in its favor had better be overwhelming. But instead, what we find is that the evidence from the scene—the most important sort of evidence by a huge margin—points with literally superhuman strength toward a mundane, even typical, homicide scenario. To even consider theories not directly suggested by this evidence is to engage in hypothesis privileging to the extreme.

So let me say it now, in case there was any doubt: the prosecution of Amanda Knox and Raffaele Sollecito, culminating in last week’s jury verdict—which apparently was unanimous, though it didn’t need to be under Italian rules—represents nothing but one more gigantic, disastrous rationality failure on the part of our species.

How did Less Wrong do by comparison? The average estimated probability of Amanda Knox’s guilt was 0.35 (thanks to Yvain for doing the calculation). It’s pretty reasonable to assume the figure for Raffaele Sollecito would be similar. While not particularly flattering to the defendants (how would you like to be told that there’s a 35% chance you’re a murderer?), that number makes it obvious we would have voted to acquit. (If a 65% chance that they didn’t do it doesn’t constitute “reasonable doubt” that they did...)

The commenters whose estimates were closest to mine—and, therefore, to the correct answer, in my view—were Daniel Burfoot and jenmarie. Congratulations to them. (But even they were off by a factor of at least ten!)

In general, most folks went in the right direction, but, as Eliezer noted, were far too underconfident, evidently the result of an exorbitant level of trust in juries, at least in part. But people here were also widely making the same object-level mistake as (presumably) the jury: vastly overestimating the importance of “psychological” evidence, such as Knox’s inconsistencies at the police station, as compared to “physical” evidence (only Guédé’s DNA in the room).

One thing that was interesting and rather encouraging, however, is the amount of updating people did after reading others’ comments—most of it in the right direction (toward innocence).

[EDIT: After reading comments on this post, I have done some updating of my own. I now think I failed to adequately consider the possibility of my own overconfidence. This was pretty stupid of me, since it meant that the focus was taken away from the actual arguments in this post, and basically toward the issue of whether 0.001 can possibly be a rational estimate for anything you read about on the Internet. The qualitative reasoning of this post, of course, stands. Also, the focus of my accusations of irrationality was not primarily the LW community as reflected in my previous post; I actually think we did a pretty good job of coming to the right conclusion given the information provided—and as others have noted, the levelheadedness with which we did so was impressive.]

For most frequenters of this forum, where many of us regularly speak in terms of trying to save the human species from various global catastrophic risks, a case like this may not seem to have very many grand implications, beyond serving as yet another example of how basic principles of rationality such as Occam’s Razor are incredibly difficult for people to grasp on an intuitive level. But it does catch the attention of someone like me, who takes an interest in less-commonly-thought-about forms of human suffering.

The next time I find myself discussing the “hard problem of consciousness”, thinking in vivid detail about the spectrum of human experience and wondering what it’s like to be a bat, I am going to remember—whether I say so or not—that there is most definitely something it’s like to be Amanda Knox in the moments following the announcement of that verdict: when you’ve just learned that, instead of heading back home to celebrate Christmas with your family as you had hoped, you will be spending the next decade or two—your twenties and thirties—in a prison cell in a foreign country. When your deceased friend’s relatives are watching with satisfaction as you are led, sobbing and wailing with desperation, to a van which will transport you back to that cell. (Ever thought about what that ride must be like?)

While we’re busy eliminating hunger, disease, and death itself, I hope we can also find the time, somewhere along the way, to get rid of that, too.

(The Associated Press reported that, apparently, Amanda had some trouble going to sleep after the midnight verdict.)

I’ll conclude with this: the noted mathematician Serge Lang was in the habit of giving his students “Huntington tests”—named in reference to his controversy with political scientist Samuel Huntington, whose entrance into the U.S. National Academy of Sciences Lang waged a successful campaign to block on the grounds of Huntington’s insufficient scientific rigor.

The purpose of the Huntington test, in Lang’s words, was to see if the students could “tell a fact from a hole in the ground”.

I’m thinking of adopting a similar practice, and calling my version the Amanda Knox Test.

Postscript: If you agree with me, and are also the sort of person who enjoys purchasing warm fuzzies separately from your utilons, you might consider donating to Amanda’s defense fund, to help out her financially devastated family. Of course, if you browse the site, you may feel your (prior) estimate of her guilt taking some hits; personally, that’s okay with me.