War in Iran by 2016 might be a possible candidate.
rolf_nelson
I’ve created a rebuttal to komponisto’s misleading Amanda Knox post, but don’t have enough karma to create my own top-level post. For now, I’ve just put it here:
“So I was very surprised to find Adams was a believer in and evangelist of something that sounded a lot like pseudoscience.”
Yep. The Dilbert Future isn’t online so you can’t see the nonsense directly, but to get a feeling for what Adams was like before he started backpedaling recently:
(http://www.reall.org/newsletter/v05/n12/scott-adams-responds.html)
Unflattering but (from memory) accurate description of The Dilbert Future here:
Thanks for another well-researched reply, let’s have a couple more posts on this, and then turn to the DNA for a bit.
On the other hand, if we do take systemic uncertainty into account (as we ultimately must), a shift of 15:1 or even 5:1 would be significant, given your estimate of .95 probability of guilt, or 19:1 odds.
The problem is that systemic uncertainty works both ways. If I see there being, say, 10 times as much evidence for guilt as there is for innocence, I’ll still cap the probability of guilt at .95 anyway, due to systemic uncertainty. If I change my mind and decide there was 5, or 20, times as much evidence for guilt, the basic conclusion won’t change.
To look at it another way, I expect that if we examine ten pieces of evidence as to whether the Earth is flat, on average one of the pieces can easily point to the Earth being flat at a 10:1 ratio by chance. You would need to either have a much stronger piece of evidence among the first ten pieces, or else have more than one of the pieces point to the Earth being flat, to show that something is true given the first ten pieces of evidence.
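To sketch the principle behind this (a toy illustration with numbers of my own choosing, not case data): when a null hypothesis is actually true, the chance that the data favors an alternative by a likelihood ratio of k:1 or more is at most 1/k, which is why ten pieces of evidence can easily contain one spurious 10:1 result, but rarely much worse.

```python
from math import comb

# Toy example: 10 flips of a fair coin (H0) scored against a biased
# coin with p = 0.8 (H1).  How often does chance alone produce data
# favoring H1 by 10:1 or more?
n, p0, p1, k = 10, 0.5, 0.8, 10.0

prob_spurious = 0.0
for h in range(n + 1):                       # h = number of heads
    p_h0 = comb(n, h) * p0**h * (1 - p0)**(n - h)
    p_h1 = comb(n, h) * p1**h * (1 - p1)**(n - h)
    if p_h1 / p_h0 >= k:                     # data favors H1 at >= 10:1...
        prob_spurious += p_h0                # ...even though H0 is true

print(prob_spurious)   # 11/1024 ~ 0.011, within the universal 1/k = 0.1 bound
```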
How much slippage do you think may have occurred?
There are a ton of factors here. I’ll guess that if there’s slippage, it’s about 50% that the entire contents would slip; probably our digestive process is evolutionarily designed to make the food pass through easily by that stage. Probably another 50% that a suspiciously large amount of food would be found in the small intestine. I could narrow it down more if I knew how large the volume of the first part of the small intestine to the first bend is, whether the first part of the small intestine was searched, whether the rest of the small intestine was searched, how fast food is evacuated from the duodenum, whether food keeps getting evacuated from the duodenum after the stress of being threatened with a knife occurs, whether peristalsis continues to push food through the small intestine after stress, whether diffusion of food through the small intestine walls continues after stress or even death, and how fast peristalsis and diffusion work.
I’ll add that a search for ‘”empty duodenum” forensics’ suggests to me that, as far as I can tell, almost nobody except for Amanda Knox’s defense has ever cared whether a duodenum was empty or not. That probably puts an upper bound on how useful this evidence is; if it were reliable, I would expect it to come up more often in online appeals-court decisions, and in trial reporting. I also can’t find any literature on this, which is odd if it’s a useful way to narrow down time of death. So based on the “evidence of absence”, let me propose the following hypothesis:
Vacant Duodenum Hypothesis: “An empty duodenum is not, by itself, definitive proof for or against any time-of-death. The main reason to search the duodenum is in hopes of actually finding food there; no matter what the time-of-death scenario, there is always at least a 1/10 chance that the duodenum will be empty when examined.”
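One Bayesian consequence of this hypothesis, sketched with its own 1/10 floor (an illustration of the hypothesis, not a claim about the actual forensics): if every scenario gives the duodenum at least a 1/10 chance of being empty, then observing an empty duodenum can never shift the odds between two scenarios by more than 10:1.

```python
# Cap on the likelihood ratio implied by the hypothesis's 1/10 floor.
p_empty_floor = 0.10     # minimum P(empty | scenario), per the hypothesis
p_empty_ceiling = 1.0    # at most certainty under the most favorable scenario

# P(empty | H1) / P(empty | H2) is largest when the numerator is 1.0
# and the denominator sits at the floor.
max_bayes_factor = p_empty_ceiling / p_empty_floor
print(max_bayes_factor)  # 10.0: the most an empty duodenum can shift the odds
```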
If you can find a reference to support the idea that a lag time in excess of four or even three hours would not be highly unusual for a small-to-moderate pizza meal eaten by a healthy adult human, I will update appropriately.
So far, we don’t have data either way about lag times (not median) for a pizza, nor how a follow-up snack affects it. BTW do you know something I don’t about the size of the pizza?
Conversely, if you can’t (and I haven’t been able to so far), I don’t see how you can derive the level of uncertainty you need to make the Massei theory plausible in the face of all the other data. Even acknowledging the wide variation in lag times depending on the type of meal used in the studies, they are all on the short end; there is no indication, anywhere (that I have come across), of the kind of extremes that we would need at the long end.
So far there’s no indication of >180 or even >120 either, right? Is the main point of disagreement that if you see the numbers:
10, 25, 23, 82, 48
and if a genie tells you the next number is above 150, then you’re saying “it’s almost certainly between 150 and 180!” and I’m saying “these numbers are all over the place, it’s more likely to be near 150 than near 300, but there’s a significant chance it’s a lot bigger than 150.”
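The disagreement can be made concrete (my own sketch; the only case data used are the five numbers above): fit those numbers with a thin-tailed normal model and a heavy-tailed log-normal model, then ask each model for the chance that a value known to exceed 150 also exceeds 300.

```python
from math import log, sqrt, erf

data = [10, 25, 23, 82, 48]   # the lag-time figures quoted above

def norm_sf(x, mu, sigma):
    """P(X > x) for a normal distribution."""
    return 0.5 * (1 - erf((x - mu) / (sigma * sqrt(2))))

def fit(xs):
    """Sample mean and standard deviation."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)
    return mu, sqrt(var)

mu_n, sd_n = fit(data)                      # normal fit
mu_l, sd_l = fit([log(x) for x in data])    # log-normal fit (normal in log-space)

# P(next value > 300 | next value > 150) under each model
cond_normal = norm_sf(300, mu_n, sd_n) / norm_sf(150, mu_n, sd_n)
cond_lognormal = norm_sf(log(300), mu_l, sd_l) / norm_sf(log(150), mu_l, sd_l)

print(cond_normal)      # effectively zero: "almost certainly between 150 and 180"
print(cond_lognormal)   # roughly 0.09: "a significant chance it's a lot bigger"
```

The normal model makes anything past 150 a multi-sigma miracle, so conditional on the genie it crams everything just past the threshold; the log-normal model keeps nearly a tenth of its conditional mass past 300.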
I’m confused about the notation T(50): does this refer to half-time, or total emptying time? Because the 317 minutes for fried pasta was total emptying time.
My bad, I misread the abstract. Doesn’t significantly change the scenarios though.
Unfortunately, one of the many scandals of this case is that the body temperature measurement was delayed until 12 hours after the discovery of the body, limiting its usefulness.
So are you claiming that Meredith’s weight before losing blood was 57kg, or just pointing out that a weight of 50-55 kg only shifts us by about 10:1?
Dang, civil-case reversal rates are much higher than in the U.S. (http://scholar.google.com/scholar?cluster=11027117874758072323), I still can’t find anything on criminal cases though. komponisto said about 1/3, any cite on that?
There are lots of interesting high-profile Italian murders on Wikipedia, but after excluding those related to the mafia, terrorism, or serial killers, there’s not much recent activity left. Still, three of the ones I found (the Cogne homicide, the Novi Ligure murder, and the “Beasts of Satan”) were partially or fully upheld, and the fourth (Nicholas Green) was reversed from acquittal to conviction. (I guess there’s no double-jeopardy protection in Italy, since that would deprive them of additional opportunities to reverse.) So I’ll poke around a bit more when I get a chance, but so far a 50/50 bet is feeling moderately advantageous to me, even with the DNA review results.
Hey Kevin, thanks for pinging me, sounds exciting. I’d bet Knox’s odds of release on appeal are only somewhat better than the average guilty defendant’s, that “somewhat better” being based on her having a more expensive legal and PR team than the average defendant. Can’t find such info easily though; I’ll google around tonight. Wikipedia says we’re still on the first of the two mandatory appeals; do you mean released on the first appeal or on any appeal? What if it’s remanded back to the lower court? Also, I assume you mean ‘released on the murder charge’, not on libel (although that might be “time served” anyway by that time.)
For tax purposes, wagering a charity donation would probably be better, but cash might be doable, I’ll need to think about it. Let me anyway research tonight how favorable p(successful appeal | guilt) looks to me.
That said, from a “getting to the truth” perspective, I still think a 1-on-1 debate is a better way of getting to the truth in the Kercher case than this wager, given the additional uncertainty of p(successful appeal | guilt).
I completely agree with you that surrounding my statements with self-doubt would have increased my karma.
However, I do not agree that this is a sufficient reason to surround my statements with self-doubt. As I said, I don’t care about karma.
I do care about things that often correlate with karma, such as accuracy and insight. If there is evidence that surrounding these statements with self-doubt will increase my accuracy, I will do it. Therefore, I look forward to any evidence proffered that my claim C1 is incorrect (such as an argument that one of komponisto’s four statements I’ve charged to be misleading is, in fact, correct). So far, I have not heard any such evidence in this thread.
Try to look at the current voting pattern on the comments to “Amanda Knox Test” and tell me there’s not a correlation between favoring Knox’s innocence and getting upvoted. (Don’t forget to load all the comments so you see the people who are negative despite making reasoned comments about the case.)
I would sooner hypothesize that Meredith’s last meal actually took place closer to 19:00 than 18:00, given the vagueness of the testimony on the matter. This puts her within 2 standard deviations, perhaps even 1.5.
If we model the meal start-time as a normal distribution, then it’ll be simple to add it to the model and combine it with the other sources of uncertainty, since the sum of two independent normal distributions is a new normal distribution with a variance equal to the sum of the variances. Though now that I mention it, a lot of the other bits of uncertainty might be somewhat log-normal, because they might multiply the time rather than add to it.
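A quick sketch of the variance-addition point (placeholder numbers, not estimates from the case):

```python
from math import sqrt

# If meal start-time uncertainty is N(0, 15^2) minutes around our best
# guess, and gastric lag time is independently N(90, 30^2) minutes,
# their sum is N(90, 15^2 + 30^2): means add, and variances add.
mu_meal, sd_meal = 0.0, 15.0
mu_lag, sd_lag = 90.0, 30.0

mu_total = mu_meal + mu_lag
sd_total = sqrt(sd_meal**2 + sd_lag**2)

print(mu_total, sd_total)   # 90.0, ~33.5 -- noticeably less than 15 + 30 = 45
```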
But, granting a non-normal distribution, it’s really difficult for me to see how it could significantly work against Raffaele, given where the 25th and 75th percentiles are. Probability mass would have to be transferred to the extreme right tail from somewhere else; how do you propose to do this in a way that isn’t specifically tailored to yield the desired bottom line?
To give two contrasting examples, something like female heights (http://www.johndcook.com/blog/2008/07/20/why-heights-are-not-normally-distributed/) would work against Raffaele because outliers are few and extreme, while a gently bimodal distribution like human heights (http://www.johndcook.com/mixture_distribution.html) might work in Raffaele’s favor because of a concentration in the center.
My questions, in that case, are:
(1a) What does your gastric lag-time model look like, such that you don’t get significantly more surprised by going out to 22:00 than 21:00?
Good question. Let me look here at some more papers. One source of uncertainty is that I don’t know if we care in this case about 2% or 10% or something else.
The first completely-ungated study I found in Google shows 10 minutes for a 2% decrease (http://jnm.snmjournals.org/content/32/7/1349.full.pdf).
Second study shows 25 minutes for a 10% decrease (http://jnm.snmjournals.org/content/32/7/1349.full.pdf).
Third study shows 23 minutes using multiple methods (http://jnm.snmjournals.org/content/37/10/1639.full.pdf).
The gated study you cited shows 81.5 minutes using unknown-to-me methods, perhaps the meal was larger or different from the other studies.
So I guess I would reluctantly discard the concept of attempting solely normal distributions, since this already is looking too right-tailed. This is too complex for me to model easily; I can only say that, intuitively, even if we use 19:00, then if a genie tells me it’s at least 120 minutes, I wouldn’t be much more surprised by 150 minutes or 180 minutes. The first three studies above gave figures of 10, 25, and 23 minutes, and then your example jumped to more than 3x the highest figure so far. So jumping again to even 3x of your number wouldn’t be more than a one-in-ten surprise, especially given the numerous factors I’ve itemized.
(2) What is your probability of guilt, conditioned on death having occurred (a) before 21:30? (b) before 22:00?
If we’re not taking systemic uncertainty into account, then it’s still going to be quite a large probability of guilt. However, I would say that, compared with 23:00, (a) would shift me by about 15:1 on the grounds that the computer evidence would have to be mis-analyzed, or (more likely) Raffaele would have had to manufacture the computer alibi (recall Raffaele is a computer engineer), and (b) by 5:1 on the grounds that the timetable gets a bit tighter than in the 23:00 case. Keep in mind that I’m currently not yet bothering to weigh the eyewitness testimony at all in my assessment of guilt.
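For concreteness, here is what those shifts do to the 19:1 prior odds mentioned earlier (my own arithmetic on the numbers stated in this thread):

```python
def update(prior_prob, shift_toward_innocence):
    """Convert a probability to odds, apply a likelihood ratio
    favoring innocence, and convert back to a probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds / shift_toward_innocence
    return posterior_odds / (1 + posterior_odds)

p = 0.95                 # the 19:1 odds of guilt discussed above
print(update(p, 15))     # (a) death before 21:30: ~0.56
print(update(p, 5))      # (b) death before 22:00: ~0.79
```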
Slippage is a priori unlikely, especially with the ligatures applied (professional opinion), and hence given a level of gastric contents consistent with the meal in question, there’s no reason to believe any significant slippage occurred.
I believe the independent court expert more than hearsay that an unknown FRCPath claimed that, even without ligatures, complete slippage is “well-nigh impossible”.
And note this: “The lag phases after 4 and 10% (v/v) ethanol, beer, and red wine were not significantly different from that of water… the inhibitory effect of ethanol and alcoholic beverages is mainly induced by a prolongation of the gastric emptying phase (without affecting the lag phase)...”
That’s a good point, so I hereby drop the alcohol point altogether for the non-slippage case.
Here is another source characterizing any lag time over 150 minutes as “extremely delayed”. By comparison, “normal” is 50-100 min and “delayed” is 100-150. For half-emptying time, over 200 minutes is “extremely delayed”.
This seems to be for small easily-digested test meals, as far as I can tell. No hospital is going to serve a patient a pizza to determine how well their diabetes is under control. ;-)
Just how large do you think the standard deviation is? If you believe in the Massei theory, you have to come up with a lag time of four hours at minimum. I can’t find any evidence that that is anywhere close to being within normal human parameters. Can you?
I see that large, fried, and/or starchy meals have much larger T(50) times than other meals, and I don’t have any lag times for those. Since T(50) times are frequently unexpectedly large, and since lag times correlate in some large but unknown way with T(50) times, I infer a significant probability that lag times are frequently unexpectedly large as well.
Let me float one scenario. I’d presume that starch increases the T(50) time so much because it can take a long time for large amounts of starch to convert to sugar in the stomach. Does almost the entire portion of starch need to get converted to sugar before any starch can go to the duodenum? If so, then the lag time for a large starch meal would be close to the T(50) time.
On the other hand, if you want to believe the time of death was earlier, you run into other problems...
Sounds like a whole other discussion.
So what is your probability distribution for time of death?
Based on just stomach evidence, and ignoring expert testimony, I’d have to say it most likely happened around 19:00. So that’s not very useful.
If we take a leap of faith and use the 317-minute T(50) for 700 kcal fried pasta but don’t believe the starch needs to convert first, then I’d revert to a 1/4 guess for lag time, on the basis that the ratio decreases as T(50) grows, resulting in 80 +/- 6 minutes; so that model fails for me as well, dang it.
Factoring in that it wasn’t before 21:00, but still ignoring expert testimony, I’ll have to take an “inside view” and try to generate hypotheses as to why it took so long. I’ll currently guess that, to get us out to 21:00, either the starch needs to convert to sugar first (40%), or else there was slippage after the body was discovered (5%), or there was slippage when the body was moved by one or more perps before being discovered by the police and “ligatured” (55%). I’m open to other suggestions. Unfortunately the gated 81-minute median study isn’t currently helpful in this regard, because I have to ask myself: why was this study 81 minutes, instead of the others that were 25 or 10 or 40 minutes? But if we can find out whatever X factor increased it to 81 minutes, then we might be able to judge how much of that X factor we had in our case, and whether we had more or less X factor than in the study. Anyway, overall I’ll guess 30% for 21:00-21:30, 20% for 21:30-22:00, 25% for 22:00-23:00, 7% for 23:00-23:30.
Now let’s factor in expert testimony. Since none of our models are working very well, and since the literature that I’ve seen doesn’t converge on a single simple model anyway, I think in the end I’ll go with the independent expert testimony. The experts have access to gated medical journals and even some kind of summary chart of different times under different situations in the literature, as well as forensic experience, which I don’t have. They also get to factor in the body temperature, which I’ve been ignoring.
So making progress would probably require us to pick a small number of narrowly-defined issues to hash out, one at a time.
Sounds good, like you suggested let’s cover the time of death, and also continue to go deep on the question of lab contamination.
Here’s an important question to assess whether we’ve said anything important yet: has anything I’ve said surprised you?
It hasn’t been predictable, but it hasn’t caused me to shift significantly in favor of innocence or guilt so far. I did learn I was wrong about which knife Amanda reacted strongly to, but that’s within the bounds of how many errors I expected to be making here.
I suspect I could probably get you to agree that it would be extremely unusual for no food to have passed into the duodenum 5 hours after a meal (as required by the prosecution theory), even conditioned on the already unusual fact of none having passed after 150 minutes. However, I can’t predict how far you will lower your probability of guilt as a result.
I haven’t looked into this much. According to Massei, Umani Ronchi, a court-appointed expert, testified that a farinaceous meal takes 6-7 hours for gastric emptying, and additionally that it’s possible some of the food passed into the duodenum but then, after death, slid into the small intestine. Massei also claims that even Vinci agreed with the range of 18:50 − 4:50 for time of death. Did the defence experts take into account the composition of the meal, or testify that sliding of the food after death is unlikely?
I don’t think I would have a problem positing that the expert report constitutes 50:1 evidence in favor of contamination, possibly much more.
The sample in question (Trace B) tested negative for blood, as did every other sample taken from the blade. (Samples from the handle were not tested for blood.) No attempt was made to scientifically determine the actual nature of the alleged biological material.
OK, what are the odds that a small DNA trace left by “stabbing + cleaning” would test positive for blood, and what are the odds a small DNA contamination of the knife would test positive for blood? (By the way, do you have a specific contamination hypothesis in mind?) In both cases, keep in mind only one “small zone” of the striation was tested for blood, and the rest of the striation was consumed in DNA analysis.
When “quantification” (test to determine whether there was enough DNA to be analyzed) was performed, Traces B and C both yielded a result of “too low”. Stefanoni reported Trace B as a positive result, and Trace C as a negative result, without any justification. There is no documentation in the lab data to support her statement in court that the Trace B sample was in the range of several hundred picograms. Stefanoni also claimed to have executed steps in the quantification procedure that are not documented.
Sounds like she didn’t document everything; how much are you shifting based on this? Part of the problem is I don’t know how much the average technician documents, so I don’t know how usual or unusual this is. If nobody documents everything, but we still see a .02/homicide contamination rate, then Stefanoni’s not documenting doesn’t change anything.
The “amplification” (chemical copying of the sample in order to produce a large enough amount for analysis) was performed only once, despite the fact (admitted by Stefanoni) that it should be repeated in order to be considered reliable.
Can I get a source for Stefanoni’s admission? Is this from the report?
Stefanoni did not perform negative controls, which could have indicated the presence of contamination.
Is this also from the report? Have you translated this part yet?
The sample was analyzed in the same laboratory at the same time as numerous samples containing Meredith Kercher’s DNA.
I gave a .05 chance that, if there was a cross-contamination, it would have been of Meredith’s DNA. Are you giving a different probability?
Thanks k, today I’ll give my thoughts on the knife. I’m sure there are some mistakes in my analysis below, but let’s see if we can start to pinpoint areas of disagreement. Let “ddk.lc” be the specific hypothesis that the double-DNA knife was accidentally contaminated in the laboratory, and “a.g” be the hypothesis that Amanda is guilty.
I want to estimate the base rate of lab cross-contamination in the late 2000s. Two observations:
Looks like Washington State only admitted to one case of laboratory cross-contamination in homicide cases from around 2001-2003, based on SPI. Maybe there were 2 to 5 that weren’t noticed or otherwise went unreported. Washington State has about 200 homicides/year; I’d guess about 1/3 go to state labs?
Cases of mistaken “cold matches” that are investigated by the police but turn out to be cross-contamination seem to be extremely rare.
Contamination rates have probably declined slightly since 2001-2004 (widespread DNA forensics is relatively new), so I’m guessing a base rate of about one accidental cross-contamination per 50 homicides.
(Can that base rate be applied to this case? I could lower it a little based on the first independent report [EDIT: Sorry, I mean the trial judges’ sentencing report] affirming the results, on it being a high-profile case, on the defense being allowed to participate in the testing but declining to fully participate, on no “smoking gun” piece of sloppiness found, and on the defense not stressing any history of past contamination. I could raise it a little based on the second independent report criticizing the lab for not following “international protocols” (though I wouldn’t particularly expect them to), and on the low-count DNA. Not having much data here, I’ll stick with the base rate as my “wild-ass guess”.)
Here are some more WAGs. If there’s a single lab contamination, the odds that it contaminates a likely murder weapon at the cottage are about .002. The odds that the DNA spread is Meredith’s, rather than one of the many people associated with this case or with other cases the lab is processing in parallel, are about .05.
So I assess P(lc) as .02, and P(ddk.lc | lc) as .0001, which gives P(ddk.lc) as .000002, assuming complete innocence.
In contrast, I estimate P(ddk | a.g) is about .05. So if we exclude the large “systemic uncertainty” of my analysis, the DNA evidence on the double-DNA knife alone would make me shift by a factor of 25,000 to 1 in favor of a.g rather than ddk.lc.
Let me touch on the circumstantial evidence around the knife, with the caveat that there’s even more systemic uncertainty than in my DNA analysis.
The knife was on top of the other knives, and matched one of the murder weapons: meh, shift by a factor of 2 to 1 in favor of a.g over ddk.lc.
The same knife was bleached, and the other knives weren’t: Shift by a factor of 5 to 1.
Raffaele, at one point, claimed that Meredith visited his cottage and pricked herself on the knife: Shift by a factor of 100.
Amanda’s reaction to the knife: Shift by 10.
Where Amanda’s DNA was found on the handle suggests someone stabbing rather than cooking with it: meh, shift by 2.
Amanda wrote in her diary speculating whether Raffaele may have framed her by pressing the knife in her hand while she slept: Shift by 10, this is not something you’d write if you knew the knife isn’t a murder weapon.
So for the non-DNA evidence around the knife I give a slightly larger shift (200000 to 1), but paradoxically I assign it less importance because there’s more systemic uncertainty and more guess-work on my part, compared with the DNA evidence.
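The combined figure comes from multiplying the six individual shifts, which treats them as independent likelihood ratios (a strong assumption, and one reason for the extra systemic uncertainty):

```python
from math import prod

# The six circumstantial shifts estimated above, treated as
# independent likelihood ratios favoring a.g over ddk.lc.
shifts = {
    "knife on top, matched a murder weapon": 2,
    "only that knife bleached": 5,
    "Raffaele's pricked-finger story": 100,
    "Amanda's reaction to the knife": 10,
    "handle DNA placement": 2,
    "Amanda's diary speculation": 10,
}

combined = prod(shifts.values())
print(combined)   # 200000, i.e. the 200000-to-1 figure quoted above
```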
That’s a problem that would presumably exist to some extent on any public forum. I’m not too bothered by it myself; my feeling is that in such a situation one is not necessarily obliged to reply to all comments individually.
We could access-control the main section of a Google Group or a mini-blog, creating a separate area for comments if we like. That would also be convenient because I can easily notice if you’ve posted when I check Google Reader.
Here’s my suggestion: why don’t we try it here first, and see how it works? I’d be interested to see if this kind of thing can work on LW.
That is an excellent reason to do it here, then; go ahead and create a new thread.
Interestingly, I agree with you that the knife and bra clasp are the strongest pieces of prosecution evidence (though I would have put them in the reverse order). However, they’ve been pretty severely punctured by Conti and Vecchiotti in their report. You can read the conclusions of that report here.
If their main point is that the evidence doesn’t meet the standard of scientific rigor, then I might not disagree with them on anything factual. Very little evidence, either way, does meet the standard of scientific rigor. Fingerprints never reach the standard of scientific rigor. DNA testing, as practiced, probably rarely if ever meets the standard of scientific rigor. Eyewitness testimony obviously can never come close to meeting the standard of scientific rigor. Heck, most science doesn’t meet the standard of scientific rigor. We still need to evaluate evidence on its full merits.
So, to confirm that our initial analysis here diverges, around how much are you currently shifting based on the test results for the knife and for the bra clasp? For you, did either one shift P(guilt) by a factor of 100? 10? Not at all? It’s a difficult question to answer in a calibrated way, so if you want to skip that one I’ll understand.
You can guess my hypothesis for why the DNA tests came out the way they did. Do you have a specific alternative hypothesis or hypotheses you want me to consider? Is your main claim here that you believe the knife was accidentally contaminated in the laboratory, and the bra strap was accidentally contaminated in Kercher’s room? Or is there a different alternative hypothesis I should consider first?
I assume we’re not going to quash any evidence, since we’re a court of Bayes and not a court of law? That is, I’m proposing that whenever we want to exclude or diminish evidence, we should have a Bayesian reason for why the evidence doesn’t really alter P(guilt). The proposal is partly because it’s the correct Bayesian thing to do, and partly because trying to divine and mimic Italian criminal procedure and admissibility rules would add (IMHO unnecessary) additional complexity.
So, komponisto, Kevin, Pavitra, or anyone else, any general thoughts on how to calculate p(K | guilt) or p(K | innocence)? (K meaning Kevin’s claim, that Knox will be released on appeal).
I agree that AI deterrence will necessarily fail if:
(1) all AIs modify themselves to ignore threats from all agents (including ones they consider irrational), and
(2) any deterrence simulation counts as a threat.
Why do you believe that either or both of these statements are true? Do you have some concrete definition of ‘threat’ in mind?
Matt wrote:
Here’s a source for the ‘three unidentified individuals’ DNA’ claim:
Thanks Matt. While my claim that there are not three unidentified individuals’ DNA on the strap is tangential to C1, I will back it up anyway.
The Daily Mail is a tabloid rather than a reliable source (in case the headline, ‘The troubling doubts over Foxy Knoxy’s role in Meredith Kercher’s murder’, didn’t give it away), and it clearly got the content for the summary article from Wikipedia. In contrast, the more reliable Sunday Times states instead that Meredith’s, Rudy’s, and Raffaele’s DNA were found. Keep in mind that, as of the day before the Daily Mail summary story you mention, there was no media report (even in tabloids) of the three unidentified people; it seems likely the Daily Mail pulled it from either the Friends of Amanda site or the prior day’s Wikipedia, which has the (non-cited!) claim.
Again, empirically, “:s/Knox is guilty/Knox is innocent/g” helps even more.
Unless people think that “voting up comments you agree with and voting down things you disagree with” only happens on other sites, in which case I’m curious by what mechanism you think this is enforced on this site.
As always, feel free to share your opinion on the matter.
Would you have preferred that neither my post nor komponisto’s “the amanda knox test” were top-level posts, but that we had just both posted them as comments to the original “You Be The Jury” post?
If your thesis is that debunking the content of a featured post in this forum is not on-topic for this forum, then I personally disagree. If someone posts false information as a featured post, then I personally would prefer to be informed that it is false rather than continue believing false information. There are probably other readers who feel the same, and I hope this post provided such a service to them.
Sounds reasonable. I wonder if there should be more survey-style posts then, but on topics that will have verifiable outcomes. For example, one could pick out a topic from one of the prediction markets and discuss that. This would have the advantage that, at the end of the day, if someone came to the wrong conclusion, they would eventually realize it and have an opportunity to learn something from the exercise.