# Question about application of Bayes

I have successfully confused myself about probability again.

I am debugging an intermittent crash; it doesn’t happen every time I run the program. After much confusion I believe I have traced the problem to a specific line (activating my debug logger, as it happens; irony...). I have tested my program with and without this line commented out. I find that, when the line is active, I get two crashes on seven runs. Without the line, I get no crashes on ten runs. Intuitively this seems like evidence in favour of the hypothesis that the line is causing the crash. But I’m confused on how to set up the equations. Do I need a probability distribution over crash frequencies? That was the solution the last time I was confused over Bayes, but I don’t understand what it means to say “The probability of having the line, given crash frequency f”, which it seems I need to know to calculate a new probability distribution.

I’m going to go with my intuition and code on the assumption that the debug logger should be activated much later in the program to avoid a race condition, but I’d like to understand this math.

• Is this program written in C or C++, by any chance? In that case, you can’t conclude anything from probabilistic data—because you’re probably hunting memory corruption, commenting out lines affects memory layout, and memory layout affects the symptoms of memory corruption. Actually, this problem is pretty general; anything that looks like a probabilistic software crash needs a source of entropy to randomize it, and usually this “entropy” will be in the form of a fragile dependency on timing or memory arrangement which interacts with attempts to measure it in impossibly confusing ways. (This is why using C and C++ is usually a terrible idea, and why threads are so feared.)

• (This is why using C and C++ is usually a terrible idea, and why threads are so feared.)

I was with you up to this parenthetical. The same problems affect every other programming language, though perhaps with different root causes. Race conditions are hardly language-dependent.

• The probability of getting a race condition, given that you’re programming in Haskell, is orders of magnitude lower than the probability of getting a race condition, given that you’re programming in C or C++.

• That may well be true. I do have some other constraints, though. For example, does Haskell have a well-supported GUI library that works on Windows and talks to OpenGL?

• Haskell requires bending your brain in ways that are often uncomfortable to non-mathematicians. I, personally, use Python, which strikes a balance between race avoidance and library support. I meant more to argue that race conditions are language-dependent than to recommend Haskell as a perfect programming language for any purpose.

• Not different root causes, fewer root causes. Threading problems and race conditions exist in all languages, but memory corruption does not; the majority of languages won’t let you do things like write out of bounds or use after free.

• Yes, memory corruption in particular is no longer an issue in most higher languages, but memory management in general has become more complicated and less transparent. The Law of Leaky Abstractions ruins everything.

In any case this is reasonably off-topic for this post, so I won’t belabor the point further.

• Yes, it’s C++. I’m reasonably convinced the problem was a race condition in initialising the GUI; it appears that the debug logger would try to write to a text field that might or might not exist, probably because a different thread was creating it. (I don’t have full control of the threading since I’m using Qt for my GUI stuff, so exactly when my code is called is unfortunately a bit black-box-ish.) I believe I resolved the issue by introducing a special debug stream for use during startup, which doesn’t try to write to the GUI but instead writes to a log file.

• Please note that there are tools to track such memory corruption issues, like Valgrind. Using C and C++ isn’t such a bad idea in itself (for some tasks, it is arguably the most appropriate language), but using them without using the related tools to track memory corruption is.

• In a Unix environment that would have been my first thought. However this is Windows. (Yes, yes, C++ on Windows, my geek penis just shrank two inches, whatever.) As far as I could tell from a quick Google, Valgrind isn’t supported on Windows. (I’ll be delighted if anyone tells me this is wrong.)

gdb is supported on Windows, and I did use that; but the program obstinately refused to crash in the debugger, presumably because the timing changed. Race conditions for the win…

• Well, the problem is that you have uncertainty over the probability of the code crashing with or without fixing the particular bug you’re looking for. What you need, in order to apply Bayes, is:

• A prior model ‘pr(f)’ of the frequency of the code crashing with the bug present, which is a distribution over crash frequencies. 1/(f(1-f)) is an (improper) ignorance prior that might do the job.

• I’d assume the probability of the code crashing with the bug fixed is 0, even if that’s not strictly true (since there could be other bugs).

• A prior probability ‘b’ of the bug being on that line of code. This could depend (for instance) on how you identified that particular line to comment out in the first place. I’ll call your evidence from running the program E1 (2 crashes out of 7 runs) and E2 (10 runs, no crashes).

P(bug on that line | E1, E2) = P(E1, E2 | bug on line) P(bug on that line) / P(E1, E2)

P(bug on that line) = b

P(E1, E2) = P(E1, E2 | bug on line) b + P(E1, E2 | bug elsewhere) (1-b)

P(E1, E2 | bug on line) = P(E1)P(E2 | bug on line) (since the 2 crashes out of 7 are independent of the bug location)

P(E1, E2 | bug elsewhere) = P(E1)P(E2 | bug elsewhere) (as above)

P(E2 | bug on line) = 1

Given a frequency of crashing ‘f’, P(E2 | f, bug elsewhere) = (1-f)^10

P(E1|f) = f^2 (1-f)^5

So, then you need to integrate over all possible values of ‘f’: P(E1) = Integral over [0,1] of: P(E1|f) pr(f) df

P(E2 | bug elsewhere) = Integral over [0,1] of: P(E2 | f, bug elsewhere) pr(f) df

That’s everything you need; the rest is just picking those priors and integrating back up the line. Of course the results are only as good as the priors. A much easier solution is: “The chance of a crash appears to be about 2/7. The chance of getting 10 non-crashes is (5/7)^10 ~= 3.5%.” Note that this is not the same as the above; it’s an approximation, but it’s probably going to be just as good as doing it the hard way.
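For what it’s worth, these integrals can be sketched numerically. One caveat: the 1/(f(1-f)) prior suggested above is improper and makes P(E2 | bug elsewhere) diverge, so this sketch substitutes the proper Jeffreys prior beta(1/2, 1/2); the prior probability b = 0.5 of the bug being on that line is also an assumption.

```python
import math

def log_beta(a, b):
    # log of the Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

# Assumed prior over the crash frequency f: Jeffreys prior beta(1/2, 1/2),
# substituted for the improper 1/(f(1-f)) prior.
a0 = b0 = 0.5

# P(E2 | bug elsewhere) = integral over [0,1] of (1-f)^10 pr(f) df
p_e2_elsewhere = math.exp(log_beta(a0, b0 + 10) - log_beta(a0, b0))

# P(E2 | bug on line) = 1, and P(E1) is the same under both hypotheses,
# so it cancels out of the posterior:
b = 0.5  # assumed prior probability that the bug is on that line
posterior = b * 1.0 / (b * 1.0 + (1 - b) * p_e2_elsewhere)

print(f"P(E2 | bug elsewhere) = {p_e2_elsewhere:.4f}")
print(f"P(bug on that line | E1, E2) = {posterior:.4f}")
```

Under these assumed priors, ten clean runs move the line hypothesis from 0.5 to roughly 0.85, which is noticeably weaker than the (5/7)^10 approximation suggests, because the prior puts substantial weight on small crash frequencies.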

Incidentally, you need to be aware that, particularly with intermittent bugs, just because commenting out a line stops the crash (even when you’re 100% sure of the correlation), that doesn’t mean the line itself is the problem. Bugs can be absolutely pathological. For example, if the problem is, say, freeing the same memory twice, then any line that performs a lot of memory allocations will increase the frequency of crashes even if the problem is actually earlier in the code. Also, if the bug is overrunning the end of an array, taking out any line can have a chaotic effect on the optimiser, moving the relative locations of things in memory around and causing the bug to disappear without fixing it (only for it to reappear later). On a simpler level, taking a line out might change the execution path, avoiding the bug without fixing it. There seems to be no end to the ways in which impossible-seeming things can happen in computer code.

• Bugs can be absolutely pathological.

This is very true. I simplified the information a bit because I was posting about the math as a matter of intellectual curiosity, not to get help debugging. I have a model of what was causing the crash that I find reasonably convincing, which I outlined in my response to jimrandomh, below. So, while it’s a real-world problem, for purposes of math we can assume that the effect of commenting out the line is an indication of a point bug, as it were. I found the Bayes confusing anyway, so there’s no need to complexify further. :)

Thanks for the math answer; I need to think about it carefully to absorb it fully, but I thought I’d respond to the programming answer first.

• If you are talking about what “causes” what, maybe you should think about the problem causally first, not as a standard hypothesis testing problem (although hypothesis testing may reappear later). What does it mean for a line to be a cause of bad behavior? What is “a causal effect”? Often people formalize causal effect as a difference of means between the “control group” and a “test group.” In your case the control group is the original program, and the test group is the program where you intervened to comment out the offending line, say line 500.

You have program output O, which is either good (o1) or bad (o2). You have the original program, where nothing was done to it, where you sometimes get crashes: P(O = o2) > 0. You also have an altered program, where you intervened to comment out line 500, say, where P(O = o2 | do(line500 = noop)) = 0.

The statistic you want is, for example, E(O) - E(O | do(line500 = noop)). If this statistic is not zero, we say line 500 has a causal effect on the crash.

Since you can just intervene directly in your system, you can just gather enough samples of this statistic to figure out whether there is a causal effect or not. In systems that are not computer programs, people often cannot intervene directly, and so resort to “trickery” to get statistics like the above mean difference.
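As a toy sketch of that statistic (every number here is made up for illustration, including the 0.3 crash rate and the name line500_active):

```python
import random

random.seed(0)

def run_program(line500_active):
    # Toy stand-in for one run: O = 1 for a crash (o2), 0 for a good run (o1).
    # Assumed crash probability 0.3 with the line active, 0 without it.
    crash_prob = 0.3 if line500_active else 0.0
    return 1 if random.random() < crash_prob else 0

n = 10_000
# E(O): observational runs of the original program
e_o = sum(run_program(line500_active=True) for _ in range(n)) / n
# E(O | do(line500 = noop)): runs after intervening on the line
e_o_do = sum(run_program(line500_active=False) for _ in range(n)) / n

effect = e_o - e_o_do  # nonzero => line 500 has a causal effect on the crash
print(f"estimated causal effect: {effect:.3f}")
```

With enough runs the estimate converges on the assumed 0.3 difference; a Heisenbug corresponds to crash_prob itself shifting when you instrument the program.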

• If this seems simple, that’s because it is. This setup mirrors how people actually debug—they intervene in systems and compare with “the test group,” sometimes doing multiple runs if the bug is a “Heisenbug.”

There is also the issue of whether you can really treat the outputs of repeated program runs as iid samples. Sometimes you can, often you cannot, as other posters pointed out.

• If the example is intended to be concrete rather than abstract… First understand what that line does. Shotgun debugging is seldom a good idea.

• Sure. Although I was inspired to think about this Bayesian problem by an issue I was having with coding, I’m not asking about the coding problem, which I have got under control. I’m asking about the math. So it’s an abstracted version of a problem I was concretely having, but the concrete problem is solved.

• The first step is to be clear about what you’re asking and what you’re trying to accomplish.

A whole bunch of things could be said to “cause” the crash, such as the power supplied to the system. You want a program that does what you want, and doesn’t crash. You could immediately exit the program and likely always avoid a crash. The lack of such an exit could be said to be “the cause” of the crash. But what good does identifying such a cause do you?

A cause isn’t necessarily “the thing that needs to be changed”, though people often treat it that way.

Do I need a probability distribution over crash frequencies?

I think that with the data you have, some ignorance prior, and an independence assumption over crash trials (which I think is likely a false assumption), you have Bernoulli trials and can assign a probability distribution to each of the two states—with and without the line.

But I don’t know what good those distributions do you.

Why are you assigning probability distributions instead of doing more debugging? What do you expect to do with those distributions? What does the line do? Why not just remove it?

Basically, what’s the problem you’re trying to solve?

• I’d recommend using the beta distribution, with the Jeffreys prior (beta(1/2, 1/2)), though I don’t fully understand it.

I’ve only ever used it to figure out the probability of getting heads on the next step, but you can just multiply these together to get the probability of the sequence. So it’s 0.5/1 × 1.5/2 × 2.5/3 × 3.5/4 × 4.5/5 × 4.5/6 × 4.5/7 if that is where the bug is (since it passed five times and failed twice, or might as well have since order doesn’t really matter, and clearly it will pass the next ten), and 0.5/1 × 1.5/2 × 2.5/3 × 3.5/4 × 4.5/5 × 4.5/6 × 4.5/7 × 5.5/8 × 6.5/9 × ... × 14.5/17 if you didn’t find it. The first seven terms cancel, so you just get that the data is 5.5/8 × ... × 14.5/17 = (14.5!/4.5!)/(17!/7!) ≈ 0.0906 times as likely if there’s no bug on that line.
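The final product is easy to check numerically; this one-liner just re-multiplies the uncancelled terms 5.5/8 through 14.5/17:

```python
from math import prod

# Terms 5.5/8, 6.5/9, ..., 14.5/17 that survive the cancellation
ratio = prod((k + 0.5) / (k + 3) for k in range(5, 15))
print(f"{ratio:.4f}")
```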

• I’m afraid I don’t quite see how to apply this to the problem. The beta distribution is presumably a probability, but what is it a probability of? Is there an interpretation to its two parameters that I’m not seeing?

• It’s a probability distribution of probabilities. You don’t know how likely it is that the program crashes given that there’s no bug. You just know that it crashed two out of seven times you ran it. If you start with a beta distribution for how likely it is to have different probabilities of crashing, you’ll have an easy-to-calculate beta distribution for how likely it is after, and it’s easy to calculate exactly how likely it is to crash given that distribution of probabilities.

• Let me see if I understand what you are saying. You seem to be analogising the problem to detecting whether a coin or die is biased? That is, a biased die has some frequency of coming up 6 which is not one-sixth; and you are proposing beta(0.5, 0.5) as my prior distribution for that frequency. I roll the die 100 times, getting some number of sixes; for each point in the [0, 1] space of possible frequencies, I do Bayes. This presumably concentrates my probability around some particular frequency, and if that frequency is different from 0.16667 I say the die is biased.

Now, the analogy is that commenting out the line is equivalent to removing a piece of gum from the die. So I perform the test twice, once with and once without the gum; and if the results differ then I say that the gum changed the bias, and if the un-gummed die is fair then the gum caused the bias. Or, going back to the program, the mean of my probability distribution for the crash frequency in the presence of the line may be high relative to the mean without the line, and that’s what we mean by “We think this line is causing the crash”. Right?

So, if I understand correctly, you are proposing the same math as gjm did below, but suggesting a specific prior

P(p, q) = beta(p; 0.5, 0.5) beta(q; 0.5, 0.5).

Is that correct?

• No. My way was assuming that it either crashes exclusively because of that line or exclusively because of something else. Furthermore, I only gave priors conditional on which of the two it’s doing.

Let A = that line is the cause of the crash.

Let q != x mean that this holds for any value of q besides x.

P(p, 0 | A) = beta(0.5, 0.5)(p)

P(p, q != 0 | A) = 0

P(p, p | !A) = beta(0.5, 0.5)(p)

P(p, q != p | !A) = 0

• You need a probability distribution over something like pairs (crash probability with this line, crash probability without this line). So the likelihood for (p, q) is Pr(2 crashes of 7 with the line | p) Pr(0 crashes of 10 without the line | q) = 21 p^2 (1-p)^5 (1-q)^10, and presumably you’ve got a prior that prefers small p, small q, and p ~= q.

Then you could look deeper into the kind of thing that might be going on and consider hypotheses like “there’s a race condition that happens early in the program when debug logging is active, because of the extra time taken writing to the log”, but it’s going to be really tough assigning the relevant conditional probabilities. In practice, if a lot of your probability in the initial (p, q) thing ends up in regions where p > q by a substantial margin, it’s probably reasonable to say that adding the line causes the program to misbehave.

• So, if I understand correctly, my posterior probability is a distribution over the (p, q) space; and presumably, the particular data I’ve got becomes more likely as p approaches 2/7 and q approaches 0. So if I had, say, a uniform prior, or a prior that was uniform in the lower-left corner where both p and q are small, then I’d move towards zero q and nonzero p, or in other words “The line causes the crash”. That seems to make sense.
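A sketch of that calculation (the uniform prior on (p, q) and the grid resolution are assumptions): discretise the square, weight each cell by the likelihood 21 p^2 (1-p)^5 (1-q)^10, and measure how much posterior mass lands where p > q.

```python
# Posterior over (p, q) on a midpoint grid under a uniform prior.
# Likelihood: 21 p^2 (1-p)^5 for 2 crashes of 7 with the line,
# (1-q)^10 for 0 crashes of 10 without it.
N = 400
grid = [(i + 0.5) / N for i in range(N)]

total = 0.0
mass_p_gt_q = 0.0
for p in grid:
    like_p = 21 * p**2 * (1 - p) ** 5
    for q in grid:
        w = like_p * (1 - q) ** 10
        total += w
        if p > q:
            mass_p_gt_q += w

print(f"posterior P(p > q) = {mass_p_gt_q / total:.3f}")
```

Most of the posterior mass ends up with p > q, matching the intuition that the data favour “the line causes the crash”.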

• As a relatively new member of this community I was wondering what implications you could foresee because of this:

http://understandinguncertainty.org/court-appeal-bans-bayesian-probability-and-sherlock-holmes

I know this might not be the right place to post, but Karma disallowed me from posting anywhere else, and I was hoping to get some thoughts on this.

• I would kind of strongly suggest that a months-old thread is not the right place for questions about current events; if nothing else, nobody but me is likely to see the discussion. Additionally, although both are in some sense tagged ‘Bayes’, the discussions are very different. I suggest you take this to the most recent open thread.

• Can you suggest where a Karma-less (new) account can post, then? Everything I post ends up as a draft.

• Clearly you can post comments in threads, right? We have a biweekly open thread; dig down a bit in discussion, or wait for the one that’ll be posted around March 1st. Alternatively, pick a post on the front page, not this ancient one. I only saw this because I get notifications of responses to my threads; nobody else is going to be seeing anything this deeply buried. You may still get some objections to off-topicness, but at least you won’t be thread-necromancing in addition.

• Do I need a probability distribution over crash frequencies? That was the solution the last time I was confused over Bayes, but I don’t understand what it means to say “The probability of having the line, given crash frequency f”, which it seems I need to know to calculate a new probability distribution.

What you want to find is P(line causes crash | observed crashes)

Let’s call:

L: “line is cause of crash”
R: “crash is random” (so basically R = ~L)
O: “observed crash distribution”

p(L|O) = p(O|L)p(L)/p(O) = p(O|L)p(L)/(p(O|L)p(L) + p(O|R)p(R))

If your prior is that L and R are equally likely, this becomes

p(L|O) = 1/(1 + p(O|R)/p(O|L))

… so you just need the ratio between p(O|R) and p(O|L).

To get that, you may want to compare two models: one where the crash occurs randomly with probability pr, one where the crash occurs only when the line is present, with probability pl (pr and pl with some simple prior distribution like the Jeffreys prior, as DanielLC recommends).
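A sketch of that comparison, with Jeffreys priors on both pr and pl assumed, and treating O as the specific observed sequence (2 crashes in the 7 runs with the line, 0 in the 10 without):

```python
import math

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

a0 = b0 = 0.5  # Jeffreys prior beta(1/2, 1/2) on both pr and pl

# p(O|R): crashes happen at random with rate pr in all 17 runs,
# so the observed sequence has 2 crashes and 15 non-crashes.
log_p_o_r = log_beta(a0 + 2, b0 + 15) - log_beta(a0, b0)

# p(O|L): crashes happen only when the line is present, at rate pl;
# the 7 runs with the line give 2 crashes and 5 non-crashes, and the
# 10 line-free runs are crash-free with probability 1.
log_p_o_l = log_beta(a0 + 2, b0 + 5) - log_beta(a0, b0)

ratio = math.exp(log_p_o_r - log_p_o_l)  # p(O|R) / p(O|L)
p_l_given_o = 1 / (1 + ratio)            # with equal prior odds on L and R

print(f"p(O|R)/p(O|L) = {ratio:.4f}")
print(f"p(L|O) = {p_l_given_o:.4f}")
```

The ratio reproduces the 0.0906 figure DanielLC computed from the beta-distribution products, so with even prior odds the line hypothesis ends up at roughly 0.92.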