Underspecified Probabilities: A Thought Experiment

(Inspired by https://stats.stackexchange.com/questions/175153/approximating-pa-b-c-using-pa-b-pa-c-pb-c-and-pa-pb-pc)

Imagine you’re working at an electronics manufacturing company. There’s a type of component your company needs for an upcoming production process, which can either be produced in-house or sourced from a vendor. Your job is to decide which. Your team has been budgeted $2,000,000 to procure as many working components as possible. You won’t know whether a part works until the downstream production process is up and running later in the year, but you need to make the procurement decision now. Let’s say your company agrees to pay you a bonus next year proportional to how many parts turned out to be functional, and your personal utility function is linear in this bonus and doesn’t depend on anything else.

Using the in-house machine, your $2,000,000 budget will get you 1,000,000 parts which might or might not work. A part will work if and only if it has dimples on the bismuth layer (event A), it has an osmium layer between 40 and 60 microns thick (event B), and the misalignment between the polonium and tungsten layers is less than 20 microns (event C). Thus the expected number of working parts is 1,000,000 · P(A ∩ B ∩ C). The only way to determine whether a part satisfies properties A, B, and C is by doing destructive testing: you have to physically cut it up. Because of the way this cutting works, you can’t measure C if you’ve already measured A and B on a part, you can’t measure B if you’ve already measured A and C on a part, and you can’t measure A if you’ve already measured B and C on a part. The only way to get data on A, B, and C together would be to send a part downstream into production, but the production process doesn’t exist yet.

Through destructive testing, you’ve been able to determine that P(A) = 0.55, P(B) = 0.4, P(C) = 0.65, P(A ∩ B) = 0.25, P(A ∩ C) = 0.35, and P(B ∩ C) = 0.2 (you’ve tested such a truly unfathomable number of parts that you’ve gotten the uncertainty on these numbers down arbitrarily small).[1] Otherwise, the machine you’re using to create these parts is a black box. The company that sold you the machine has since gone defunct and you can’t find anyone else with prior experience working with this machine.
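These six numbers do not pin down P(A ∩ B ∩ C), but they do constrain it. Writing each of the eight cells of the joint distribution over (A, B, C) in terms of the single unknown t = P(A ∩ B ∩ C) via inclusion–exclusion, and requiring every cell to be nonnegative, yields a feasible interval for t. A quick Python sketch (the probability values are the ones measured above):

```python
# Measured marginals from destructive testing.
pA, pB, pC = 0.55, 0.40, 0.65
pAB, pAC, pBC = 0.25, 0.35, 0.20

def cells(t):
    """The 8 joint probabilities over (A, B, C) as a function of
    t = P(A & B & C), via inclusion-exclusion."""
    return [
        t,                                       # A=T, B=T, C=T
        pAB - t,                                 # T, T, F
        pAC - t,                                 # T, F, T
        pA - pAB - pAC + t,                      # T, F, F
        pBC - t,                                 # F, T, T
        pB - pAB - pBC + t,                      # F, T, F
        pC - pAC - pBC + t,                      # F, F, T
        1 - pA - pB - pC + pAB + pAC + pBC - t,  # F, F, F
    ]

# Cells with slope +1 in t force lower bounds on t; slope -1, upper bounds.
lower = max(0.0, pAB + pAC - pA, pAB + pBC - pB, pAC + pBC - pC)
upper = min(pAB, pAC, pBC, 1 - pA - pB - pC + pAB + pAC + pBC)
print(lower, upper)  # t is only constrained to roughly [0.05, 0.2]
```

So the expected in-house yield, 1,000,000 · t, can be anywhere from 50,000 to 200,000 working parts: in-house dominates whenever N < 50,000, the vendor dominates whenever N > 200,000, and in between the answer depends on how you resolve the remaining degree of freedom.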

On the other hand, for $2,000,000 the external vendor is willing to give you N parts for some specified N. These parts have been used in many other production processes and are known to never, ever fail.

If N = 50,000, you should probably go with the in-house approach. If N = 200,000, you should probably source externally. What is the value of N below which you stay in-house but above which you go with the vendor?[2]


I’m working on a project where I’m going to need to estimate probabilities of combinations of events like this with incomplete information. I asked some friends about my problem and they said it was ill-posed in the abstract, so I wanted to create a real-world thought experiment where you can’t wave the problem away with “There is no proper answer.” If you were in this position, you’d have to actually make a choice.

This problem reminds me of the discussion around the presumption of independence. I think that a good philosophical justification for assuming independence in the absence of evidence to the contrary should generalize to a method for finding the “most independent” assignment of probabilities in settings like this one, where we know that actual independence is out of the question.
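One concrete candidate for “most independent” is maximum entropy: among all joint distributions consistent with the measured probabilities, pick the one with the largest Shannon entropy. Here the joint pmf over (A, B, C) reduces, by inclusion–exclusion, to a single free parameter t = P(A ∩ B ∩ C), so a grid scan suffices. A sketch in pure Python, using the numbers from the thought experiment:

```python
import math

# Measured marginals from the thought experiment.
pA, pB, pC = 0.55, 0.40, 0.65
pAB, pAC, pBC = 0.25, 0.35, 0.20

def cells(t):
    # Joint pmf over truth assignments to (A, B, C), by inclusion-exclusion.
    return [t, pAB - t, pAC - t, pA - pAB - pAC + t,
            pBC - t, pB - pAB - pBC + t, pC - pAC - pBC + t,
            1 - pA - pB - pC + pAB + pAC + pBC - t]

def entropy(t):
    return -sum(p * math.log(p) for p in cells(t) if p > 0)

# Scan the feasible range of t (where every cell is nonnegative).
ts = [0.05 + k * 1e-4 for k in range(1501)]
t_star = max(ts, key=entropy)
print(t_star)  # ~0.125: the maximum-entropy estimate of P(A & B & C)
```

Under this selection rule the expected in-house yield would be about 125,000 working parts, i.e. an indifference point around N = 125,000. Whether maximum entropy is the right way to operationalize “most independent” is exactly the question being raised here; note that it differs from the naive product P(A)P(B)P(C) = 0.143 and from the pairwise heuristic P(A∩B)P(A∩C)P(B∩C)/(P(A)P(B)P(C)) ≈ 0.122.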

  1. ^

    No pair of these events is independent (e.g., P(A)P(B) = 0.55 · 0.4 = 0.22 ≠ 0.25 = P(A ∩ B)). To see that there exist two (and, by taking convex linear combinations, infinitely many) possible values of P(A ∩ B ∩ C) given the provided information, note that there are two valid probability mass functions, p and q, on truth assignments to A, B, C which differ on their value of P(A ∩ B ∩ C):

    | A | B | C | p    | q    |
    |---|---|---|------|------|
    | T | T | T | 0.15 | 0.14 |
    | T | T | F | 0.10 | 0.11 |
    | T | F | T | 0.20 | 0.21 |
    | T | F | F | 0.10 | 0.09 |
    | F | T | T | 0.05 | 0.06 |
    | F | T | F | 0.10 | 0.09 |
    | F | F | T | 0.25 | 0.24 |
    | F | F | F | 0.05 | 0.06 |
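    A quick Python check that p and q agree on every singleton and pairwise probability while differing on the triple intersection (rows ordered as in the table):

    ```python
    from itertools import product

    # Truth assignments in row order: (A, B, C) from TTT down to FFF.
    rows = list(product([True, False], repeat=3))
    p = dict(zip(rows, [0.15, 0.10, 0.20, 0.10, 0.05, 0.10, 0.25, 0.05]))
    q = dict(zip(rows, [0.14, 0.11, 0.21, 0.09, 0.06, 0.09, 0.24, 0.06]))

    def prob(pmf, *events):
        # P(intersection of the given events); events indexed 0=A, 1=B, 2=C.
        return sum(w for r, w in pmf.items() if all(r[e] for e in events))

    for ev in [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]:
        assert abs(prob(p, *ev) - prob(q, *ev)) < 1e-12
    print(prob(p, 0, 1, 2), prob(q, 0, 1, 2))  # 0.15 vs 0.14
    ```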
  2. ^

    Technically there could be an interval of N values, of width up to 150,000, where you’re indifferent.