I completely agree that debugging processes and confirmation biases interact (and I’ve seen the effect often enough in practice to expect that my first three hypotheses about nearly any aspect of a giant data set will probably be wrong in ways I could detect if I checked), but for the sake of debiasing our thinking about debiasing techniques, I’m curious what people think about factors we should expect to help rather than hurt.
For example, I sometimes build data pipelines where a few gigabytes are stored at each stage, with less than a hundred lines of code between stages. If I test the data at each stage in fairly simple ways (exploring outliers, checking for division by zero, etc.), then I gain additional “indirect code verification”: despite the vast space of possible output data, with something like 2^10^10 degrees of freedom, my scripts are producing nothing that violates a set of filters that would certainly be triggered by noise and would “probably” (craft, domain knowledge, and art certainly go into this estimate) be triggered by bugs.
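To make that concrete, here is a minimal sketch of the kind of stage-level filter I have in mind, assuming the intermediate data can be loaded as a pandas DataFrame; the column names, the bounds, and the check_stage helper are all invented for illustration, not from any particular pipeline of mine:

```python
import numpy as np
import pandas as pd

def check_stage(df: pd.DataFrame, stage_name: str, bounds: dict) -> list:
    """Run cheap sanity filters on one pipeline stage's output.

    `bounds` maps column name -> (low, high) range that domain knowledge
    says the values should fall inside. Returns a list of violation
    messages; an empty list means this stage passed every filter.
    """
    problems = []

    # Divide-by-zero and overflow residue shows up as inf/NaN.
    numeric = df.select_dtypes(include=[np.number])
    if np.isinf(numeric.to_numpy()).any():
        problems.append(f"{stage_name}: infinite values present")
    if numeric.isna().any().any():
        problems.append(f"{stage_name}: NaNs present")

    # Crude outlier screen: values outside domain-motivated bounds.
    for col, (low, high) in bounds.items():
        bad = df[(df[col] < low) | (df[col] > high)]
        if not bad.empty:
            problems.append(
                f"{stage_name}: {len(bad)} rows of '{col}' outside [{low}, {high}]"
            )

    return problems

# Hypothetical usage between two pipeline stages.
stage2 = pd.DataFrame({"ratio": [0.2, 0.9, 1.1], "count": [10, 0, 3]})
for msg in check_stage(stage2, "stage2", {"ratio": (0.0, 1.0), "count": (0, 1e6)}):
    print("FILTER TRIGGERED:", msg)
```

None of these checks verify the code directly; the point is just that every filter a buggy script could plausibly trip, but doesn’t, is weak evidence that the hundred lines upstream are doing what I think they are.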
A summary of these experiences would be something like “the larger the data set, the less likely that bugs go undetected”. And programmable computers do seem to enable giant data sets… so…
I’m not sure I have enough information to say whether the expected rate of bug-based confirmations should be 20% or 90% when certain kinds of software or coding techniques are involved, but it seems like there can be other “big picture factors” that would reduce biases.
Do you agree that larger data sets create more room to indirectly verify code? Can you think of other factors that would also push scientific programming in a positive direction?