How well did we measure our standard deviation?

Problem

Higher signal-to-noise ratios spur faster product development, so exactly how noisy is our process? The more accurately we can answer that critical question, the more accurately we can predict how often our manufacturing process will make out-of-spec materials when we scale it up, or how many independent replicates we will need to run through our noisy measurement process to detect a 10% improvement.

This much is obvious: the more times we replicate a process, the better we will understand its true underlying distribution. But here’s what’s not obvious:

  • how dreadfully our sample standard deviation estimates the true value when N = 2 or 3
  • how rapidly that estimate improves with the 4th, 5th and 6th replicates
  • how slowly that estimate improves with the 7th, 8th and 9th replicates

Solution

We can measure how well we have measured our process standard deviation using the confidence intervals below. If the uncertainty in our current estimate of process variation is unacceptably large, we can gather more data with the current process or try to figure out the underlying physical, chemical or biological mechanisms driving its variability.

The interactive visualization above plots up to 3 pairs of relative confidence intervals for estimating standard deviation as a function of the number of replicates (N). By default, the 95%, 80% and 50% confidence intervals are shown, but other confidence intervals can be specified. Use the radio buttons, slider and toggle to calculate the uncertainty in our estimates of process variation. Rolling over the underlined text within the text box above will highlight the corresponding inputs and/or graph elements.

Although the illustrated confidence intervals are centered, they are not symmetric. The true standard deviation is equally likely to lie above or below each pair of curves, yet the upper curve sits much farther from the measured value than the lower one: because the sampling distribution of the sample standard deviation is skewed, the true value is more likely to be greater than its measured value.
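
For readers who want to reproduce the curves, here is a minimal Python sketch, assuming scipy is available (the post itself relies on the interactive chart rather than code), that computes the pair of relative bounds for any N and confidence level:

```python
# Relative confidence bounds for a standard deviation estimated from N replicates.
# Assumes scipy is available; this is a sketch, not the code behind the chart.
from scipy.stats import chi2

def relative_sigma_bounds(n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Return (lower, upper) multipliers on the measured s such that the
    true sigma lies in [lower * s, upper * s] with the given confidence."""
    df = n - 1                       # N replicates give N - 1 degrees of freedom
    alpha = 1.0 - confidence
    # (df * s^2) / sigma^2 follows a chi-squared(df) distribution,
    # so invert the central interval to bound sigma relative to s:
    lower = (df / chi2.ppf(1.0 - alpha / 2.0, df)) ** 0.5
    upper = (df / chi2.ppf(alpha / 2.0, df)) ** 0.5
    return lower, upper

for n in (2, 3, 6, 10, 30):
    lo, hi = relative_sigma_bounds(n)
    print(f"N = {n:2d}: true sigma within {lo:.2f}x to {hi:.2f}x of measured s (95% CI)")
```

Running it makes the asymmetry concrete: at N = 2 the 95% interval stretches from roughly 0.45X to roughly 32X the measured value, which is why the insights below warn so strongly against tiny sample sizes.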

Insights

  • There’s no way around it: the only way to measure the variability of our process is to repeat it. Never ever use sample size < 6 for this purpose (i.e., never use < 5 degrees of freedom). If replicates are hard to come by, use the visualization above or the equations below to estimate what we can afford to do and the costs of being wrong.
  • That said, we can get just as good an estimate of our underlying process variation by repeating each of 10 conditions 3 times as we can by running a single condition 21 times (both designs carry 20 degrees of freedom for the pooled estimate) – as long as we can justify the assumption that what’s driving the variability of our process acts with a similar magnitude on all 10 conditions.
  • When we replicate a handful of conditions across all 96 wells of a microwell plate, there’s still a >10% chance the true process variation is >1.10X its measured value for any one plate.
  • When we measure the reproducibility of our processes using 2 different instruments or 3 different operators (think “Gauge R&R”, “assay qualification” or “assay validation” studies), those estimates are uncertain to a degree that corresponds to the N = 2 and 3 slices of the chart above. In other words, if we run this kind of classic experiment and measure a commercially significant difference between instruments or operators, we should consider ourselves lucky! It’s too easy for these expensive experiments to lull us into a false sense of security. Don’t let them substitute for doing mechanism-oriented DOEs during method development and maintaining control charts throughout operations.
  • When we perform triplicates we often think, “If two measurements are close together and the third is far apart, we can improve our estimate of the true value by excluding that outlier from our calculation of the mean.” But there’s roughly a 25% chance that any two points sampled from a normal distribution will have a sample standard deviation <0.3X the true value (the sketch after this list checks this number)! To judge whether one of three replicates is truly an outlier, rely instead on estimates of process variation gathered from recent experiments that tested conditions where we’d expect a similar degree of variability.
  • No process is normally distributed until we make it normally distributed! If we are operating with small N, the chances are high that our true process variation is even larger than indicated by the upper confidence limits (UCLs) on the chart above. To characterize the non-normality or stability of our processes (e.g., using a control chart), we should collect data sets with at least 30 degrees of freedom.
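
Several of the numbers quoted in these bullets can be spot-checked directly from the chi-squared sampling distribution of the sample variance. A minimal Python sketch, again assuming scipy, and assuming 6 conditions on the 96-well plate purely for illustration:

```python
# Spot-checks of the quantitative claims above (assumes scipy is available).
from scipy.stats import chi2

# Claim: two points from a normal distribution have roughly a 25% chance of
# yielding a sample standard deviation below 0.3x the true value.
# With N = 2 (df = 1): P(s < 0.3 sigma) = P(chi2_1 < 0.3^2).
p_two_points = chi2.cdf(0.3 ** 2, df=1)
print(f"P(s < 0.3 sigma | N = 2) = {p_two_points:.3f}")    # ~0.24

# Claim: a handful of conditions replicated across a 96-well plate still
# leaves a >10% chance the true sigma exceeds 1.10x its measured value.
# P(sigma > 1.1 s) = P(chi2_df < df / 1.1^2).
df_plate = 90                      # e.g., 96 wells minus 6 conditions (assumed)
p_plate = chi2.cdf(df_plate / 1.1 ** 2, df=df_plate)
print(f"P(sigma > 1.1 s | df = 90) = {p_plate:.3f}")       # ~0.12

# Claim: 10 conditions x 3 replicates pools to the same precision as one
# condition x 21 replicates -- both carry 20 degrees of freedom.
df_pooled = 10 * (3 - 1)
df_single = 21 - 1
print(f"pooled df = {df_pooled}, single-condition df = {df_single}")
```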

Equations
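
If $s$ is the standard deviation measured from $N$ replicates of a normally distributed process, then $(N-1)s^2/\sigma^2$ follows a chi-squared distribution with $N-1$ degrees of freedom, and the centered $(1-\alpha)$ confidence interval for the true $\sigma$ is:

$$
\sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\;N-1}}} \;\leq\; \sigma \;\leq\; \sqrt{\frac{(N-1)\,s^{2}}{\chi^{2}_{\alpha/2,\;N-1}}}
$$

Dividing both limits by $s$ gives the relative multipliers plotted as pairs of curves in the visualization.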

Interactive visualization was created by Chris @ engaging-data.com.
