How do errors propagate through our KPI calculations?

Problem

Noise is everywhere, especially in the Key Performance Indicators (KPIs) we use to measure the progress we are making towards our corporate objectives. If it turns out that our KPI estimates are insufficiently repeatable or reproducible to detect the commercially significant signals in our experiments, that mismatch between the capability and the requirements of our KPI measurement process can slow our company’s progress to a crawl. Fortunately, even when a KPI is calculated by combining multiple measurements made using multiple sub-processes according to some complex formula, there’s usually one sub-process in particular that is driving most of the noise in our KPI.

To figure out which sub-process it is, it helps to understand a bit about error propagation. Many of us studied error propagation briefly in high school chemistry or a college laboratory class, then prayed error propagation would be somebody else’s job when we grew up. But now we’re grown up: if there’s no one doing these error propagation calculations for us, then it must be our job to do them ourselves. So what was it we were supposed to remember from those homework assignments on error propagation?

Solution

If you have forgotten that error propagation had something to do with partial derivatives or hand-held calculators, it’s OK to keep forgetting all of that. Instead, focus on internalizing the properties of error propagation that are summarized in this interactive data table and the insights below:

This table illustrates how errors in our estimates of individual means propagate into errors of the corresponding sums, differences, products and quotients that get rolled up into our complex KPI calculations. Whenever any one of the 4 values corresponding to the means and standard deviations of Processes 1 and 2 is changed, all the other values in the table update accordingly. The white cells on the right half of the table illustrate the equations used to calculate the values in the colored cells directly beneath them. Each row of the table has its own independent color scale that draws attention to its largest (red) and smallest (yellow) values.

The formulae in the table above illustrate the following:

  • When the sub-process errors are independent of one another, standard deviations and CVs never propagate by simple addition.
  • Instead, it’s the squares of the sub-process standard deviations – i.e., the sub-process variances – that are added together to calculate the variance of the corresponding sum or difference.
  • Similarly, it’s the squares of the sub-process CVs that are added together to calculate the square of the CV for the corresponding product or quotient (a first-order approximation that holds well when the CVs are small).
  • To estimate the standard deviation of a product or quotient it is necessary first to calculate its CV from the CVs of the individual sub-processes.
  • To estimate the CV for the corresponding sum or difference, it is necessary first to calculate its standard deviation from the standard deviations of the individual sub-processes.
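
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the formulas behind the interactive table: the function name propagate and the example numbers are hypothetical, both means are assumed positive, the two sub-processes are assumed independent, and the product and quotient rules are first-order approximations that work best when the CVs are small.

```python
import math


def propagate(mean1, sd1, mean2, sd2):
    """Propagate independent errors from two sub-processes (illustrative sketch).

    Returns {operation: {"mean", "sd", "cv"}} for the sum, difference,
    product and quotient. Assumes both means are positive.
    """
    cv1, cv2 = sd1 / mean1, sd2 / mean2

    # Sums and differences: variances add, so the SDs combine in quadrature.
    sd_additive = math.hypot(sd1, sd2)
    # Products and quotients: squared CVs add (first-order approximation).
    cv_multiplicative = math.hypot(cv1, cv2)

    results = {}
    for name, mean in (("sum", mean1 + mean2), ("difference", mean1 - mean2)):
        # The CV is derived from the SD; it is reported as infinite for a zero mean.
        cv = sd_additive / abs(mean) if mean != 0 else math.inf
        results[name] = {"mean": mean, "sd": sd_additive, "cv": cv}
    for name, mean in (("product", mean1 * mean2), ("quotient", mean1 / mean2)):
        # The SD is derived from the CV.
        results[name] = {"mean": mean, "sd": mean * cv_multiplicative,
                         "cv": cv_multiplicative}
    return results


# Hypothetical inputs: Process 1 = 100 +/- 3, Process 2 = 80 +/- 4.
for name, stats in propagate(100.0, 3.0, 80.0, 4.0).items():
    print(f"{name:>10}: mean={stats['mean']:.4g}  "
          f"sd={stats['sd']:.4g}  cv={stats['cv']:.2%}")
```

With these inputs the sum and the difference share the same standard deviation (5), yet the CV of the difference (5/20 = 25%) is nine times the CV of the sum (5/180, about 2.8%).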

Insights

  • Errors propagate quite differently depending on how we represent those errors, so we must take care to use the appropriate error metric for each of our processes. If we have observed that the standard deviations for Sub-processes 1 and 2 scale in direct proportion with their means, we should emphasize their CVs in our presentations and discourse. If instead we’ve observed that these standard deviations are independent of their means, we should take care never to speak about their CVs. If we don’t know whether or not our process standard deviations scale with their means, we ought to prioritize making those measurements right away!
  • The CV of our KPI can “blow up” if (1) some part of its calculation involves subtracting one noisy measurement from another and (2) their difference is comparable to the noise in either measurement. In order to prevent such “blow ups” from occurring, we must either reduce the CVs of the individual measurements or increase the signal between them (e.g., by lengthening the implicit time interval during which the observed difference in mass or concentration or other property has evolved). In a few cases we can escape this predicament if it turns out the standard deviation of our KPI, not its CV, is actually more relevant to the kind of techno-econometric analysis our companies use to measure technical risk and set corporate goals.
  • If our goal is to operate a two-step serial dilution process with an overall CV < 5%, we don’t need each of the independent dilution steps to have a CV below 5%/2 = 2.5%. Instead, each step only needs a CV below 5%/SQRT(2) ≈ 3.5%.
  • The more complicated our KPI calculations get, the more challenging it becomes to estimate how the individual errors get rolled up into a single KPI error by chaining the rules of thumb above. Statistical simulation is a relatively simple and straightforward alternative for modeling propagation of error in these more complex systems… Stay tuned for future articles on this subject.
  • The opposite of error propagation is a powerful data analytical method called variance decomposition… Stay tuned for future articles on this subject.
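
The difference “blow up” and the serial dilution budget described above can both be checked with a small simulation. The sketch below uses only the Python standard library and makes assumptions that are not in the article: the helper name simulate is hypothetical, each sub-process is modeled as an independent normal distribution, and the means, standard deviations and 1:10 dilution factor are made-up example values.

```python
import math
import random
import statistics


def simulate(kpi, sub_processes, n_draws=50_000, seed=0):
    """Monte Carlo estimate of the (mean, sd, cv) of a KPI.

    kpi           -- function combining one draw from each sub-process
    sub_processes -- list of (mean, sd) pairs, modeled here as independent
                     normal distributions (an assumption of this sketch)
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = [
        kpi(*(rng.gauss(mean, sd) for mean, sd in sub_processes))
        for _ in range(n_draws)
    ]
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean, sd, sd / abs(mean)


# Difference of two noisy measurements (100 +/- 3 and 95 +/- 4): the signal
# between them (5) is comparable to the noise, so the CV is close to 100%.
difference = simulate(lambda a, b: a - b, [(100.0, 3.0), (95.0, 4.0)])

# Two independent 1:10 dilution steps, each with a CV of 5%/sqrt(2).
step_cv = 0.05 / math.sqrt(2)
step = (0.1, 0.1 * step_cv)
dilution = simulate(lambda a, b: a * b, [step, step])

print(f"difference: sd={difference[1]:.2f}  cv={difference[2]:.1%}")
print(f"dilution:   cv={dilution[2]:.2%}")
```

The simulated CV of the two-step dilution should land close to 5%, and the CV of the difference close to 100%, in line with the rules of thumb. The same helper accepts any formula and any number of sub-processes, which is what makes simulation attractive once a KPI calculation becomes too tangled to chain by hand.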

The interactive visualization was created by Chris @ engaging-data.com.
