VALUE
IMPROVEMENT
LEADERS
744 words + 1 activity (+1 optional video)
33 min (3 to read, 30 on Excel)
PRINCIPLE
Standard deviation summarizes dispersion but can be misleading. Proceed with caution.
TOOLS
• Population Standard Deviation: =STDEV.P(data range)
• Sample Standard Deviation: =STDEV.S(data range)
• Numerous Excel statistical functions
APPLICATION
Explore your process’s data with Excel.
Previously, I urged the use of histograms and run charts when discussing continuous data because they’re much better than point estimates like mean and standard deviation. Today’s discussion is about the meaning of standard deviation and concludes…SPOILER ALERT…histograms and run charts are still superior.
This is Standard Deviation
I could end the discussion here, I suppose.
Last time, I said of standard deviation, “People don’t know what it is or how to use it.”
Well, Google knows: “Standard deviation is…
- “…a measure of how spread out the numbers are.” (Math is Fun)
- “…commonly used to understand whether a specific data point is ‘standard’ and expected, or unusual and unexpected.” (Jeremy Jones)
- “…an essential statistical tool for increasing quality.” (winspc.com)
- “…pushing obsolescence.” (Sam L. Savage)
Sam Savage wrote the book The Flaw of Averages and considers standard deviation about as relevant today as the steam engine when evaluating variation, but we’ll press on anyhow.
No Matter Your Unit of Measure, Standard Deviation is a Unit of Distance
Whether your continuous data is measured in days, dollars, white blood cells, or whatever, think of standard deviation as a length. It tells you how far your data is spread out from the arithmetic center (aka “mean”).
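If you would rather see the arithmetic than trust a spreadsheet cell, here is a minimal Python sketch of the two Excel functions listed under TOOLS. The six cycle times are made-up numbers for illustration only. As far as the arithmetic goes, `statistics.pstdev` divides by n like =STDEV.P, and `statistics.stdev` divides by n - 1 like =STDEV.S.

```python
import statistics

# Hypothetical cycle times in days (illustrative values, not real data)
data = [30, 33, 36, 36, 39, 42]

mean = statistics.mean(data)        # the arithmetic center
pop_sd = statistics.pstdev(data)    # population SD: divides by n, like =STDEV.P
sample_sd = statistics.stdev(data)  # sample SD: divides by n - 1, like =STDEV.S

print(f"mean = {mean}")
print(f"population SD = {pop_sd:.2f} days")
print(f"sample SD     = {sample_sd:.2f} days")
```

Note that both results come out in days, the same unit as the data, which is why it helps to think of standard deviation as a length. The sample version is always a little larger because it divides by a smaller number.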
If your data is normally distributed, about 68% of your data will fall within +/-1 standard deviation of the mean. Thus, if your normal data has a mean of 36 and a standard deviation of 6, it’s safe to estimate that ~68% of your data is between 30 and 42. About 95% of your data will be within +/-2 standard deviations (24-48) and ~99.7% will be within +/-3 standard deviations (18-54). Here’s all that visualized:
The average height of an American adult male is 5’10”, with a standard deviation of 3 inches. Using the 68-95-99.7 rule, this means that about 68% of American men are within 3 inches of 5’10”, about 95% are within 6 inches, and about 99.7% are within 9 inches. So, this means only about 0.3% of American men deviate more than 9 inches from the average, with 0.15% taller than 6’7” and 0.15% shorter than 5’1”. This reasoning suggests that LeBron James is 1 in 2500 and Yao Ming is 1 in 450 million (if Yao Ming were American).
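The 68-95-99.7 percentages are rounded values from the normal curve. A rough sketch using Python's built-in `statistics.NormalDist` reproduces them for the mean-36, SD-6 example and for the height example (70 inches, SD 3). The helper name `coverage_within` is my own, and everything here assumes the data really is normal.

```python
from statistics import NormalDist

def coverage_within(k):
    """Share of a normal distribution within +/- k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, SD 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# The example from the text: mean 36, standard deviation 6
mean, sd = 36, 6
bands = {k: (mean - k * sd, mean + k * sd, coverage_within(k)) for k in (1, 2, 3)}
for k, (low, high, share) in bands.items():
    print(f"+/-{k} SD: {low} to {high} holds about {share:.1%} of normal data")

# The height example: mean 70 inches (5'10"), SD 3 inches, assumed normal
heights = NormalDist(mu=70, sigma=3)
share_taller_than_6ft7 = 1 - heights.cdf(79)  # 79 inches is 3 SD above the mean
print(f"Share taller than 6'7\": {share_taller_than_6ft7:.2%}")
```

The exact upper tail beyond 3 standard deviations comes out slightly under the rounded 0.15% quoted above, which is a reminder that the rule is a rule of thumb.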
What If Your Data Ain't Normal?
In general, the assumption of normality, which we were all taught, is overblown. You don’t need normal data to make use of the two-sample t-test, ANOVA, confidence intervals, control charts, or regression (unless you want to predict individual data points). These tests are “robust” to non-normality, but there are caveats (e.g. minimum sample size).
Standard deviation tells you nothing about the skewness of your data (“skewness” being another term that few people know what to do with). These green, blue, and orange distributions have the same standard deviation but, as shown in the title blocks, their asymmetry shifts the data past the +/-1 and +/-2 boundaries so the 68-95-99.7 rule can’t be applied.
For non-normal data, if you find you must use numeric definitions of variation, quartiles are the better choice.
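To see why, here is a small sketch with a hypothetical right-skewed data set (think wait times with a long tail; the numbers are invented). It counts how much of the data actually sits within +/-1 standard deviation and then reports quartiles instead. The `method="inclusive"` setting is one of two quartile conventions Python offers, so your spreadsheet may give slightly different cut points.

```python
import statistics

# Hypothetical right-skewed data, e.g. wait times in days (illustrative values)
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12, 20, 35]

mean = statistics.mean(data)
sd = statistics.stdev(data)
low, high = mean - sd, mean + sd

# How much of the data really falls within +/-1 SD, and on which side are the misses?
within = sum(low <= x <= high for x in data) / len(data)
below = sum(x < low for x in data)
above = sum(x > high for x in data)

# Quartiles describe the spread without assuming any particular shape
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(f"mean = {mean:.1f}, SD = {sd:.1f}, +/-1 SD band = {low:.1f} to {high:.1f}")
print(f"share within +/-1 SD = {within:.0%} ({below} below the band, {above} above)")
print(f"quartiles: Q1 = {q1}, median = {median}, Q3 = {q3}")
```

With this data the +/-1 band dips below zero and every point outside it is on the high side, so quoting "68% within one standard deviation" would misdescribe the process. The quartiles say where the middle half of the data actually lives.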
What is Standard Deviation Good For?
- T-tests, which compare means. These are available to you in Excel or online and require almost no knowledge of statistical minutiae.
- The F-test (sensitive to non-normality), which compares the variances of two samples. This is useful when you want to know if you’ve reduced variation after implementing a new process. Also available in Excel or online.
- Coefficient of Variation. When searching for opportunities to improve, a good outcome measure is the coefficient of variation (CV), which expresses standard deviation as a fraction of the mean. The higher the CV, the more volatility, and the more likely the process is due for some standardization.
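As a rough illustration of the last two items, the sketch below computes the CV for a hypothetical process before and after a change, plus the ratio of the two sample variances, which is the statistic an F-test starts from. The data and the function name `coefficient_of_variation` are invented for this example, and turning the ratio into a p-value still requires an F distribution, which is what the Excel and online tools handle for you.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical turnaround times in minutes (illustrative values only)
before = [12, 30, 18, 45, 25, 50]
after = [26, 30, 28, 33, 31, 32]

cv_before = coefficient_of_variation(before)
cv_after = coefficient_of_variation(after)

# Ratio of sample variances: the starting point of an F-test (no p-value computed here)
f_ratio = statistics.variance(before) / statistics.variance(after)

print(f"CV before = {cv_before:.1%}, CV after = {cv_after:.1%}")
print(f"variance ratio (before / after) = {f_ratio:.1f}")
```

Both samples have the same mean, so a comparison of averages alone would show no change at all. The CV and the variance ratio are what reveal that the process became more consistent.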
Don’t Feel Bad For Standard Deviation
After all, it’s not going anywhere. Just be aware of its ample limitations. We have a video, "What is Sigma" (13 min), that explains a little more (very optional).
LINKS
Quickly locate all course videos, slides, and previous emails here.