Empirical Rule: Definition, Formula, and Example
Definition
The empirical rule (also called the three-sigma or 68–95–99.7 rule) describes how data are distributed in a normal (bell‑shaped) distribution. It states:
- About 68% of observations lie within one standard deviation of the mean (µ ± σ).
- About 95% lie within two standard deviations (µ ± 2σ).
- About 99.7% lie within three standard deviations (µ ± 3σ).
Formula
For a normally distributed variable with mean µ and standard deviation σ:
* One standard-deviation interval: µ − σ to µ + σ → ~68% of data
* Two standard-deviation interval: µ − 2σ to µ + 2σ → ~95% of data
* Three standard-deviation interval: µ − 3σ to µ + 3σ → ~99.7% of data
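The rounded percentages above can be checked against the exact normal probabilities. The following is a minimal sketch, assuming a standard normal variable Z, for which P(|Z| ≤ k) = erf(k / √2); the function name `normal_coverage` is illustrative and not part of the original text.

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # For Z ~ N(0, 1): P(|Z| <= k) = erf(k / sqrt(2))
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {normal_coverage(k):.2%}")
```

The exact values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.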
How to use the rule
- Estimate the proportion of observations falling in a range around the mean when the distribution is approximately normal.
- Quickly check for departures from normality: if far more than about 0.3% of points fall beyond µ ± 3σ, the data are probably not normal (skewness, heavy tails, or outliers).
- Set control limits in quality control charts (three‑sigma limits) and make rough risk assessments.
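The uses above could be sketched as follows: measure the share of a sample that falls within k standard deviations of its mean, and derive three-sigma control limits. The function names and the `measurements` values are hypothetical, chosen only for illustration, and the sample standard deviation is assumed as the estimate of σ.

```python
import statistics


def sigma_coverage(data, k):
    """Fraction of observations within k sample standard deviations of the sample mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)


def three_sigma_limits(data):
    """Lower and upper control limits at mean -/+ 3 standard deviations."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return mean - 3 * sd, mean + 3 * sd


# Hypothetical measurements, for illustration only
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]

for k in (1, 2, 3):
    print(f"share within {k} sigma: {sigma_coverage(measurements, k):.0%}")

lower, upper = three_sigma_limits(measurements)
print(f"three-sigma control limits: {lower:.3f} to {upper:.3f}")
```

With a sample this small, the shares will only loosely match 68%, 95%, and 99.7%; large gaps on a bigger sample are the signal worth investigating.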
Example (zoo animal lifespans)
Suppose lifespans are normally distributed with mean µ = 13.1 years and σ = 1.5 years.
- 1σ range: 13.1 ± 1.5 → 11.6 to 14.6 (≈ 68% of animals)
- 2σ range: 13.1 ± 3.0 → 10.1 to 16.1 (≈ 95% of animals)
- 3σ range: 13.1 ± 4.5 → 8.6 to 17.6 (≈ 99.7% of animals)
Probability an animal lives longer than 14.6 years:
* 14.6 is the upper bound of the 1σ interval. About 32% of animals lie outside µ ± σ, and because the normal distribution is symmetric, half of that (16%) lie above 14.6 and half lie below 11.6. So the probability ≈ 16%.
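The example can be reproduced with Python's standard library as a rough check. This sketch assumes the lifespans are exactly normal with the stated mean and standard deviation; the variable names are illustrative.

```python
from statistics import NormalDist

# Lifespan model from the example: mean 13.1 years, standard deviation 1.5 years
lifespan = NormalDist(mu=13.1, sigma=1.5)

for k in (1, 2, 3):
    low = lifespan.mean - k * lifespan.stdev
    high = lifespan.mean + k * lifespan.stdev
    share = lifespan.cdf(high) - lifespan.cdf(low)
    print(f"{k} sigma range: {low:.1f} to {high:.1f} years ({share:.1%} of animals)")

# Probability of living longer than 14.6 years (the upper tail beyond +1 sigma)
p_longer = 1 - lifespan.cdf(14.6)
print(f"P(lifespan > 14.6) = {p_longer:.1%}")
```

The exact tail probability is about 15.9%, which the empirical rule rounds to 16%.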
Applications in finance and investing
- Many financial returns are not perfectly normal (they can be skewed or have fat tails), so the empirical rule may understate the probability of extreme events.
- Despite limitations, analysts often use standard deviation (volatility) as a risk measure. To compute volatility from returns in a spreadsheet:
  - Compute periodic returns (e.g., daily percent changes).
  - Use STDEV(range) to get the standard deviation of those returns.
  - Annualize a daily standard deviation by multiplying by sqrt(252) (approximately the number of trading days in a year).
- A larger standard deviation indicates greater dispersion and higher implied risk.
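The spreadsheet steps above could be sketched in Python as follows. The function names and the `prices` series are hypothetical, and `statistics.stdev` is used because it computes the sample standard deviation, as the spreadsheet STDEV function does.

```python
import math
import statistics


def daily_returns(prices):
    """Simple periodic returns: the fractional change from each price to the next."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of returns, scaled by sqrt(periods per year)."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


# Hypothetical daily closing prices, for illustration only
prices = [100.0, 101.0, 99.5, 100.5, 102.0, 101.0]

returns = daily_returns(prices)
print(f"daily volatility:      {statistics.stdev(returns):.4f}")
print(f"annualized volatility: {annualized_volatility(returns):.4f}")
```

Scaling by the square root of 252 assumes daily returns are independent with a constant variance, which real return series only approximate.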
Benefits and limitations
Benefits:
* Fast, intuitive estimate of spread and likelihoods for approximately normal data.
* Useful for preliminary analysis, quality control limits, and basic risk assessment.
Limitations:
* Only accurate for distributions that are approximately normal.
* Fails to capture skewness and fat tails common in many real‑world data sets (especially financial returns), so it can underestimate extreme outcomes.
Explain like I’m five
If numbers form a bell shape around the middle, most of them are close to the middle: about two-thirds are within one step of it, and nearly all are within two or three steps. The steps are measured by the “standard deviation,” which tells how spread out the numbers are.
Bottom line
The empirical rule offers a simple way to relate standard deviation to the probability of observations falling within specified ranges for roughly normal data. Use it as a quick guide for forecasts and control limits, but verify normality and be cautious when dealing with skewed or heavy‑tailed data (common in finance).