Variance: Definition, Formula, and Example
Variance measures how spread out values in a data set are around their mean. It quantifies variability by averaging the squared deviations of each observation from the mean. The symbol for variance is σ² (population) or s² (sample). The square root of variance is the standard deviation (σ or s), which returns variability to the original units.
Formulas
- Population variance:
σ² = Σ(xᵢ − μ)² / N
- xᵢ = each value
- μ = population mean
- N = number of values in the population
- Sample variance (unbiased estimator of population variance):
s² = Σ(xᵢ − x̄)² / (n − 1)
- x̄ = sample mean
- n = sample size
Note: Variance has squared units (e.g., if data are in dollars, variance is in dollars²). Standard deviation is in the original units.
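To illustrate why the sample formula divides by (n − 1), the sketch below enumerates every possible sample of size 3 drawn with replacement from a small population and averages each estimator over all of them. The population values and the variable names are assumptions chosen only for illustration; the calls are from Python's standard `statistics` module.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population, chosen only for illustration.
# Fractions keep the arithmetic exact, so the comparisons are not blurred by rounding.
population = [Fraction(x) for x in (2, 4, 4, 7, 9)]
sigma2 = pvariance(population)  # population variance, divides by N

n = 3
# Every ordered sample of size n drawn with replacement (5**3 = 125 samples).
samples = list(product(population, repeat=n))

# Average of the (n - 1) estimator over all samples.
avg_n_minus_1 = mean(variance(s) for s in samples)
# Average of the divide-by-n estimator over the same samples.
avg_n = mean(pvariance(s) for s in samples)

print(avg_n_minus_1 == sigma2)               # True: matches the population variance on average
print(avg_n == sigma2 * Fraction(n - 1, n))  # True: falls short by a factor of (n - 1) / n
```

Dividing by n would underestimate the population variance on average, which is the bias the (n − 1) denominator corrects.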
How to calculate variance (step by step)
- Compute the mean (x̄ for a sample, μ for a population).
- Subtract the mean from each data point to get deviations.
- Square each deviation.
- Sum the squared deviations.
- Divide by N (population) or (n − 1) (sample).
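The steps above can be sketched as a short Python function. This is a minimal illustration, not a reference implementation; the function name `variance_by_steps` and its `sample` flag are hypothetical.

```python
def variance_by_steps(values, sample=True):
    """Return the variance of `values`, following the five steps one at a time."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("need at least two values for a sample, one for a population")

    mean = sum(values) / n                      # step 1: compute the mean
    deviations = [x - mean for x in values]     # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]      # step 3: square each deviation
    total = sum(squared)                        # step 4: sum the squared deviations
    return total / (n - 1 if sample else n)     # step 5: divide by (n - 1) or N


print(variance_by_steps([2, 4, 6]))                # sample variance: 8 / 2 = 4.0
print(variance_by_steps([2, 4, 6], sample=False))  # population variance: 8 / 3
```

For real work, the standard library's `statistics.variance` and `statistics.pvariance` cover the same two cases.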
Example (finance)
Returns for a stock over three years: 10%, 20%, −15%.
- Mean: (0.10 + 0.20 − 0.15) / 3 = 0.05 (5%)
- Deviations from mean:
- 0.10 − 0.05 = 0.05
- 0.20 − 0.05 = 0.15
- −0.15 − 0.05 = −0.20
- Squared deviations:
- 0.05² = 0.0025
- 0.15² = 0.0225
- (−0.20)² = 0.0400
- Sum of squares = 0.0025 + 0.0225 + 0.0400 = 0.0650
- If treating this as a sample (n = 3), sample variance:
s² = 0.0650 / (3 − 1) = 0.0325 (a squared quantity, so it is not directly comparable to the percentage returns)
Standard deviation = √0.0325 ≈ 0.1803 (18.03%)
If you treated the three observations as the entire population, divide by N = 3 instead, giving a population variance of 0.021667 (standard deviation ≈ 14.72%).
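The figures in this example can be checked with Python's standard `statistics` module. The variable names are illustrative assumptions; the returns are the three values given above, written as decimals.

```python
import statistics

returns = [0.10, 0.20, -0.15]  # the three annual returns from the example

sample_var = statistics.variance(returns)       # divides by n - 1
population_var = statistics.pvariance(returns)  # divides by N
sample_sd = statistics.stdev(returns)           # square root of the sample variance
population_sd = statistics.pstdev(returns)      # square root of the population variance

print(round(sample_var, 4), round(sample_sd, 4))          # 0.0325 0.1803
print(round(population_var, 6), round(population_sd, 4))  # 0.021667 0.1472
```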
Pros and cons
Pros
– Simple and widely understood measure of dispersion.
– Treats positive and negative deviations equally (squares deviations).
– Prevents cancellation that would occur if deviations were summed directly.
Cons
– Squaring gives extra weight to outliers, which can skew results.
– Units are squared, which can be unintuitive (standard deviation is often preferred).
– Often used as an intermediate step (e.g., to compute standard deviation) rather than a final interpretive statistic.
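The sensitivity to outliers can be seen in a small sketch. The two data sets below are made up for illustration; they differ only in the last value.

```python
import statistics

base = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 34]  # same data, last value raised by 20

var_base = statistics.variance(base)             # sample variance of the original data
var_outlier = statistics.variance(with_outlier)  # sample variance with the outlier

# Share of the total squared deviation contributed by the single outlier.
mean_outlier = statistics.mean(with_outlier)
squared = [(x - mean_outlier) ** 2 for x in with_outlier]
outlier_share = squared[-1] / sum(squared)

print(var_base, var_outlier)    # 2.5 102.5
print(round(outlier_share, 2))  # 0.79
```

Changing one of five values raises the variance from 2.5 to 102.5, and that single point accounts for roughly 79% of the summed squared deviations.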
Uses
- Assessing volatility and risk in finance (higher variance often implies greater risk).
- Comparing dispersion across data sets.
- Input for other statistical methods (e.g., analysis of variance, regression diagnostics).
Why standard deviation is often used instead
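Standard deviation (the square root of variance) restores the original units, making interpretation and comparisons easier. In the example above, a sample standard deviation of about 18% can be read directly against the 5% mean return, whereas the variance of 0.0325 cannot. It also allows the volatility of two assets to be compared directly when their returns are expressed in the same units, such as percent.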
Key takeaways
- Variance quantifies how much data points differ from the mean by averaging squared deviations.
- Use N in the denominator for a full population and (n − 1) for a sample to obtain an unbiased estimate.
- Standard deviation is commonly used for interpretation because it shares the data’s original units and is easier to compare.