Residual Standard Deviation
What it is
Residual standard deviation (also called the residual standard error or standard error of the estimate) quantifies how far observed values fall from the values predicted by a regression model. It is a measure of the typical size of prediction errors (residuals): smaller values indicate that the model’s predictions are closer to the actual data.
Key takeaways
- Measures the typical distance between observed values and model predictions.
- Serves as a goodness-of-fit metric for regression models.
- In simple linear regression, the denominator uses n − 2 (two parameters estimated: slope and intercept).
- Useful in business to compare predicted vs. actual costs and to quantify forecast uncertainty.
Formula
For simple linear regression:
S_res = sqrt( sum( (Y − Y_est)^2 ) / (n − 2) )
Where:
- S_res = residual standard deviation
- Y = observed value
- Y_est = predicted (fitted) value from the model
- n = number of observations
(For a model with p estimated parameters, counting the intercept, replace n − 2 with n − p.)
How to calculate (step by step)
- Fit your regression model and compute predicted values Y_est for each observation.
- Compute residuals: r_i = Y_i − Y_est,i.
- Square each residual and sum them: SSE = sum(r_i^2).
- Divide SSE by the appropriate degrees of freedom (n − 2 for simple linear regression).
- Take the square root: S_res = sqrt(SSE / (n − 2)).
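The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names fit_line and residual_std and the five data points are invented for this example, and the fit is an ordinary least-squares line with one predictor.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def residual_std(observed, predicted, n_params=2):
    """sqrt(SSE / (n - n_params)); n_params=2 for simple linear regression."""
    residuals = [yo - yp for yo, yp in zip(observed, predicted)]
    sse = sum(r ** 2 for r in residuals)
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    return math.sqrt(sse / dof)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)               # step 1: fit the model
y_est = [slope * xi + intercept for xi in x]    # step 1: predicted values
s_res = residual_std(y, y_est)                  # steps 2-5
print(f"slope={slope:.3f}, intercept={intercept:.3f}, S_res={s_res:.3f}")
```

The degrees-of-freedom check guards against the case where n is not larger than the number of estimated parameters, in which the formula is undefined.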
Worked example
Suppose the line y_est = x + 2 is used to predict four observations (the line is chosen for easy arithmetic; it is not the least-squares line for these points):
- x = 1, observed y = 1 → predicted y_est = 3 → residual = 1 − 3 = −2
- x = 2, observed y = 4 → predicted y_est = 4 → residual = 0
- x = 3, observed y = 6 → predicted y_est = 5 → residual = 1
- x = 4, observed y = 7 → predicted y_est = 6 → residual = 1
Compute SSE = (−2)^2 + 0^2 + 1^2 + 1^2 = 4 + 0 + 1 + 1 = 6
Degrees of freedom = n − 2 = 4 − 2 = 2
S_res = sqrt(6 / 2) = sqrt(3) ≈ 1.732
Interpretation: a typical prediction error is about 1.732 units. The smaller this number relative to the original data spread, the better the model’s predictive accuracy.
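The arithmetic in the worked example can be checked with a short script. One assumption here: the "original data spread" is taken to be the sample standard deviation of the observed y values, which is one reasonable reading but not the only possible one.

```python
import math
import statistics

x = [1, 2, 3, 4]
y_obs = [1, 4, 6, 7]

# Line from the worked example: y_est = x + 2
y_est = [xi + 2 for xi in x]
residuals = [yo - ye for yo, ye in zip(y_obs, y_est)]
sse = sum(r ** 2 for r in residuals)
dof = len(y_obs) - 2            # two parameters: slope and intercept
s_res = math.sqrt(sse / dof)

# Assumed measure of spread: sample standard deviation of the observed y values
y_spread = statistics.stdev(y_obs)

print(residuals)                # [-2, 0, 1, 1]
print(round(s_res, 3))          # 1.732
print(round(y_spread, 3))       # 2.646
```

Under that reading, the typical prediction error (about 1.73) is smaller than the spread of the observed values (about 2.65), so the line explains part, but not all, of the variation in y.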
Practical use and notes
- Goodness-of-fit: S_res provides an absolute measure of fit in the same units as Y; it complements R^2 and other fit statistics.
- Business applications: used to assess how closely predicted costs or sales match actuals and to quantify forecast uncertainty.
- ANOVA and LoQ: residual standard deviation arises naturally in analysis of variance (ANOVA) and is sometimes used in limit-of-quantitation (LoQ) calculations.
- Generalization: in multiple regression, degrees of freedom become n − p (p = number of estimated parameters), so use S_res = sqrt(SSE / (n − p)).
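As a rough sketch of the first and last notes, the snippet below reports S_res alongside R^2 and shows how the n − p denominator changes the result. The function name fit_summary is made up for this example, R^2 is computed as 1 − SSE/SST (one common definition), and the three-parameter case is purely hypothetical: it reuses the same fitted values only to isolate the effect of the degrees of freedom.

```python
import math

def fit_summary(observed, predicted, n_params):
    """Return (S_res, R^2) for a model with n_params estimated parameters."""
    n = len(observed)
    dof = n - n_params
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    y_mean = sum(observed) / n
    sse = sum((yo - yp) ** 2 for yo, yp in zip(observed, predicted))
    sst = sum((yo - y_mean) ** 2 for yo in observed)
    return math.sqrt(sse / dof), 1 - sse / sst

# Observed and predicted values from the worked example
y_obs = [1, 4, 6, 7]
y_est = [3, 4, 5, 6]

s_res_2, r2 = fit_summary(y_obs, y_est, n_params=2)
# Hypothetical: the same fitted values, as if three parameters had been estimated
s_res_3, _ = fit_summary(y_obs, y_est, n_params=3)

print(f"p=2: S_res={s_res_2:.3f}, R^2={r2:.3f}")
print(f"p=3: S_res={s_res_3:.3f}")
```

With the same SSE, estimating more parameters leaves fewer degrees of freedom and therefore a larger S_res, while R^2 (a unitless ratio) is unchanged. S_res stays in the units of Y, which is why the two statistics complement each other.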
Bottom line
Residual standard deviation summarizes the typical size of prediction errors from a regression model. It’s a straightforward, interpretable measure of model accuracy: the smaller S_res is (relative to the data’s variability), the more reliable the model’s predictions.