Residual Sum of Squares (RSS)
The residual sum of squares (RSS) quantifies the discrepancy between observed values and the values predicted by a regression model. It is the sum of the squared residuals (errors) and is widely used to assess how well a model fits data: lower RSS indicates a better fit, and RSS = 0 means a perfect fit.
Definition and interpretation
- Residual (error) for observation i: ei = yi − f(xi)
- RSS = sum of squared residuals = Σ (yi − f(xi))^2
RSS measures unexplained variation — the portion of data variation not captured by the model. It is used in statistics, econometrics, finance, and machine learning to evaluate model performance and to fit models (e.g., via least squares).
Calculation
Given observed values yi and predicted values f(xi) for n observations:
RSS = Σ_{i=1}^n (yi − f(xi))^2
A related quantity, the residual standard error (RSE), normalizes RSS by the residual degrees of freedom:
RSE = sqrt(RSS / (n − p))
(where p is the number of estimated parameters; for simple linear regression p = 2)
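The RSS and RSE formulas can be sketched directly in Python. The observed and predicted values below are invented for illustration; `p = 2` corresponds to a simple linear regression (slope and intercept):

```python
import math

# Observed values and model predictions (illustrative numbers)
y = [3.0, 5.1, 7.2, 8.8, 11.1]
y_hat = [3.1, 5.0, 7.0, 9.0, 11.0]

# Residual sum of squares: Σ (yi − f(xi))^2
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

# Residual standard error with p = 2 estimated parameters
n, p = len(y), 2
rse = math.sqrt(rss / (n - p))

print(rss)  # ≈ 0.11
print(rse)
```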
Relation to other metrics
- RSS is also called SSE (sum of squared errors or sum of squared estimates of errors).
- Total sum of squares (TSS) measures total variation in the observed data. TSS = ESS + RSS, where ESS is the explained sum of squares.
- R-squared (coefficient of determination) expresses explained variation as a proportion:
R^2 = 1 − (RSS / TSS)
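The RSS–TSS–R² relationship can be verified on a small example (the observed and predicted values below are invented for illustration):

```python
# Illustrative observed values and model predictions
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]

mean_y = sum(y) / len(y)

# Total sum of squares: variation of the data around its mean
tss = sum((yi - mean_y) ** 2 for yi in y)
# Residual sum of squares: variation left unexplained by the model
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

r_squared = 1 - rss / tss
print(r_squared)  # ≈ 0.95
```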
Minimizing RSS (model fitting)
- Ordinary least squares (OLS) finds parameter values that minimize RSS for linear models.
- In nonlinear or high-dimensional contexts, minimization may use iterative algorithms (gradient descent, numerical optimization).
- Overfitting risk: a highly flexible model can drive RSS down but may not generalize. Regularization (ridge, lasso) or validation is used to balance fit and generalization.
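For a simple linear model, the OLS parameters that minimize RSS have a well-known closed form (set the partial derivatives of RSS with respect to slope and intercept to zero). A minimal sketch, with invented data:

```python
# Fit y = a*x + b by ordinary least squares (minimizes RSS)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form OLS estimates for simple linear regression
a = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
    sum((xi - mean_x) ** 2 for xi in x)
b = mean_y - a * mean_x

# RSS at the fitted parameters — the minimum over all lines
rss = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
print(a, b, rss)
```

Any other choice of `a` and `b` on this data yields a strictly larger RSS, which is what "least squares" means.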
Limitations
- Sensitive to outliers: squaring residuals gives disproportionate weight to large errors, so a few outliers can dominate RSS.
- Assumption-dependent: OLS and inference based on RSS assume linearity, independence of errors, and homoscedasticity. Violations can bias estimates and invalidate tests.
- Not ideal for comparing models with different numbers of parameters because RSS typically decreases as complexity increases—adjusted metrics (AIC, BIC, adjusted R^2) are preferable for model comparison.
- Provides only an aggregate error measure and limited insight into the structure of relationships between variables.
Practical considerations
- Large datasets and many predictors benefit from automated tools (statistical software, Python/R libraries) to compute RSS and fit models reliably.
- In finance and machine learning, RSS remains a core diagnostic but is typically combined with cross-validation, regularization, and other metrics to assess model utility.
- Always inspect residuals (plots, summary statistics) to check assumptions and identify outliers or patterns indicating model misspecification.
Example (summary)
A simple regression relating consumer spending (CS) to GDP for a set of countries might produce a best-fit line such as:
GDP = 1.3232 × CS + 10447
Predicted GDP values differ from actual GDP values; squaring and summing these differences yields the RSS for that model. The chosen line minimizes RSS among possible lines for the given data.
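The computation might be sketched as follows. Only the coefficients 1.3232 and 10447 come from the fitted line above; the CS and GDP figures are hypothetical, invented for illustration:

```python
# Hypothetical consumer spending (CS) and actual GDP observations
cs  = [1000.0, 2500.0, 4000.0, 6000.0]
gdp = [11800.0, 13700.0, 15800.0, 18400.0]

# Predictions from the best-fit line: GDP = 1.3232 * CS + 10447
predicted = [1.3232 * c + 10447 for c in cs]

# Squaring and summing the prediction errors yields the RSS
residuals = [actual - pred for actual, pred in zip(gdp, predicted)]
rss = sum(r ** 2 for r in residuals)
print(rss)
```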
FAQs (brief)
- Is RSS the same as R-squared? No. RSS is the absolute sum of squared residuals; R-squared is the proportion of total variation explained (1 − RSS/TSS).
- Is RSS the same as SSE? Yes — RSS and SSE are synonymous.
- Can RSS be zero? Yes — only if the model predicts every observation exactly (perfect fit), which is rare in practice.
Summary
RSS is a fundamental measure of model fit that quantifies unexplained variation. Minimizing RSS underpins least squares estimation, but practitioners should be mindful of outliers, underlying assumptions, and overfitting. Combine RSS-based fitting with diagnostic checks and complementary metrics to build reliable, interpretable models.