T Distribution
What it is
The t-distribution (Student’s t-distribution) is a continuous probability distribution used to make inferences about a population mean when the sample size is small and the population standard deviation is unknown. It resembles the normal distribution—bell-shaped and symmetric—but has heavier tails, reflecting greater uncertainty and a higher probability of extreme values.
Key properties
- Degrees of freedom (df) control tail heaviness. For a single sample, df = n − 1. Smaller df → heavier tails. As df increases, the t-distribution converges to the standard normal distribution (illustrated in the sketch after this list).
- Mean = 0 (for the standardized t statistic). Variance = df/(df − 2) for df > 2; it exceeds 1 for small df and approaches 1 as df → ∞.
- Used as the basis for t-tests and confidence intervals when the population standard deviation is unknown.
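To make the convergence concrete, here is a minimal sketch, assuming SciPy is available; it prints the two-sided 95% critical value of the t-distribution for several df next to the normal value of roughly 1.96.

```python
# Minimal sketch (assumes SciPy): two-sided 95% critical values of t approach
# the normal critical value (~1.96) as degrees of freedom grow.
from scipy import stats

z_crit = stats.norm.ppf(0.975)            # normal critical value, ~1.960
for df in (2, 5, 10, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df)       # t critical value for this df
    print(f"df={df:>4}: t = {t_crit:.3f}   (z = {z_crit:.3f})")
```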
When and why to use it
Use the t-distribution when:
* The population standard deviation is unknown, and
* The sample size is small (commonly n < 30), or the extra uncertainty from estimating the standard deviation from the sample is a concern.
It corrects for extra uncertainty introduced by estimating the standard deviation from the sample. For large samples, the sample standard deviation becomes a reliable estimate and the normal (z) distribution is typically acceptable.
Formulas and interpretation
Standardized t statistic for a sample mean (a short code sketch follows the definitions below):
t = (m − μ) / (s / sqrt(n))
where
* m = sample mean
* μ = hypothesized population mean
* s = sample standard deviation
* n = sample size
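As a quick illustration, the sketch below computes this statistic by hand for a small hypothetical sample (the data and hypothesized mean are made up for illustration, not taken from this article) and cross-checks it against SciPy's one-sample t-test.

```python
# Minimal sketch of the t statistic above, using a small hypothetical sample
# and a hypothesized mean (both illustrative only).
import numpy as np
from scipy import stats

sample = np.array([2.1, 1.8, 2.4, 2.0, 1.9, 2.3, 2.2, 1.7])  # hypothetical data
mu0 = 2.0                                  # hypothesized population mean

m = sample.mean()                          # sample mean
s = sample.std(ddof=1)                     # sample standard deviation (divides by n - 1)
n = sample.size
t_stat = (m - mu0) / (s / np.sqrt(n))      # the formula above

# Cross-check against SciPy's one-sample t-test, which computes the same statistic
t_check, p_value = stats.ttest_1samp(sample, mu0)
print(f"t by hand = {t_stat:.4f}, t from SciPy = {t_check:.4f}, p = {p_value:.4f}")
```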
Confidence interval for the population mean:
m ± t × (s / sqrt(n))
where t is the critical value from the t-distribution for the chosen confidence level and df = n − 1.
Example
Suppose a sample of 27 observations yields a mean return of −0.33% and a sample standard deviation of 1.07%. For a 95% confidence interval, the t-critical value with df = 26 is about 2.055. The standard error is 1.07 / sqrt(27) ≈ 0.206%. Multiply by 2.055 → ≈ 0.42%. The 95% CI is therefore −0.33% ± 0.42%, or approximately [−0.75%, +0.09%].
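The same numbers can be reproduced with a short sketch, assuming SciPy is available:

```python
# Sketch reproducing the example above: n = 27, mean = -0.33%, s = 1.07%, 95% CI.
import math
from scipy import stats

n, m, s = 27, -0.33, 1.07                 # values from the example (in %)
t_crit = stats.t.ppf(0.975, df=n - 1)     # ~2.056 for df = 26
se = s / math.sqrt(n)                     # standard error, ~0.206%
margin = t_crit * se                      # ~0.42%
print(f"95% CI: [{m - margin:.2f}%, {m + margin:.2f}%]")   # ~[-0.75%, 0.09%]
```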
Comparison with the normal distribution
- Similarities: Both are symmetric and bell-shaped, and both are used for inference that assumes an approximately normal underlying population.
- Differences: The t-distribution has heavier tails (more probability in extremes), especially for small df. This yields wider confidence intervals and more conservative hypothesis tests when uncertainty about variability is high (quantified in the sketch below).
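One way to see the heavier tails, assuming SciPy is available, is to compare the probability of landing more than two standard units from the center under t (for small df) and under the normal:

```python
# Sketch: two-sided tail probability beyond |2| for t versus the normal.
from scipy import stats

p_norm = 2 * stats.norm.sf(2)             # ~0.046 for the normal
for df in (3, 10, 30):
    p_t = 2 * stats.t.sf(2, df)           # noticeably larger for small df
    print(f"df={df:>2}: P(|T| > 2) = {p_t:.3f}   (normal: {p_norm:.3f})")
```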
Limitations and assumptions
- Assumes the underlying population is approximately normal when sample size is small. If the population is strongly non-normal, results may be unreliable.
- Should not be used when the population standard deviation is known—use the normal (z) distribution instead.
- For very small samples from non-normal populations, alternative methods (nonparametric tests or bootstrap) may be preferable.
Simple explanation
The t-distribution is like the normal curve but “fatter” at the edges to reflect extra uncertainty when you estimate variability from a small amount of data. It helps you make more cautious, realistic inferences about the population mean.
Quick FAQ
- When does t approximate the normal? Typically when the sample size is large (rule of thumb n ≥ 30), the t-distribution is close to the normal distribution.
- What are degrees of freedom? For a one-sample mean, df = n − 1. They reflect the amount of independent information available to estimate variability.
- Can I use t for non-normal data? For moderate to large samples, the Central Limit Theorem mitigates non-normality. For small samples with strong non-normality, consider nonparametric or resampling methods (see the bootstrap sketch below).
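For the last point, a bootstrap percentile interval is one resampling alternative. The sketch below uses made-up, skewed data and assumes NumPy is available; it is only meant to show the basic idea.

```python
# Sketch of a bootstrap percentile CI for the mean of a small, skewed sample.
# The data are simulated here purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=15)      # small non-normal sample

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])   # percentile interval
print(f"95% bootstrap CI for the mean: [{lo:.2f}, {hi:.2f}]")
```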
Bottom line
The t-distribution provides a principled way to account for extra uncertainty when estimating a population mean from a small sample with unknown variance. It yields wider, more conservative confidence intervals and tests than the normal distribution, and it converges to the normal as sample size grows.