Statistical Significance: What It Is, How It Works, and Examples
Statistical significance is a conclusion from hypothesis testing that an observed relationship or difference in data is unlikely to have occurred by random chance alone. Analysts use it to decide whether data support a systematic effect (e.g., a drug reduces symptoms, market returns changed after an event) rather than mere random variation.
How it works
- Formulate hypotheses:
  - Null hypothesis (H0): the observed effect is due to chance (no real effect).
  - Alternative hypothesis (H1): there is a real effect.
- Perform a statistical test appropriate to the data (t-test, chi-square, regression, etc.).
- Compute a p-value: the probability of observing results as extreme as (or more extreme than) those seen, assuming the null hypothesis is true.
- Compare the p-value to a preselected significance level (α), commonly 0.05 (5%), as in the sketch after this list:
  - If p ≤ α → reject the null hypothesis and conclude the result is statistically significant.
  - If p > α → fail to reject the null hypothesis; the data are consistent with chance.
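To make the workflow concrete, here is a minimal sketch in Python. All data are simulated for illustration and the variable names are hypothetical; a two-sample t-test stands in for whichever test fits the real data.

```python
# Minimal sketch of the hypothesis-testing workflow above.
# All data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
control = rng.normal(loc=0.0, scale=1.0, size=50)    # H0 world: no effect
treatment = rng.normal(loc=0.5, scale=1.0, size=50)  # small real effect

alpha = 0.05  # significance level, chosen before seeing the data
t_stat, p_value = stats.ttest_ind(treatment, control)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value <= alpha:
    print("Reject H0: the difference is statistically significant.")
else:
    print("Fail to reject H0: the data are consistent with chance.")
```

With these simulated samples the test will typically reject H0; shrinking the sample size or the true effect (size= or loc=) shows how either can push p above α.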
What a p-value means (and what it doesn’t)
- It quantifies how surprising the observed data would be if H0 were true (the simulation sketch after this list shows what p-values look like when H0 holds).
- A small p-value suggests the data are unlikely under H0 and supports H1.
- It is not the probability that H0 is true, nor a measure of effect size or practical importance.
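One way to see the first two points is a small Monte Carlo simulation (an assumed setup, not from the source): when H0 is true, p-values are uniformly distributed between 0 and 1, so roughly a fraction α of all tests come out "significant" purely by chance.

```python
# Simulate many experiments in which H0 is true (both samples come
# from the same distribution) and count how often p <= alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
alpha, n_trials = 0.05, 10_000

false_positives = 0
for _ in range(n_trials):
    a = rng.normal(size=30)  # same distribution for both samples,
    b = rng.normal(size=30)  # so any "effect" is pure chance
    _, p = stats.ttest_ind(a, b)
    if p <= alpha:
        false_positives += 1

print(f"False-positive rate: {false_positives / n_trials:.3f}")  # about 0.05
```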
Examples
- Finance: An analyst compares a stock's average daily returns before and after a company's sudden failure to test for evidence of advance knowledge (a sketch of this comparison follows the examples). A p-value of 0.28 (>0.05) indicates the observed difference is compatible with chance; a p-value of 0.0001 (<0.05) would be strong evidence against the chance explanation and would warrant further investigation.
- Medicine: A randomized 26-week trial of a new insulin reports a p-value of 0.04. Since 0.04 < 0.05, the observed improvement is deemed statistically significant, a result that informs regulators, clinicians, and investors.
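A hedged sketch of the finance example above, with made-up daily return series standing in for real market data:

```python
# Compare mean daily returns before and after an event date with a
# two-sample t-test. The return series here are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2)
returns_before = rng.normal(loc=0.000, scale=0.01, size=120)
returns_after = rng.normal(loc=-0.004, scale=0.01, size=120)

t_stat, p_value = stats.ttest_ind(returns_before, returns_after)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# p > 0.05: the difference is compatible with chance.
# p well below 0.05: evidence of a systematic shift worth investigating.
```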
Common uses
- Clinical trials and regulatory evaluation of drugs, vaccines, and medical devices.
- Scientific research across disciplines to test hypotheses.
- Business and finance for event studies, product effectiveness, and decision-making based on data.
Limitations and cautions
- Statistical significance ≠ practical or clinical significance. Small effects can be statistically significant with large samples; large effects can fail to reach significance with small samples.
- It does not prove causation; study design (randomization, controls) and context determine causal interpretation.
- Results depend on sample size, measurement quality, and chosen test; p-values can be sensitive to these factors.
- Best practice: report p-values alongside effect sizes, confidence intervals, and a clear description of methods and assumptions (see the sketch after this list).
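A minimal sketch of that reporting practice, assuming standard textbook formulas: Cohen's d for effect size and a normal-approximation 95% confidence interval for the difference in means, reported alongside the p-value.

```python
# Report a p-value together with an effect size and a confidence
# interval, rather than the p-value alone. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)
a = rng.normal(loc=0.5, scale=1.0, size=50)
b = rng.normal(loc=0.0, scale=1.0, size=50)

_, p_value = stats.ttest_ind(a, b)

# Cohen's d with a pooled standard deviation (equal sample sizes)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (a.mean() - b.mean()) / pooled_sd

# 95% CI for the mean difference, using a normal approximation
diff = a.mean() - b.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

print(f"p = {p_value:.4f}, d = {cohens_d:.2f}, "
      f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```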
Key takeaways
- Statistical significance assesses whether observed data are unlikely under a chance-only explanation.
- Hypothesis testing produces a p-value; a common threshold for significance is 0.05.
- Use significance tests as one part of interpretation—consider effect size, confidence intervals, study design, and practical relevance.
References
- StatPearls Publishing. "Statistical Significance." (2023).
- American Diabetes Association. "Onset 7 Trial: Efficacy and Safety of Fast-Acting Aspart Compared With Insulin Aspart."
- Hwang, T. J., et al. "Stock Market Returns and Clinical Trial Results…" PLoS One (2013).
- Rothenstein, J., et al. "Company Stock Prices Before and After Public Announcements Related to Oncology Drugs." Journal of the National Cancer Institute (2011).