Hypothesis Testing
Hypothesis testing is a statistical method for evaluating an assumption about a population using data from a sample. It helps determine whether an observed pattern can be reasonably explained by chance or whether it provides evidence for an alternative explanation.
How it works
A researcher formulates two mutually exclusive statements:
* Null hypothesis (H0): the default statement assumed true (e.g., a population mean equals a specific value).
* Alternative hypothesis (Ha): what the researcher suspects instead (e.g., the mean differs from that value).
Using a random sample from the population, the analyst applies an appropriate statistical test (t-test, z-test, chi-square test, etc.) to assess how consistent the observed data are with H0. The result is usually summarized by a test statistic and a p-value. The test statistic measures how far the sample departs from what H0 predicts; the p-value is the probability of observing an outcome at least as extreme as the one seen, assuming H0 is true.
Based on a preselected significance level (commonly 0.05), the analyst either:
* Rejects H0 (concluding the data provide sufficient evidence for Ha), or
* Fails to reject H0 (concluding the data do not provide sufficient evidence against H0).
Note: Standard phrasing is “fail to reject H0” rather than “accept H0,” because failing to reject does not prove the null hypothesis true.
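As a rough illustration of the test statistic, p-value, and decision rule described above, here is a minimal sketch of a two-sided one-sample z-test. The function name z_test, the sample values, the hypothesized mean of 50, and the known population standard deviation of 2 are all assumptions made up for this example, not part of the original text.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test.

    Assumes the population standard deviation `sigma` is known.
    Returns the test statistic, the p-value, and the decision.
    """
    n = len(sample)
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # p-value: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Note the wording: we never "accept" H0.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision


# Hypothetical measurements; H0: population mean = 50, Ha: mean != 50.
sample = [51.2, 49.8, 52.1, 50.9, 51.7, 50.4, 52.3, 51.1]
z, p_value, decision = z_test(sample, mu0=50.0, sigma=2.0)
print(f"z = {z:.3f}, p = {p_value:.4f}, decision: {decision}")
```

The significance level is passed in as a parameter to reflect that it should be fixed before the data are examined.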
4-step process
- State the hypotheses: define H0 and Ha clearly.
- Plan the analysis: choose the test, significance level (α), and sampling method.
- Analyze the data: compute the test statistic and p-value.
- Interpret the result: reject or fail to reject H0 and report the conclusion in context.
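The four steps can be sketched in code. The example below is only one possible instance: it uses an exact permutation test for a difference in means (a test chosen here for illustration because it needs no distribution tables, not one named in the text), and the two groups of measurements are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Step 1: state the hypotheses.
# H0: both groups come from the same distribution (no difference in means).
# Ha: the group means differ (two-sided).

# Step 2: plan the analysis (test, significance level, sampling method).
ALPHA = 0.05  # chosen before looking at the data
# Assumption: units were randomly assigned to the two groups.


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    splits = extreme = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / splits


# Step 3: analyze the data (hypothetical measurements).
group_a = [12.1, 11.8, 12.6, 12.9, 12.4]
group_b = [11.2, 11.5, 10.9, 11.7, 11.4]
p_value = permutation_p_value(group_a, group_b)

# Step 4: interpret the result in context.
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(f"p = {p_value:.4f}, decision: {decision}")
```

Full enumeration is only practical for small samples like these; larger samples would typically use a random subset of relabellings instead.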
Example: testing a coin
Question: Is a penny fair (heads probability = 0.5)?
- H0: P(heads) = 0.5
- Ha: P(heads) ≠ 0.5
If a sample of 100 flips yields 40 heads:
* Calculate the probability, under H0, of a result at least as far from 50 heads as the one observed: 40 or fewer heads, plus the symmetric tail of 60 or more.
* If that probability (p-value) is very small relative to α, reject H0 and conclude the coin likely isn’t fair.
If the sample instead shows 48 heads and 52 tails, the p-value is large (about 0.76 for an exact two-sided binomial test), so we fail to reject H0: a deviation this small is plausibly due to chance.
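The coin calculation can be carried out exactly with the binomial distribution. The following is a minimal sketch, assuming a two-sided test that sums both symmetric tails; the function name fair_coin_p_value is invented for this example.

```python
from math import comb


def fair_coin_p_value(heads, flips):
    """Exact two-sided p-value under H0: P(heads) = 0.5.

    Sums the probability of every head count at least as far from
    flips / 2 as the observed count (both tails).
    """
    expected = flips / 2
    deviation = abs(heads - expected)
    # Under H0 each of the 2**flips sequences is equally likely, so the
    # probability of k heads is comb(flips, k) / 2**flips.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= deviation
    )
    return extreme / 2 ** flips


ALPHA = 0.05
for heads in (40, 48):
    p = fair_coin_p_value(heads, 100)
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{heads} heads: p = {p:.4f}, decision: {decision}")
```

With this exact test, 40 heads gives a p-value of roughly 0.057, just above 0.05, while the common normal approximation (z = 2) gives roughly 0.046. A borderline result like this can fall on either side of the threshold depending on the test chosen, which is one reason the test and α are fixed in advance.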
Simple explanation
Hypothesis testing is a structured way to compare explanations. You propose a default explanation (H0) and an alternative, collect data, and use statistics to judge whether the data are inconsistent enough with the default to favor the alternative.
Brief history
Early forms of hypothesis testing date back centuries; one early example is John Arbuthnot’s 1710 analysis of birth records, which used probability arguments to assess whether observed patterns could be due to chance.
Benefits
- Provides a formal, repeatable framework for evaluating claims using data.
- Reduces reliance on intuition or bias by grounding decisions in statistical evidence.
- Helps quantify uncertainty and supports decision-making in science, business, and policy.
Limitations and common pitfalls
- Results depend on data quality, sample size, and the appropriateness of the chosen test.
- Misinterpreting p-values (for example, reading a p-value as the probability that H0 is true) and relying too heavily on arbitrary significance thresholds can lead to misleading conclusions.
- Hypothesis testing can produce errors:
  - Type I error: rejecting a true null hypothesis (false positive).
  - Type II error: failing to reject a false null hypothesis (false negative).
- Tests don’t prove hypotheses true; they only evaluate consistency between the data and H0.
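The two error types listed above can be made concrete with a small simulation. This is a sketch under stated assumptions: it reuses the 100-flip coin test at α = 0.05, and the biased-coin probability of 0.6, the number of simulated experiments, the random seeds, and the helper names are all choices made for this illustration.

```python
import random
from math import comb

FLIPS = 100
ALPHA = 0.05


def fair_coin_p_value(heads, flips):
    """Exact two-sided p-value under H0: P(heads) = 0.5."""
    expected = flips / 2
    deviation = abs(heads - expected)
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= deviation
    )
    return extreme / 2 ** flips


# Precompute the decision for every possible head count (0..FLIPS).
REJECT = [fair_coin_p_value(k, FLIPS) < ALPHA for k in range(FLIPS + 1)]


def rejection_rate(true_p, experiments=2000, seed=0):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    rejections = 0
    for _ in range(experiments):
        heads = sum(rng.random() < true_p for _ in range(FLIPS))
        rejections += REJECT[heads]
    return rejections / experiments


# H0 is true (fair coin): every rejection is a Type I error.
type_i_rate = rejection_rate(true_p=0.5)

# H0 is false (hypothetical coin with P(heads) = 0.6):
# every failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_p=0.6, seed=1)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

The Type I rate should land near or below α, since α caps how often a true H0 is rejected. The Type II rate depends on how far the truth is from H0 and on the sample size, so a modest bias with 100 flips can still go undetected fairly often.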
Conclusion
Hypothesis testing is a foundational statistical tool for assessing whether observed data support or contradict a specific assumption about a population. By following a clear four-step process and understanding the limitations and possible errors, researchers can draw informed, data-driven conclusions.