Key Takeaways
- A confidence interval gives a range likely to contain the true population parameter
- 95% confidence means that if you repeated the sampling 100 times, you would expect about 95 of the resulting intervals to contain the true mean
- Larger sample sizes produce narrower (more precise) confidence intervals
- Higher confidence levels produce wider intervals (trading precision for certainty)
- The margin of error equals z-score times standard error
What Is a Confidence Interval?
A confidence interval (CI) is a range of values that is likely to contain an unknown population parameter. Instead of estimating the parameter with a single value, you report an interval together with a confidence level, which describes how often intervals constructed this way capture the parameter.
For example, if you calculate a 95% confidence interval for the mean height of adult males and get [5'8", 5'11"], you can say: "We are 95% confident that the true average height of adult males in the population falls between 5'8" and 5'11"."
The Confidence Interval Formula
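CI = x̄ ± z * (s / sqrt(n))
where x̄ is the sample mean, z is the z-score for the chosen confidence level, s is the sample standard deviation, and n is the sample size. The term s / sqrt(n) is the standard error, and z * (s / sqrt(n)) is the margin of error.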
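The formula can be sketched in a few lines of Python using only the standard library. The function name `mean_confidence_interval` and the height data are made up for illustration, and the z-based interval assumes a sample large enough for the normal approximation to be reasonable.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for the mean using CI = x_bar +/- z * (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    # Two-sided critical value: 0.95 confidence maps to the 0.975 normal quantile.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / math.sqrt(n)  # margin of error = z * standard error
    return x_bar - margin, x_bar + margin

# Hypothetical heights in inches, for illustration only.
heights_in = [68.5, 70.1, 69.3, 71.0, 67.8, 70.4, 69.9, 68.7, 70.8, 69.5]
low, high = mean_confidence_interval(heights_in)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

The interval is always centered on the sample mean; only the margin of error changes with the confidence level, the spread of the data, and the sample size.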
Z-Score Reference Table
| Confidence Level | Z-Score | Alpha (α) |
|---|---|---|
| 80% | 1.282 | 0.20 |
| 90% | 1.645 | 0.10 |
| 95% | 1.960 | 0.05 |
| 99% | 2.576 | 0.01 |
| 99.9% | 3.291 | 0.001 |
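The table values do not need to be memorized: each z-score is the standard normal quantile at 1 - α/2. A minimal sketch using Python's standard library (the helper name `z_critical` is an illustrative choice):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided z-score for a given confidence level (e.g. 0.95)."""
    alpha = 1 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}  alpha = {1 - level:.3f}")
```

Rounded to three decimals, the printed z-scores match the table rows.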
How to Interpret Confidence Intervals
A common misconception is that a 95% confidence interval means there's a 95% probability that the true parameter is within that specific interval. The correct interpretation is:
- If we repeated our sampling procedure many times and calculated a 95% CI each time, approximately 95% of those intervals would contain the true population parameter.
- Any single interval either contains the true value or it doesn't - we just don't know which case applies.
- The 95% refers to the reliability of the method, not the probability for any single interval.
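The repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a normally distributed population with a known true mean (the values 100 and 15 are arbitrary), draws many samples, builds a 95% CI from each, and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage_rate(true_mean=100.0, true_sd=15.0, n=100,
                  trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        # Each individual interval either contains the true mean or it does not.
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

print(f"Share of intervals containing the true mean: {coverage_rate():.3f}")
```

The share should land close to 0.95. It is a property of the procedure across many samples, which is exactly what the confidence level describes.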
Factors That Affect Confidence Interval Width
1. Sample Size (n)
Larger sample = Narrower interval. Doubling your sample size shrinks the margin of error by about 29% (it is divided by sqrt(2)), and cutting the margin of error in half takes four times the sample. This is why larger studies provide more precise estimates.
2. Confidence Level
Higher confidence = Wider interval. A 99% CI is wider than a 95% CI because we need a larger range to be more certain we've captured the true value.
3. Standard Deviation
Higher variability = Wider interval. When data is more spread out, there's more uncertainty about where the true mean lies.
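All three factors can be seen by changing one input at a time in the margin of error, z * (s / sqrt(n)). The baseline values below (s = 10, n = 100) are arbitrary and chosen only for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(s, n, confidence=0.95):
    """Margin of error z * (s / sqrt(n)) for a two-sided interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / math.sqrt(n)

base = margin_of_error(s=10, n=100)
print(f"baseline (s=10, n=100, 95%): {base:.3f}")
print(f"double the sample size:      {margin_of_error(s=10, n=200):.3f}")
print(f"raise confidence to 99%:     {margin_of_error(s=10, n=100, confidence=0.99):.3f}")
print(f"double the std deviation:    {margin_of_error(s=20, n=100):.3f}")
```

Doubling n divides the margin by sqrt(2), raising the confidence level widens it, and doubling s doubles it.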
Frequently Asked Questions
When should I use the t-distribution instead of the z-distribution?
Use the t-distribution when the population standard deviation is unknown and you estimate it with the sample standard deviation. The difference matters most when the sample is small (typically n < 30). For large samples (n > 30), the t and z critical values are nearly identical, so either works.
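To see how quickly the two distributions converge, the sketch below compares t and z critical values for a 95% interval. Python's standard library has no t-distribution quantile, so this example integrates the t density numerically and inverts it by bisection; the helpers `t_pdf`, `t_cdf`, and `t_critical` are illustrative, not a library API. In practice a statistics library such as SciPy (`scipy.stats.t.ppf`) would normally be used instead.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """Two-sided t critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # grow the bracket until it contains the target
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for n in (5, 10, 30, 100, 1000):
    print(f"n = {n:4d}  t = {t_critical(0.95, n - 1):.3f}  z = {z:.3f}")
```

The t value is always larger than z, which widens the interval to account for the extra uncertainty in estimating the standard deviation, and the gap shrinks as n grows.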
How do I calculate the sample size I need?
Rearranging the margin-of-error formula gives n = (z * s / E)^2, where E is your desired margin of error. For example, for a 95% CI with standard deviation 10 and margin of error 2: n = (1.96 * 10 / 2)^2 = 96.04. Always round up, so you need at least 97 samples.
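The same calculation as a small helper, with rounding up built in. The function name `required_sample_size` is an illustrative choice, and the result is only as good as the guess for the standard deviation s.

```python
import math
from statistics import NormalDist

def required_sample_size(s, margin, confidence=0.95):
    """Smallest n with z * s / sqrt(n) <= margin, i.e. n = ceil((z * s / E)^2)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / margin) ** 2)

print(required_sample_size(s=10, margin=2))  # 97, matching the worked example
print(required_sample_size(s=10, margin=1))  # halving E roughly quadruples n
```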
Can a confidence interval include negative values?
Yes, confidence intervals can include negative values if the mean is close to zero relative to the margin of error. This often happens with differences in means or when measuring changes that could go in either direction.
What is the difference between a confidence interval and a prediction interval?
A confidence interval estimates where the population mean lies. A prediction interval estimates where a single new observation will fall. For the same data and confidence level, prediction intervals are always wider because they account for both uncertainty about the mean AND the natural variability of individual observations.
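The difference shows up directly in the margins. The sketch below assumes roughly normal data and uses the normal approximation for both intervals: the CI margin is z * s / sqrt(n), while the prediction margin is z * s * sqrt(1 + 1/n). Textbook prediction intervals usually use a t critical value instead of z. The function name and the data are made up for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def ci_and_prediction_interval(sample, confidence=0.95):
    """Return ((ci_low, ci_high), (pi_low, pi_high)) under a normal approximation."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci_margin = z * s / math.sqrt(n)          # uncertainty about the mean only
    pi_margin = z * s * math.sqrt(1 + 1 / n)  # adds variability of one new observation
    return ((x_bar - ci_margin, x_bar + ci_margin),
            (x_bar - pi_margin, x_bar + pi_margin))

# Hypothetical measurements, for illustration only.
data = [12.1, 9.8, 11.4, 10.9, 10.2, 11.7, 9.5, 10.6, 11.0, 10.3]
ci, pi = ci_and_prediction_interval(data)
print(f"confidence interval: [{ci[0]:.2f}, {ci[1]:.2f}]")
print(f"prediction interval: [{pi[0]:.2f}, {pi[1]:.2f}]")
```

As n grows, the confidence interval keeps shrinking toward zero width, while the prediction interval levels off near x̄ ± z * s.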
Why is 95% the most common confidence level?
95% is a convention that provides a good balance between precision and reliability. It corresponds to the widely used p < 0.05 significance level in hypothesis testing. However, the "right" confidence level depends on your context: some medical studies use 99%, while preliminary research might use 90%.