What is a T-Test?
A t-test is a statistical hypothesis test used to determine whether there is a significant difference between the means of groups. Developed by William Sealy Gosset under the pseudonym "Student," the t-test is one of the most widely used statistical procedures for comparing means when sample sizes are small and population variance is unknown.
T-tests are essential tools in scientific research, quality control, medicine, and social sciences. They help researchers determine whether observed differences between groups are statistically significant or likely due to random chance.
Types of T-Tests
One-Sample T-Test
Compares the mean of a single sample to a known or hypothesized population mean. Use this when you want to test whether your sample comes from a population with a specific mean.
One-Sample T-Test Formula
t = (x-bar - mu0) / (s / sqrt(n))
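The formula maps directly to a few lines of code. A minimal Python sketch (the function name `one_sample_t` is illustrative; `s` is the sample standard deviation, computed with n - 1 in the denominator):

```python
import math

def one_sample_t(sample, mu0):
    """t = (x-bar - mu0) / (s / sqrt(n)), with sample std dev (ddof = 1)."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu0) / (s / math.sqrt(n))
```

Calling it with a sample and a hypothesized mean returns the t-statistic; combine it with the t-distribution at df = n - 1 to obtain a p-value.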
Two-Sample T-Test (Independent Samples)
Compares the means of two independent groups to determine if they are significantly different. This calculator uses Welch's t-test, which does not assume equal variances.
Welch's T-Test Formula
t = (x-bar1 - x-bar2) / sqrt(s1^2/n1 + s2^2/n2)
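The same formula, together with the Welch-Satterthwaite degrees of freedom, can be sketched in pure Python (the function name `welch_t` is illustrative):

```python
import math

def welch_t(sample1, sample2):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    # Sample variances with n - 1 in the denominator
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    se2 = v1 / n1 + v2 / n2
    t = (m1 - m2) / math.sqrt(se2)
    # Welch-Satterthwaite approximation; usually a non-integer df
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```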
Paired T-Test
Compares means from the same group at different times or under different conditions. Used for before-after studies or matched pairs. This calculator handles paired data by computing differences and performing a one-sample t-test on those differences.
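Because the paired test reduces to a one-sample test on the differences, it takes only a few lines. A sketch in Python (names are illustrative; `statistics.stdev` uses the n - 1 denominator):

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t-test: one-sample t-test of the pairwise differences vs 0."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))
```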
Hypothesis Testing with T-Tests
Setting Up Hypotheses
Every t-test involves a null hypothesis (H0) and an alternative hypothesis (H1):
| Test Type | Null Hypothesis (H0) | Alternative (H1) |
|---|---|---|
| One-sample | mu = mu0 | mu != mu0 |
| Two-sample | mu1 = mu2 | mu1 != mu2 |
| Paired | mu_d = 0 | mu_d != 0 |
Significance Level (alpha)
The significance level is the probability of rejecting the null hypothesis when it is true (Type I error). Common choices:
- alpha = 0.05: 5% chance of false positive (most common)
- alpha = 0.01: 1% chance (more conservative)
- alpha = 0.10: 10% chance (more lenient)
Decision Rule
Compare the calculated t-statistic to the critical value from the t-distribution table:
- If |t| > t_critical: Reject H0 (significant difference)
- If |t| <= t_critical: Fail to reject H0 (no significant difference)
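The decision rule can be expressed with the t-distribution in SciPy (assuming `scipy` is available; the helper name `decide` is illustrative, and the rule shown is two-sided):

```python
from scipy.stats import t as t_dist

def decide(t_stat, df, alpha=0.05):
    """Two-sided rule: reject H0 when |t| exceeds the critical value."""
    t_crit = t_dist.ppf(1 - alpha / 2, df)
    return "reject H0" if abs(t_stat) > t_crit else "fail to reject H0"
```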
Degrees of Freedom
Degrees of freedom (df) determine which t-distribution to use:
- One-sample: df = n - 1
- Paired: df = n - 1, where n is the number of pairs
- Welch's two-sample: df from the Welch-Satterthwaite equation, typically a non-integer value
Assumptions of T-Tests
Normality
The data should come from normally distributed populations. The t-test is robust to moderate departures from normality, especially with larger samples (n > 30). Use histograms or normality tests to check this assumption.
Independence
Observations should be independent of each other. For two-sample tests, the groups must be independent. This assumption is critical: unlike normality, dependence among observations cannot be offset by a larger sample, and violating it invalidates the test.
Equal Variances (Student's t-test)
The classic two-sample t-test assumes equal population variances. Welch's t-test relaxes this assumption, making it more versatile; it is recommended unless variances are known to be equal.
Practical Examples
Example 1: Drug Effectiveness
Treatment group: 85, 90, 88, 92, 87, 91
Control group: 78, 82, 80, 79, 81, 83
Question: Is the treatment significantly better than control?
Result: t = 6.31, df = 9.0, p < 0.001
Conclusion: Significant difference; treatment appears effective.
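Readers can reproduce this result with SciPy: `ttest_ind` with `equal_var=False` runs Welch's test on the data above.

```python
from scipy import stats

treatment = [85, 90, 88, 92, 87, 91]
control = [78, 82, 80, 79, 81, 83]

# Welch's t-test: equal_var=False, matching the calculator described above
result = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```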
Example 2: Quality Control
Measured weights: 500.2, 499.8, 500.5, 499.5, 500.1
Target weight: 500g
Question: Is the machine calibrated correctly?
Result: t = 0.12, df = 4, p > 0.05
Conclusion: No significant difference from target.
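The one-sample version can likewise be checked with SciPy's `ttest_1samp`:

```python
from scipy import stats

weights = [500.2, 499.8, 500.5, 499.5, 500.1]

# One-sample t-test against the 500 g target
result = stats.ttest_1samp(weights, popmean=500)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.2f}")
```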
Interpreting T-Test Results
Statistical vs. Practical Significance
A statistically significant result doesn't always mean a practically important difference. With large samples, even tiny differences can be significant. Always consider effect size alongside p-values.
Effect Size (Cohen's d)
Cohen's d measures the standardized difference between means:
- d = 0.2: Small effect
- d = 0.5: Medium effect
- d = 0.8: Large effect
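One common way to compute it uses a pooled standard deviation, sketched here in Python (the function name is illustrative, and other variants of d exist):

```python
import math

def cohens_d(sample1, sample2):
    """Cohen's d with pooled standard deviation (assumes similar variances)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled
```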
Common Mistakes to Avoid
Using the Wrong Test
Make sure you choose the appropriate test for your data structure. Use a paired t-test for matched pairs, not an independent-samples t-test.
Ignoring Assumptions
Check normality and variance assumptions before running the test. Consider non-parametric alternatives (Mann-Whitney U, Wilcoxon) if assumptions are severely violated.
Multiple Testing
Running multiple t-tests inflates the Type I error rate. Use ANOVA for comparing more than two groups, or apply corrections like Bonferroni.
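The inflation is easy to quantify: with k independent tests each run at significance level alpha, the chance of at least one false positive is 1 - (1 - alpha)^k. A quick Python illustration:

```python
alpha = 0.05
k = 10  # number of t-tests run on the same data

# Probability of at least one false positive across k independent tests
fwer = 1 - (1 - alpha) ** k

# Bonferroni correction: test each comparison at alpha / k instead
bonferroni_alpha = alpha / k

print(f"family-wise error rate: {fwer:.3f}, per-test alpha: {bonferroni_alpha}")
```

At k = 10 the family-wise error rate is roughly 40%, which is why a correction (or ANOVA) is needed.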
T-Test vs. Z-Test
| Feature | T-Test | Z-Test |
|---|---|---|
| Sample Size | Any size | Large (n > 30) |
| Population sigma | Unknown | Known |
| Distribution | t-distribution | Normal |
Frequently Asked Questions
What does the t-statistic mean?
The t-statistic measures how many standard errors the sample mean is from the hypothesized mean. Larger absolute values indicate stronger evidence against the null hypothesis.
Can I use a t-test with non-normal data?
The t-test is robust to moderate non-normality, especially with larger samples. For severely skewed data or small samples, consider transformation or non-parametric alternatives.
What if my groups have different sizes?
Unequal sample sizes are fine. Welch's t-test handles unequal sizes and variances appropriately.
Why use Welch's instead of Student's t-test?
Welch's t-test doesn't assume equal variances, making it more robust and applicable in most situations. It provides accurate results whether variances are equal or not.