P-Value Calculator

Calculate p-values for hypothesis testing from z-scores, t-scores, chi-square, or F-statistics. Determine statistical significance instantly.

Quick Reference

  • Common alpha levels: 0.05, 0.01, and 0.10 (95%, 99%, and 90% confidence)
  • Critical z-values: 1.96 and 2.58 (two-tailed at alpha = 0.05 and 0.01)
  • Decision rule: if p < alpha, reject H0 (statistically significant result)
  • Two-tailed vs one-tailed: p(two) = 2 x p(one) for symmetric distributions


Key Takeaways

  • A p-value measures the probability of obtaining results at least as extreme as observed, assuming the null hypothesis is true
  • If p < alpha, reject the null hypothesis (result is statistically significant)
  • Common significance level: alpha = 0.05 (5% chance of Type I error)
  • A small p-value does NOT mean the effect is large or practically important
  • P-values should be reported alongside effect sizes and confidence intervals

What Is a P-Value? A Complete Explanation

A p-value (probability value) is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. In simpler terms, it tells you how likely your data would occur by random chance if there were no real effect.

The p-value is a cornerstone of frequentist statistical inference and is used across virtually all scientific disciplines to make decisions about whether observed effects are "real" or could simply be due to random sampling variation.

p-value = P(data at least as extreme | H0 is true)
P = Probability
data at least as extreme = results as extreme as, or more extreme than, those observed
H0 = Null hypothesis
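
As a concrete illustration (not part of the calculator itself), here is a minimal Python sketch using scipy.stats that turns a z-score into a two-tailed p-value; the z = 1.96 input is illustrative:

```python
from scipy.stats import norm

z = 1.96  # illustrative observed test statistic

# Two-tailed p-value: probability of a result at least this extreme
# in either direction, assuming the null hypothesis is true.
p_two_tailed = 2 * norm.sf(abs(z))  # sf(z) = 1 - cdf(z), the upper tail
print(f"p = {p_two_tailed:.4f}")    # ~0.0500
```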

How to Interpret P-Values

Step-by-Step Interpretation

Step 1: Set Your Significance Level (Alpha)

Before collecting data, choose your alpha level (commonly 0.05). This is your threshold for statistical significance: the largest p-value at which you will reject the null hypothesis.

Step 2: Calculate Your Test Statistic

Compute your z-score, t-score, chi-square, or F-statistic from your sample data using the appropriate formula for your test.

Step 3: Find the P-Value

Use our calculator or statistical tables to find the probability associated with your test statistic. This gives you the p-value.

Step 4: Make Your Decision

Compare the p-value to alpha: if p < alpha, reject H0 (statistically significant); if p ≥ alpha, fail to reject H0 (not statistically significant). A sketch of the full workflow follows.
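
To make the four steps concrete, here is a minimal sketch assuming a one-sample z-test with a known population standard deviation; all numbers are illustrative:

```python
import math
from scipy.stats import norm

# Step 1: choose alpha before seeing the data
alpha = 0.05

# Step 2: compute the test statistic (illustrative one-sample z-test)
sample_mean, pop_mean, pop_sd, n = 103.0, 100.0, 15.0, 100
z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))  # z = 2.0

# Step 3: convert the statistic to a two-tailed p-value
p = 2 * norm.sf(abs(z))  # ~0.0455

# Step 4: compare the p-value to alpha and decide
if p < alpha:
    print(f"z = {z:.2f}, p = {p:.4f}: reject H0 (statistically significant)")
else:
    print(f"z = {z:.2f}, p = {p:.4f}: fail to reject H0")
```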

Common Significance Levels Explained

Alpha Level | Confidence Level | Z-Critical (Two-Tailed) | Common Use
0.10        | 90%              | 1.645                   | Exploratory research
0.05        | 95%              | 1.960                   | Standard scientific research
0.01        | 99%              | 2.576                   | Medical/clinical studies
0.001       | 99.9%            | 3.291                   | Particle physics, genetics
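
The z-critical column can be reproduced from the standard normal quantile function; a minimal sketch:

```python
from scipy.stats import norm

# A two-tailed critical z leaves alpha/2 of probability in each tail
for alpha in (0.10, 0.05, 0.01, 0.001):
    z_crit = norm.ppf(1 - alpha / 2)
    print(f"alpha = {alpha:<5} -> z-critical = {z_crit:.3f}")
```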

Understanding Different Statistical Tests

Z-Test

Use when you know the population standard deviation and have a large sample size (n > 30). The z-test compares sample means to a known population mean.

T-Test

Use when the population standard deviation is unknown (estimated from sample). Common for comparing two groups or testing if a sample differs from a hypothesized mean. Requires degrees of freedom (df = n - 1 for one sample).
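
A minimal sketch of converting a t-statistic to a two-tailed p-value for a one-sample design (the statistic and sample size are illustrative):

```python
from scipy.stats import t

t_stat, n = 2.10, 25           # illustrative t-statistic and sample size
df = n - 1                     # one-sample t-test: df = n - 1
p = 2 * t.sf(abs(t_stat), df)  # two-tailed p-value from Student's t
print(f"df = {df}, p = {p:.4f}")
```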

Chi-Square Test

Use for categorical data to test independence or goodness-of-fit. Degrees of freedom depend on the number of categories: df = (rows - 1)(columns - 1) for independence tests.
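
A minimal sketch for a chi-square test of independence, assuming a 2 x 3 contingency table (the statistic is illustrative):

```python
from scipy.stats import chi2

chi2_stat = 7.81              # illustrative chi-square statistic
rows, cols = 2, 3             # dimensions of the contingency table
df = (rows - 1) * (cols - 1)  # df = 2 for a 2 x 3 table
p = chi2.sf(chi2_stat, df)    # upper tail only: chi-square tests are one-sided
print(f"df = {df}, p = {p:.4f}")  # ~0.020
```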

F-Test

Use to compare variances between groups (ANOVA) or test regression models. Requires two degrees of freedom values: df1 (numerator) and df2 (denominator).
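
A minimal sketch for an ANOVA-style F-test, assuming k = 3 groups and N = 30 total observations (the statistic is illustrative):

```python
from scipy.stats import f

f_stat = 3.49               # illustrative F-statistic from an ANOVA
k, N = 3, 30                # number of groups and total sample size
df1, df2 = k - 1, N - k     # numerator and denominator degrees of freedom
p = f.sf(f_stat, df1, df2)  # F-tests use the upper tail
print(f"df1 = {df1}, df2 = {df2}, p = {p:.4f}")
```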

Pro Tip: One-Tailed vs Two-Tailed Tests

Use a two-tailed test when you want to detect a difference in either direction (most common). Use a one-tailed test only when you have a specific directional hypothesis established before data collection. One-tailed tests have more power but can miss effects in the opposite direction.
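
The doubling relationship from the Quick Reference is easy to verify; a minimal sketch for a symmetric (normal) distribution:

```python
from scipy.stats import norm

z = 1.96
p_one = norm.sf(z)  # one-tailed: extreme results in one direction only
p_two = 2 * p_one   # two-tailed: extreme results in either direction
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```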

Common P-Value Mistakes to Avoid

Misinterpretations to Avoid

Wrong: "P = 0.03 means there's a 3% chance the null hypothesis is true."

Right: "P = 0.03 means there's a 3% chance of seeing results this extreme if the null hypothesis were true."

  • Statistical significance is not practical significance - A tiny effect can be statistically significant with a large enough sample
  • p = 0.05 is not a magic threshold - p = 0.049 and p = 0.051 are virtually identical
  • Failing to reject H0 is not "accepting" it - Absence of evidence is not evidence of absence
  • Don't p-hack - Running multiple tests until you find significance inflates false positive rates (see the simulation sketch after this list)
  • Report exact p-values - Say "p = 0.023" not just "p < 0.05"
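
The p-hacking point can be demonstrated by simulation. A minimal sketch: when the null hypothesis is true, every p-value is uniformly distributed, so running many tests makes at least one "significant" result very likely:

```python
import numpy as np

rng = np.random.default_rng(0)
n_studies, n_tests, alpha = 10_000, 20, 0.05

# Under a true H0, p-values are uniform on [0, 1]. Simulate 20
# independent tests per study and count studies where at least
# one test comes out "significant" by chance alone.
p_values = rng.uniform(size=(n_studies, n_tests))
false_positive_rate = (p_values < alpha).any(axis=1).mean()
print(f"P(at least one p < {alpha} across {n_tests} null tests) ~ "
      f"{false_positive_rate:.2f}")  # ~0.64, versus 0.05 for a single test
```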

P-Values and Effect Size

P-values tell you whether an effect is statistically detectable, but not how large it is. Always report effect sizes alongside p-values (a sketch of Cohen's d follows the list below):

  • Cohen's d - For comparing two means (small: 0.2, medium: 0.5, large: 0.8)
  • Pearson's r - For correlations (small: 0.1, medium: 0.3, large: 0.5)
  • Odds Ratio - For binary outcomes in logistic regression
  • Eta-squared - For ANOVA (proportion of variance explained)
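
As one illustration, here is a minimal sketch of Cohen's d for two independent groups using the pooled standard deviation (the helper function and its inputs are ours, for illustration only):

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Illustrative values: a 7-point difference against SDs near 15
print(f"d = {cohens_d(105, 15, 50, 98, 14, 50):.2f}")  # ~0.48, a medium effect
```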

Frequently Asked Questions

What does a p-value of 0.05 mean?

A p-value of 0.05 means that if the null hypothesis were true, there would be a 5% probability of obtaining results at least as extreme as what you observed. It does NOT mean there's a 5% chance the null hypothesis is true, or that there's a 95% chance your alternative hypothesis is true.

When should I use a one-tailed vs a two-tailed test?

Use a two-tailed test when you want to detect a difference in either direction (most research scenarios). Use a one-tailed test only when: (1) you have a specific directional hypothesis before seeing data, (2) effects in the opposite direction are theoretically impossible or irrelevant, and (3) you're willing to ignore potentially important opposite effects.

What is a Type I error, and how does alpha control it?

A Type I error (false positive) occurs when you reject a true null hypothesis. The significance level (alpha) is the maximum probability of making a Type I error that you're willing to accept. If alpha = 0.05, you accept up to a 5% chance of falsely claiming an effect exists when it doesn't. P-values help control this error rate.

Can a p-value be greater than 1?

No. A p-value is a probability and must be between 0 and 1. If you get a value greater than 1, there's an error in your calculation. P-values close to 1 indicate your data is very consistent with the null hypothesis.

Why is 0.05 the standard significance level?

The 0.05 threshold is largely a historical convention established by Ronald Fisher in the 1920s. He suggested it as a convenient threshold for "significance." However, there's nothing magical about 0.05 - different fields use different standards (particle physics uses 0.0000003). The appropriate alpha depends on the costs of false positives vs false negatives in your specific context.

How do I find the degrees of freedom for my test?

Degrees of freedom (df) depend on your test:

  • One-sample t-test: df = n - 1
  • Two-sample t-test: df = n1 + n2 - 2 (or use Welch's approximation)
  • Chi-square test of independence: df = (rows - 1) x (columns - 1)
  • ANOVA: df1 = k - 1 and df2 = N - k, where k is the number of groups and N is the total sample size

What is the difference between a p-value and a confidence interval?

A p-value tells you whether an effect is statistically significant, while a confidence interval tells you the range of plausible values for the effect size. They're related: if a 95% CI for a difference doesn't include zero, the p-value will be < 0.05. However, confidence intervals provide more information because they show the magnitude and precision of the estimate.
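
The duality described above can be checked numerically; a minimal sketch assuming a z-based 95% confidence interval for a difference in means (the estimate and standard error are illustrative):

```python
from scipy.stats import norm

diff, se = 2.5, 1.1      # illustrative difference in means and its standard error
z = diff / se            # test statistic for H0: true difference = 0
p = 2 * norm.sf(abs(z))  # two-tailed p-value

# 95% CI: estimate +/- z-critical * SE
z_crit = norm.ppf(0.975)
lo, hi = diff - z_crit * se, diff + z_crit * se

# The CI excludes zero exactly when p < 0.05
print(f"p = {p:.4f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```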