Understanding Probability Distributions
Probability distributions describe the shape of random variation — how likely each possible outcome is. Choosing the right distribution for your data is the first step in statistical modeling.
When to use each distribution
- Normal distribution: heights, measurement errors, many natural phenomena. Symmetric, bell-shaped. Fully described by mean and standard deviation.
- Binomial: number of successes in a fixed number of independent trials with constant probability (coin flips, defect rates).
- Poisson: number of events in a fixed interval when events occur independently at a constant average rate (calls per hour, defects per unit length).
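- Negative binomial: number of trials needed to achieve a fixed number of successes. It is the "inverse" of the binomial in the sense that the binomial fixes the number of trials and counts successes, while the negative binomial fixes the number of successes and counts trials. Some texts and software count the failures before the final success instead, so check which definition is in use.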
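As a rough illustration of the four cases above, the following sketch evaluates each distribution using only the Python standard library. The function names (`binomial_pmf`, `poisson_pmf`, `negative_binomial_pmf`) and the example numbers are assumptions made for this sketch, and the negative binomial here counts trials, not failures.

```python
import math
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, rate: float) -> float:
    """P(exactly k events in an interval whose average event count is `rate`)."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs on trial n), counting trials rather than failures."""
    if n < r:
        return 0.0  # r successes cannot happen in fewer than r trials
    return math.comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# Made-up example values, chosen only to exercise each function.
print(binomial_pmf(5, 10, 0.5))           # 5 heads in 10 fair coin flips
print(poisson_pmf(0, 3.0))                # no calls in an hour averaging 3 calls
print(negative_binomial_pmf(5, 3, 0.5))   # third head arrives on the fifth flip

# A normal distribution is fully specified by its mean and standard deviation.
heights = NormalDist(mu=170.0, sigma=10.0)  # hypothetical heights in cm
print(heights.pdf(170.0))                   # density at the mean, not a probability
```

Libraries that count failures instead of trials would need the argument shifted by the number of successes to match `negative_binomial_pmf`.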
Reading the output
- PMF/PDF: the PMF gives the probability of exactly x (discrete); the PDF gives the density at x (continuous), which is not itself a probability and must be integrated over an interval to obtain one.
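- CDF: probability that X is at most x, written F(x) = P(X ≤ x). Subtract CDF values to get the probability of a range: P(a < X ≤ b) = F(b) − F(a). For a discrete distribution the lower endpoint a is excluded, so to include it subtract the CDF at the next value below a.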
- Quantiles: the value below which a given fraction of the distribution falls. The 95th percentile is the value x where P(X ≤ x) = 0.95.
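The three quantities above can be sketched with the standard library's `statistics.NormalDist` for the continuous case, plus a hand-rolled binomial CDF for the discrete case. The helper `binomial_cdf` and the chosen numbers are assumptions for illustration only.

```python
import math
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)  # standard normal

density_at_zero = z.pdf(0.0)          # a density, not a probability
p_below = z.cdf(1.0)                  # P(X <= 1)
p_between = z.cdf(1.0) - z.cdf(-1.0)  # P(-1 < X <= 1), roughly 0.68
q95 = z.inv_cdf(0.95)                 # 95th percentile, roughly 1.645

print(density_at_zero, p_below, p_between, q95)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for a binomial, summing the PMF from 0 through k."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Discrete ranges need care at the lower endpoint:
# P(3 <= X <= 5) = F(5) - F(2), not F(5) - F(3).
p_3_to_5 = binomial_cdf(5, 10, 0.5) - binomial_cdf(2, 10, 0.5)
print(p_3_to_5)
```

Feeding `q95` back into `z.cdf` returns approximately 0.95, which is the sense in which the quantile function inverts the CDF.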