How to interpret quartile results
Descriptive statistics summarize a dataset's center, spread, and shape. They're the first step in understanding any quantitative dataset.
Measures of center
- Mean: sensitive to outliers. One extreme value pulls it dramatically. Best for symmetric distributions.
- Median: resistant to outliers. Better for skewed distributions (income, home prices) or datasets with extreme values.
- Mode: most frequent value. Useful for categorical data and multimodal distributions.
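To make the contrast between the three measures concrete, here is a minimal sketch using Python's standard `statistics` module. The `incomes` list is a made-up sample (in thousands) with one extreme value, chosen only for illustration.

```python
import statistics

# Hypothetical incomes in thousands; 400 is a deliberate extreme value.
incomes = [30, 32, 35, 35, 40, 45, 400]

mean_income = statistics.mean(incomes)      # pulled upward by the 400
median_income = statistics.median(incomes)  # middle value of the sorted data
mode_income = statistics.mode(incomes)      # most frequent value

print(f"mean={mean_income:.1f} median={median_income} mode={mode_income}")
```

In this sample the mean lands near 88 even though six of the seven values are 45 or below, while the median and mode both stay at 35.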
Measures of spread
- Standard deviation: typical distance from the mean, computed as the square root of the average squared deviation. For a normal distribution, ≈68% of values fall within 1 SD and ≈95% within 2 SD.
- IQR (Interquartile Range): Q3 − Q1, the width of the interval containing the middle 50% of the data. Resistant to outliers; it sets the length of the box in a box plot.
- Range: max − min. Extremely sensitive to outliers; rarely the best spread measure.
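The three spread measures can be compared on a single sample, as in the sketch below. It assumes Python 3.8+ for `statistics.quantiles`, and the `values` list is hypothetical. Quartile conventions differ between tools; `method="inclusive"` is one common choice, and other methods can give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical sample with one extreme value.
values = [30, 32, 35, 35, 40, 45, 48, 50, 400]

# Quartiles: n=4 returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

iqr = q3 - q1                             # width of the middle 50%
value_range = max(values) - min(values)   # max - min
sample_sd = statistics.stdev(values)      # sample standard deviation

print(f"Q1={q1} Q3={q3} IQR={iqr}")
print(f"range={value_range} SD={sample_sd:.1f}")
```

Here the single extreme value drives both the range and the standard deviation, while the IQR reflects only the middle of the data.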
Identifying outliers
A common rule: a value is an outlier if it lies more than 1.5 × IQR below Q1 or more than 1.5 × IQR above Q3. For z-score-based detection, values with a z-score beyond ±3 are flagged; under a normal distribution these occur less than 0.3% of the time.
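Both rules can be applied to the same data, as sketched below. The sample is hypothetical, the quartiles use the `inclusive` method of `statistics.quantiles` (an assumption; other conventions shift the fences slightly), and the z-scores use the sample mean and sample standard deviation.

```python
import statistics

# Hypothetical sample with one extreme value.
values = [30, 32, 35, 35, 40, 45, 48, 50, 400]

# Rule 1: fences at 1.5 x IQR below Q1 and above Q3.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
iqr_outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Rule 2: z-scores beyond +/-3, using the sample mean and SD.
mean = statistics.mean(values)
sd = statistics.stdev(values)
z_outliers = [v for v in values if abs((v - mean) / sd) > 3]

print("fences:", lower_fence, upper_fence)
print("IQR rule flags:", iqr_outliers)
print("z-score rule flags:", z_outliers)
```

In this small sample the IQR rule flags 400 but the z-score rule flags nothing, because the extreme value inflates the standard deviation it is measured against. With n values and the sample SD, no z-score can exceed (n − 1)/√n, which is about 2.67 for n = 9, so the ±3 threshold is only meaningful for larger samples.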