Three Ways to Find the "Center" of Data
When someone says "average," they usually mean the arithmetic mean. But in statistics, there are three main measures of central tendency, and choosing the wrong one can give a misleading picture of your data.
- Mean: the arithmetic average (sum all values and divide by the count)
- Median: the middle value when the data is sorted
- Mode: the most frequently occurring value
Each captures a different aspect of "typical," and each has specific situations where it shines.
The Mean: What Most People Call "Average"
Definition
The mean is calculated by summing all data points and dividing by the number of points:
Mean = (sum of all values) ÷ (number of values)
Example
Dataset: 12, 15, 18, 22, 33
Mean = (12 + 15 + 18 + 22 + 33) ÷ 5 = 100 ÷ 5 = 20
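The definition above can be sketched in a few lines of Python. The function name `arithmetic_mean` is an illustrative choice, not something from a particular library; the standard library's `statistics.mean` does the same job.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [12, 15, 18, 22, 33]
result = arithmetic_mean(data)
print(result)  # 20.0
```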
When to Use the Mean
- When data is symmetric and has no extreme outliers
- When you need a value that accounts for every data point
- In scientific calculations, financial analysis (e.g., average monthly revenue), and statistical modeling
- When you'll use the result in further calculations (the mean has convenient mathematical properties)
When NOT to Use the Mean
The mean is highly sensitive to outliers and skewed distributions. Consider this salary dataset for a small company:
Salaries: $40K, $45K, $48K, $50K, $52K, $55K, $350K (the CEO)
Mean = $640K ÷ 7 ≈ $91,429
Does $91,429 represent a "typical" salary at this company? Absolutely not: 6 out of 7 employees earn less than $56K. The CEO's salary pulled the mean far from the center of the data.
The Median: The True Middle
Definition
The median is the middle value in a sorted dataset. If there's an even number of values, it's the average of the two middle values.
How to Find It
- Sort the data from smallest to largest
- If the count (n) is odd: the median is the value at position (n + 1) ÷ 2
- If n is even: the median is the average of the values at positions n ÷ 2 and (n ÷ 2) + 1
Example (Odd Count)
Dataset: 33, 12, 22, 18, 15
Sorted: 12, 15, 18, 22, 33
Position: (5 + 1) ÷ 2 = 3rd value → Median = 18
Example (Even Count)
Dataset: 33, 12, 22, 18, 15, 27
Sorted: 12, 15, 18, 22, 27, 33
Middle positions: 3rd and 4th values → (18 + 22) ÷ 2 = 20, so Median = 20
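The position rule from "How to Find It" translates fairly directly into code. This is a minimal sketch; `median_by_position` is a hypothetical helper name, and the only adjustment is converting the 1-based positions in the text to Python's 0-based list indexes.

```python
def median_by_position(values):
    """Find the median using the odd/even position rule."""
    ordered = sorted(values)  # step 1: sort from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    if n % 2 == 1:
        # Odd count: 1-based position (n + 1) / 2 is index (n + 1) // 2 - 1.
        return ordered[(n + 1) // 2 - 1]
    # Even count: average the values at 1-based positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


odd_median = median_by_position([33, 12, 22, 18, 15])
even_median = median_by_position([33, 12, 22, 18, 15, 27])
print(odd_median)   # 18
print(even_median)  # 20.0
```

In everyday use, `statistics.median` from the standard library applies the same rule.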
The Salary Example Revisited
Sorted salaries: $40K, $45K, $48K, $50K, $52K, $55K, $350K
Median = $50,000
This is a much more accurate picture of the "typical" employee salary. The CEO's $350K doesn't distort the result at all.
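To see the gap between the two measures on the salary data, here is a short sketch using Python's standard `statistics` module. The variable names are illustrative, and the salaries are written out in full dollars rather than thousands.

```python
import statistics

salaries = [40_000, 45_000, 48_000, 50_000, 52_000, 55_000, 350_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)

# Count how many employees earn less than the mean.
below_mean = sum(1 for s in salaries if s < mean_salary)

print(round(mean_salary))  # 91429
print(median_salary)       # 50000
print(below_mean)          # 6
```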
When to Use the Median
- When data is skewed (long tail in one direction)
- When data contains outliers you don't want to dominate the result
- For income, housing prices, and wealth data, all of which are right-skewed
- When you want the value that separates the top 50% from the bottom 50%
Calculate all three: Use the Mean, Median, Mode Calculator to instantly find the mean, median, and mode of any dataset.
The Mode: The Most Popular Value
Definition
The mode is the value that occurs most frequently in a dataset. A dataset can have:
- One mode (unimodal): one value appears more than any other
- Multiple modes (bimodal, multimodal): two or more values tie for highest frequency
- No mode: all values appear equally often
Example
Dataset: 4, 7, 7, 9, 12, 7, 15, 9
Value 7 appears 3 times (most frequent) → Mode = 7
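One way to compute the mode, including the "multiple modes" and "no mode" cases listed above, is to count frequencies with `collections.Counter`. The helper name `modes` and the choice to return an empty list when every value is equally frequent are assumptions made for this sketch. The standard library's `statistics.multimode` follows a different convention and returns every value when all of them tie.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent values (empty if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Convention assumed here: if several distinct values all appear
    # equally often, the dataset has no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == top]


print(modes([4, 7, 7, 9, 12, 7, 15, 9]))  # [7]
print(modes([1, 1, 2, 2, 3]))             # [1, 2]
print(modes([10, 12, 14]))                # []
print(modes(["red", "blue", "red"]))      # ['red']
```

Because the function only counts occurrences, it works on non-numeric categories as well as numbers.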
When to Use the Mode
- For categorical data where mean and median don't apply (e.g., for "What's the most popular color?", the mode is the answer)
- When you need the most common value, not the "center"
- In business: most frequently ordered product, most common customer age group, peak traffic hour
- With discrete data that has natural repeats (shoe sizes, number of children)
When NOT to Use the Mode
- With continuous data that has few repeats (e.g., precise measurements such as 12.347, 12.351, 12.349, which likely have no mode)
- When you need a measure of the "center" of metric data, because the mode can fall far from the center
- With small datasets where the mode can change dramatically with one additional data point
Side-by-Side Comparison
| Property | Mean | Median | Mode |
|---|---|---|---|
| Uses all data points | Yes | No | No |
| Affected by outliers | Heavily | Very little | Rarely |
| Works for categories | No | No | Yes |
| Always unique | Yes | Yes (with convention) | No, can have multiple |
| Good for skewed data | No, misleading | Yes | Depends |
How Outliers Affect Each Measure
Let's see the impact with a concrete example:
Original dataset: 10, 12, 14, 15, 16, 18, 20
- Mean = 15.0
- Median = 15
- Mode = None (all unique)
Now add an outlier: 10, 12, 14, 15, 16, 18, 20, 200
- Mean = 38.1 (jumped from 15 to 38!)
- Median = 15.5 (barely changed)
- Mode = None
The mean more than doubled because of a single extreme value. The median moved by only 0.5. This is the fundamental reason why the median is preferred for skewed data.
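The comparison above can be reproduced with the standard `statistics` module. This is only a sketch of the arithmetic, and the variable names are illustrative.

```python
import statistics

original = [10, 12, 14, 15, 16, 18, 20]
with_outlier = original + [200]

mean_before = statistics.mean(original)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(original)
median_after = statistics.median(with_outlier)

# How far each measure moved after adding the single outlier.
mean_shift = mean_after - mean_before
median_shift = median_after - median_before

print(mean_before, mean_after)      # 15 38.125
print(median_before, median_after)  # 15 15.5
print(mean_shift, median_shift)     # 23.125 0.5
```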
Real-World Scenarios
Household Income
The U.S. Census Bureau reports both mean and median household income. In 2023:
- Median household income: ~$80,610
- Mean household income: ~$114,000
The mean is about 40% higher because a small number of very high earners pull it upward. Policy discussions almost always reference the median because it better represents the typical household.
Housing Prices
Realtors quote median home prices, not averages. A few luxury homes can dramatically inflate the mean, making it useless for home buyers trying to understand what they'll actually pay.
Customer Ratings
If a product has ratings of 5, 5, 5, 5, 5, 1, 1, the mean is 3.86 but the mode is 5. The mode tells you that most buyers loved it; the low mean is driven by a couple of very unhappy customers. Both perspectives are useful.
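As a quick check of the ratings example, the following sketch computes both measures with the standard `statistics` module (variable names are illustrative):

```python
import statistics

ratings = [5, 5, 5, 5, 5, 1, 1]

mean_rating = statistics.mean(ratings)  # 27 / 7
mode_rating = statistics.mode(ratings)  # most frequent rating

print(round(mean_rating, 2))  # 3.86
print(mode_rating)            # 5
```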
Test Scores
Teachers often use the mean for grades because it accounts for all performance. But if a distribution is bimodal (a cluster of As and a cluster of Fs with few in between), the mean doesn't describe anyone's actual experience. The mode(s) would reveal the two groups.
How the Shape of Data Affects Your Choice
Symmetric Data (Normal Distribution)
When data is symmetric and bell-shaped, the mean, median, and mode are all approximately equal. Use the mean: it's the most informative and has the best mathematical properties.
Right-Skewed Data (Positive Skew)
A long tail stretches to the right (high values). The mean is pulled right, away from the bulk of the data. The typical ordering is: Mode < Median < Mean.
Examples: income, housing prices, company sizes, website traffic
Best choice: Median
Left-Skewed Data (Negative Skew)
A long tail stretches to the left (low values). The mean is pulled left, away from the bulk of the data. The typical ordering is: Mean < Median < Mode.
Examples: age at retirement, time to complete a task for skilled workers
Best choice: Median
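Since the mean is pulled toward the long tail while the median stays put, comparing the two gives a rough hint about the shape of a dataset. The sketch below is a heuristic under stated assumptions, not a formal skewness test: the function name `describe_shape`, the scaling by standard deviation, and the `tolerance` threshold of 0.05 are all arbitrary choices made for illustration, and the left-skewed sample values are made up.

```python
import statistics


def describe_shape(values, tolerance=0.05):
    """Guess the skew direction from the gap between mean and median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return "roughly symmetric"  # all values are identical
    # Express the gap in units of standard deviation so the
    # threshold does not depend on the scale of the data.
    gap = (mean - median) / spread
    if gap > tolerance:
        return "right-skewed"
    if gap < -tolerance:
        return "left-skewed"
    return "roughly symmetric"


print(describe_shape([40, 45, 48, 50, 52, 55, 350]))  # right-skewed
print(describe_shape([10, 12, 14, 15, 16, 18, 20]))   # roughly symmetric
print(describe_shape([1, 60, 62, 63, 65, 66, 67]))    # left-skewed
```

A mean above the median suggests a right tail, and a mean below it suggests a left tail, but a histogram is still the more reliable check.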
Combining Measures for a Fuller Picture
Smart data analysis doesn't rely on a single number. Report multiple measures to give the complete picture:
- "The median home price in Austin is $425,000 (mean: $510,000)": the gap tells you the market is right-skewed with some expensive homes.
- "The modal shoe size sold is 10, with a mean of 9.8": the mode tells you what to stock most of; the mean confirms the center.
- "Average test score: 78 (median: 82, SD: 12)": the lower mean suggests some very low scores pulling it down. The median shows most students did well.
Key Takeaways
- Mean (average): Best for symmetric data without outliers. Uses all data points but is easily distorted.
- Median (middle value): Best for skewed data and when outliers are present. Robust and easy to interpret.
- Mode (most frequent): Best for categorical data and finding the most common value. The only option for non-numeric data.
- When in doubt, calculate all three: their relative positions reveal the shape and symmetry of your data.
- For income, housing prices, and any right-skewed data, the median is almost always the better "average" to report.