Empirical Relationship Between the Mean, Median and Mode
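Within sets of data, there are a variety of descriptive statistics. The mean, median and mode all give measures of the center of the data, but they calculate this in different ways: the mean is the sum of the data values divided by the number of values, the median is the middle value when the data are listed in order, and the mode is the value that occurs most often.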
On the surface, it would appear that there is no connection between these three numbers. However, it turns out that there is an empirical relationship between these measures of center.
Theoretical vs. Empirical
Before we go on, it is important to understand what we are talking about when we refer to an empirical relationship and contrast this with theoretical studies. Some results in statistics and other fields of knowledge can be derived from some previous statements in a theoretical manner. We begin with what we know, and then use logic, mathematics, and deductive reasoning and see where this leads us. The result is a direct consequence of other known facts.
Contrasting with the theoretical is the empirical way of acquiring knowledge. Rather than reasoning from already established principles, we can observe the world around us.
From these observations, we can then formulate an explanation of what we have seen. Much of science is done in this manner. Experiments give us empirical data. The goal then becomes to formulate an explanation that fits all of the data.
Empirical Relationship
In statistics, there is a relationship between the mean, median and mode that is empirically based.
Observations of countless data sets have shown that for many of them, particularly data with a single peak and only moderate skew, the difference between the mean and the mode is roughly three times the difference between the mean and the median. This relationship in equation form is:
Mean – Mode = 3(Mean – Median).
Example
To see the above relationship with real world data, let’s take a look at the U.S. state populations in 2010. In millions, the populations were: California - 36.4, Texas - 23.5, New York - 19.3, Florida - 18.1, Illinois - 12.8, Pennsylvania - 12.4, Ohio - 11.5, Michigan - 10.1, Georgia - 9.4, North Carolina - 8.9, New Jersey - 8.7, Virginia - 7.6, Massachusetts - 6.4, Washington - 6.4, Indiana - 6.3, Arizona - 6.2, Tennessee - 6.0, Missouri - 5.8, Maryland - 5.6, Wisconsin - 5.6, Minnesota - 5.2, Colorado - 4.8, Alabama - 4.6, South Carolina - 4.3, Louisiana - 4.3, Kentucky - 4.2, Oregon - 3.7, Oklahoma - 3.6, Connecticut - 3.5, Iowa - 3.0, Mississippi - 2.9, Arkansas - 2.8, Kansas - 2.8, Utah - 2.6, Nevada - 2.5, New Mexico - 2.0, West Virginia - 1.8, Nebraska - 1.8, Idaho - 1.5, Maine - 1.3, New Hampshire - 1.3, Hawaii - 1.3, Rhode Island - 1.1, Montana - 0.9, Delaware - 0.9, South Dakota - 0.8, Alaska - 0.7, North Dakota - 0.6, Vermont - 0.6, Wyoming - 0.5
The mean population is 6.0 million. The median population is 4.25 million. The mode is 1.3 million. Now we will calculate the differences from the above:
Mean – Mode = 6.0 million – 1.3 million = 4.7 million.
3(Mean – Median) = 3(6.0 million – 4.25 million) = 3(1.75 million) = 5.25 million.
While these two numbers do not match exactly, they are relatively close to one another.
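The comparison for the state population data can be reproduced with a short Python sketch. It assumes the populations are typed in exactly as listed above and uses the standard library's statistics module; the variable names are illustrative only. Because the script keeps the unrounded mean (about 5.98 rather than 6.0), the two sides come out as roughly 4.68 and 5.18, slightly different from the hand calculation with rounded figures.

```python
import statistics

# State populations in millions, copied from the list in the example.
populations = [
    36.4, 23.5, 19.3, 18.1, 12.8, 12.4, 11.5, 10.1, 9.4, 8.9,
    8.7, 7.6, 6.4, 6.4, 6.3, 6.2, 6.0, 5.8, 5.6, 5.6,
    5.2, 4.8, 4.6, 4.3, 4.3, 4.2, 3.7, 3.6, 3.5, 3.0,
    2.9, 2.8, 2.8, 2.6, 2.5, 2.0, 1.8, 1.8, 1.5, 1.3,
    1.3, 1.3, 1.1, 0.9, 0.9, 0.8, 0.7, 0.6, 0.6, 0.5,
]

mean = statistics.mean(populations)
median = statistics.median(populations)
mode = statistics.mode(populations)  # most frequent value in the list

# The two sides of Mean - Mode = 3(Mean - Median).
left_side = mean - mode
right_side = 3 * (mean - median)

print(f"mean = {mean:.2f}, median = {median:.2f}, mode = {mode:.2f}")
print(f"Mean - Mode      = {left_side:.2f}")
print(f"3(Mean - Median) = {right_side:.2f}")
```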
Application
There are a couple of applications for the above formula. Suppose that we do not have a list of data values, but do know any two of the mean, median or mode. The above formula could be used to estimate the third unknown quantity.
For instance, if we know that we have a mean of 10 and a mode of 4, what is the median of our data set? Since Mean – Mode = 3(Mean – Median), we can say that 10 – 4 = 3(10 – Median). This simplifies to 6 = 30 – 3(Median), so 3(Median) = 24 and the estimated median is 8.
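Rearranging Mean – Mode = 3(Mean – Median) for each quantity in turn gives Median = (2 Mean + Mode)/3, Mode = 3 Median – 2 Mean, and Mean = (3 Median – Mode)/2. A minimal Python sketch of this estimate follows; the function name estimate_missing is hypothetical, and the result is only as good as the empirical relationship itself.

```python
def estimate_missing(mean=None, median=None, mode=None):
    """Estimate whichever of mean, median or mode was left as None.

    Uses the empirical relationship Mean - Mode = 3(Mean - Median),
    so the result is an approximation, not an exact value.
    """
    given = (mean, median, mode)
    if sum(value is None for value in given) != 1:
        raise ValueError("Provide exactly two of mean, median and mode.")

    if median is None:
        return (2 * mean + mode) / 3
    if mode is None:
        return 3 * median - 2 * mean
    return (3 * median - mode) / 2


# The case from the text: a mean of 10 and a mode of 4.
print(estimate_missing(mean=10, mode=4))  # 8.0
```

Passing the estimate back in with either of the original values recovers the third, which is a quick check that the three rearrangements agree with one another.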