Skewness in statistics is a measure of asymmetry: it describes how far the distribution of a random variable departs from a symmetric distribution (such as the normal distribution).
In a normal distribution, we know that: Median = Mode = Mean
Skewness in statistics can be divided into two categories. They are:
- Positive Skewness
- Negative Skewness
Positive Skewness
In a positively skewed distribution, the extreme data values lie on the high side, which pulls the mean of the data set upward. To put it another way, a positively skewed distribution has its longer tail on the right side.
As a result, for a typical unimodal distribution, Mean > Median > Mode in positive skewness.
Negative Skewness
In a negatively skewed distribution, the extreme data values lie on the low side, which pulls the mean of the data set downward. A negatively skewed distribution is one with its longer tail on the left side.
Hence, for a typical unimodal distribution, Mean < Median < Mode in negative skewness.
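The ordering of mean, median and mode can be checked on a small sample. The sketch below uses Python's standard `statistics` module; the two data sets are made-up illustrations (not taken from this article), and the ordering shown is a common pattern rather than a guarantee for every distribution.

```python
import statistics

# Hypothetical right-skewed sample: a few large values stretch the right tail.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same sample, so the long tail is on the left instead.
left_skewed = [-x for x in right_skewed]


def summarize(data):
    """Return (mean, median, mode) for a list of numbers."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)


pos_mean, pos_median, pos_mode = summarize(right_skewed)
neg_mean, neg_median, neg_mode = summarize(left_skewed)

# For the right-skewed sample we expect mean > median > mode,
# and the reverse ordering for the mirrored, left-skewed sample.
print("right-skewed:", pos_mean, pos_median, pos_mode)
print("left-skewed: ", neg_mean, neg_median, neg_mode)
```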
Skewness Formula in Statistics
Skewness is a measure used in statistics to reveal the asymmetry of a probability distribution, that is, how far its graph leans to one side. The value can be positive or negative: a positive value indicates a longer right tail and a negative value a longer left tail. To calculate the skewness, we first have to find the mean and the standard deviation of the given data.
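The skewness formula is given by:
\( g=\frac{\sum_{i=1}^{n}\left(x_{i}-\overline{x}\right)^{3}}{(n-1)\,s^{3}} \)
Where,
\(x_{i}\) is the i-th observation and \(\overline{x}\) is the sample mean
n is the total number of observations
s is the standard deviation
g is the sample skewness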
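As a rough illustration, the function below computes a sample skewness for a plain list of observations. It assumes the form g = Σ(x − x̄)³ / ((n − 1)·s³), with s taken as the sample standard deviation (n − 1 in the denominator); the name `sample_skewness` is a hypothetical helper, and other textbooks and libraries use slightly different conventions.

```python
import math


def sample_skewness(values):
    """Skewness as sum((x - mean)^3) / ((n - 1) * s^3).

    Assumes s is the sample standard deviation (n - 1 denominator).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Sample standard deviation.
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return sum((x - mean) ** 3 for x in values) / ((n - 1) * s ** 3)


# A symmetric sample gives zero; a long right tail gives a positive value.
print(sample_skewness([1, 2, 3, 4, 5]))
print(sample_skewness([1, 2, 2, 2, 3, 3, 4, 5, 9, 14]))
```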
Solved Example
Question. Find the skewness in the following data.
Height (inches) | Class Marks | Frequency |
59.5 – 62.5 | 61 | 5 |
62.5 – 65.5 | 64 | 18 |
65.5 – 68.5 | 67 | 42 |
68.5 – 71.5 | 70 | 27 |
71.5 – 74.5 | 73 | 8 |
To know how skewed these data are as compared to other data sets, we have to compute the skewness.
First, find the sample size and the sample mean.
N = 5 + 18 + 42 + 27 + 8 = 100
\( \overline{x}=\frac{\sum xf}{N}=\frac{6745}{100}=67.45 \)
Now, with the mean, we can compute the deviations needed for the skewness.
Class Mark, x | Frequency, f | xf | \(x-\overline{x}\) | \(\left(x-\overline{x}\right)^{2}\times f\) | \(\left(x-\overline{x}\right)^{3}\times f\) |
61 | 5 | 305 | -6.45 | 208.01 | -1341.68 |
64 | 18 | 1152 | -3.45 | 214.25 | -739.15 |
67 | 42 | 2814 | -0.45 | 8.51 | -3.83 |
70 | 27 | 1890 | 2.55 | 175.57 | 447.70 |
73 | 8 | 584 | 5.55 | 246.42 | 1367.63 |
Sum | 100 | 6745 | n/a | 852.75 | -269.33 |
Sum / N | n/a | 67.45 | n/a | 8.5275 | -2.6933 |
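Now, the sample standard deviation is
\( s=\sqrt{\frac{852.75}{100-1}}=\sqrt{8.6136}\approx 2.935 \)
and, with each cubed deviation weighted by its frequency, the skewness is
\( g=\frac{-269.33}{99\times(2.935)^{3}}\approx -0.108 \)
The value is negative, so the distribution is slightly skewed to the left.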
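The arithmetic of this example can be rechecked with a short script. The sketch below works directly from the class marks and frequencies in the table and assumes the same convention throughout: s is the square root of Σf(x − x̄)² / (n − 1), and g is Σf(x − x̄)³ / ((n − 1)·s³). The variable names are illustrative only.

```python
import math

# Class marks and frequencies from the height table above.
class_marks = [61, 64, 67, 70, 73]
frequencies = [5, 18, 42, 27, 8]

# Sample size and mean for grouped data.
n = sum(frequencies)
mean = sum(x * f for x, f in zip(class_marks, frequencies)) / n

# Frequency-weighted sums of squared and cubed deviations from the mean.
sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, frequencies))
sum_cu = sum(f * (x - mean) ** 3 for x, f in zip(class_marks, frequencies))

# Sample standard deviation (n - 1 denominator) and skewness.
s = math.sqrt(sum_sq / (n - 1))
g = sum_cu / ((n - 1) * s ** 3)

print(f"n = {n}, mean = {mean:.2f}")
print(f"sum of squares = {sum_sq:.2f}, sum of cubes = {sum_cu:.2f}")
print(f"s = {s:.3f}, g = {g:.3f}")  # g comes out at roughly -0.108
```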
To interpret the result, we can use the following rule of thumb from Bulmer (1979):
- If the skewness is less than -1 or greater than +1, the data distribution is highly skewed.
- If the skewness is between -1 and -1/2 or between 1/2 and +1, the data distribution is moderately skewed.
- If the skewness is between -1/2 and 1/2, the distribution is approximately symmetric.
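The rule of thumb above can be written as a small helper. This is only a sketch: the function name `interpret_skewness` is hypothetical, and the handling of the exact boundary values (±1/2 and ±1) is an assumption, since the rules do not say which category they fall into.

```python
def interpret_skewness(g):
    """Classify a skewness value using the rule of thumb listed above.

    Boundary handling is an assumption: exactly 1/2 or 1 in magnitude
    is treated here as "moderately skewed".
    """
    magnitude = abs(g)
    if magnitude > 1:
        return "highly skewed"
    if magnitude >= 0.5:
        return "moderately skewed"
    return "approximately symmetric"


# Illustrative values only.
for value in (-1.4, 0.75, 0.2):
    print(value, "->", interpret_skewness(value))
```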