In statistics, we have studied the classification of data into grouped and ungrouped frequency distributions. These data can be represented pictorially using graphs such as bar graphs, frequency polygons and histograms. Also, we know that the three measures of central tendency are mean, median and mode. In this article, we will discuss how to find the mean of grouped data using three methods, namely the direct method, the assumed mean method and the step deviation method, with solved examples.

What is Meant by Mean in Statistics?

The mean or the average of the given observations is defined as the sum of the values of all the observations divided by the total number of observations. The mean of the data is generally represented by the notation x̄. If x₁, x₂, x₃, …, x_n are the observations with respective frequencies f₁, f₂, f₃, …, f_n, then

The sum of observations = f₁x₁ + f₂x₂ + f₃x₃ + … + f_nx_n.

The total number of observations = f₁ + f₂ + … + f_n.

Therefore, the mean of the data, x̄ = (f₁x₁ + f₂x₂ + f₃x₃ + … + f_nx_n)/(f₁ + f₂ + … + f_n).

In short, the above form can be represented using the summation (Σ).

\(\begin{array}{l}\bar{x}=\frac{\sum_{i=1}^{n}f_{i}x_{i}}{\sum_{i=1}^{n}f_{i}}\end{array} \)

Where, “i” varies from 1 to n.

Now, let us discuss how to find the mean of the given data using the above formula.

Example:

The marks scored by 30 students of class 10 of a certain school in a Maths paper carrying 100 marks are given below in tabular form. Find the mean of the marks obtained by the class 10 students.

Marks obtained (x_i)	10	20	36	40	50	56	60	70	72	80	88	92	95
Number of students (f_i)	1	1	3	4	3	2	4	4	1	1	2	3	1

Solution:

To find the mean of the marks obtained by the students in the Mathematics paper, we need to find the product of each x_i and its corresponding frequency f_i, and then add up these products.

Marks Obtained (x_i)	Number of students (f_i)	f_ix_i
10	1	10
20	1	20
36	3	108
40	4	160
50	3	150
56	2	112
60	4	240
70	4	280
72	1	72
80	1	80
88	2	176
92	3	276
95	1	95
Total	Σf_i = 30	Σf_ix_i = 1779

Table 1

Thus, by using the formula \(\begin{array}{l}\bar{x}=\frac{\sum_{i=1}^{n}f_{i}x_{i}}{\sum_{i=1}^{n}f_{i}}\end{array} \), we get

x̄ = 1779/30

x̄ = 59.3

Hence, the mean of the marks obtained is 59.3.
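
The calculation in Table 1 can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original solution; the function name `frequency_mean` and the variable names are chosen here purely for illustration.

```python
# Marks (x_i) and number of students (f_i), copied from Table 1.
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]


def frequency_mean(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i) for paired values and frequencies."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    total_f = sum(freqs)
    return total_fx / total_f


mean_marks = frequency_mean(marks, students)
print(round(mean_marks, 1))  # 59.3
```

The two sums computed inside the function correspond to the totals Σf_ix_i = 1779 and Σf_i = 30 in the last row of Table 1.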

Three Methods to Find the Mean of Grouped Data

In many scenarios, the data set is large, and to make a meaningful study, it has to be condensed into grouped data. In those scenarios, we convert the ungrouped data into grouped data and then find the mean. The three methods to find the mean of the grouped data are:

Direct Method
Assumed Mean Method
Step-deviation Method

Now, let us discuss all these three methods one by one.

Direct Method

Consider the same example as given above.

Now, convert the ungrouped data into grouped data by forming a class interval of width 15.

Note that when assigning frequencies to the class intervals, a student whose mark equals the upper class limit is counted in the next class interval. For example, the 4 students who scored 40 are counted in the class 40-55, not in 25-40.

Therefore, the grouped frequency distribution table for the above-given example is as follows:

Class Interval	10-25	25-40	40-55	55-70	70-85	85-100
Number of Students	2	3	7	6	6	6
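
The grouping step can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `group_frequencies` is a hypothetical name, and the sketch assumes every value lies in the range from 10 (inclusive) to 100 (exclusive), which holds for the marks in this example. Integer division places a value equal to an upper class limit into the next class, matching the convention noted above.

```python
# Marks (x_i) and number of students (f_i) from the ungrouped example.
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

LOWER = 10        # lower limit of the first class
WIDTH = 15        # class width
NUM_CLASSES = 6   # classes 10-25, 25-40, ..., 85-100


def group_frequencies(values, freqs, lower, width, num_classes):
    """Count the frequency falling in each class [lo, lo + width)."""
    counts = [0] * num_classes
    for x, f in zip(values, freqs):
        k = (x - lower) // width  # index of the class that contains x
        counts[k] += f
    return counts


class_freqs = group_frequencies(marks, students, LOWER, WIDTH, NUM_CLASSES)
intervals = [(LOWER + k * WIDTH, LOWER + (k + 1) * WIDTH)
             for k in range(NUM_CLASSES)]

for (lo, hi), f in zip(intervals, class_freqs):
    print(f"{lo}-{hi}: {f}")
```

The printed frequencies should agree with the grouped table above.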

Now, for each class interval, we need to find the midpoint (class mark), which serves as the representative of the whole class.

For example, for the first class interval, 10-25, the class mark is:

Class Mark = (Upper class limit + lower class limit)/2

Class Mark = (25+10)/2 = 17.5

Similarly, find the class mark for all the intervals.

The calculation of the mean of the marks obtained by the students is then set out as follows:

Class Interval	Number of students (f_i)	Class Mark (x_i)	f_ix_i
10-25	2	17.5	35
25-40	3	32.5	97.5
40-55	7	47.5	332.5
55-70	6	62.5	375
70-85	6	77.5	465
85-100	6	92.5	555
Total	Σf_i = 30		Σf_ix_i = 1860

Table 2

Therefore, Mean, x̄ = 1860/30 = 62

The mean value obtained using the direct method is 62.

Comparing the means obtained from Table 1 and Table 2, 59.3 is the exact mean, whereas 62 is an approximate mean. The difference arises from the midpoint assumption in Table 2, which treats every observation in a class as if it were equal to the class mark.
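
The direct method in Table 2 can be reproduced with a short Python sketch. The function name `direct_mean` is an illustrative choice, not a standard library routine, and the intervals and frequencies are copied from the grouped table.

```python
# Class intervals and frequencies (f_i) from Table 2.
intervals = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [2, 3, 7, 6, 6, 6]


def direct_mean(intervals, freqs):
    """Mean of grouped data, using each class mark as the class representative."""
    class_marks = [(lo + hi) / 2 for lo, hi in intervals]
    total_fx = sum(f * x for f, x in zip(freqs, class_marks))
    return total_fx / sum(freqs)


grouped_mean = direct_mean(intervals, freqs)
print(grouped_mean)  # 62.0
```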

Assumed Mean Method

If the numerical values of x_i and f_i are large, finding the product of x_i and f_i becomes a time-consuming process. To reduce the calculations, we can use the assumed mean method.

In this method, the first step is to choose the assumed mean, say “a”, from among the x_i, preferably one that lies near the centre. (If we consider the same example, we can choose either a = 47.5 or a = 62.5.) Now, let us choose a = 47.5.

The second step is to find the difference (d_i) between each x_i and the assumed mean “a”.

The third step is to find the product of d_i with the corresponding f_i.

Class Interval	Number of students (f_i)	Class Mark (x_i)	d_i = x_i – 47.5	f_id_i
10-25	2	17.5	-30	-60
25-40	3	32.5	-15	-45
40-55	7	47.5	0	0
55-70	6	62.5	15	90
70-85	6	77.5	30	180
85-100	6	92.5	45	270
Total	Σf_i = 30			Σf_id_i = 435

Table 3

Hence, the mean of the deviations is

\(\begin{array}{l}\bar{d}=\frac{\sum f_{i}d_{i}}{\sum f_{i}}\end{array} \)

As the relationship between \(\bar{d}\) and \(\bar{x}\) is \(\bar{x} = a + \bar{d}\), we can write

\(\begin{array}{l}\bar{x}=a+\frac{\sum f_{i}d_{i}}{\sum f_{i}}\end{array} \)
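
The relation between the mean of the deviations and the mean of the data follows directly from the definition d_i = x_i – a. A short derivation, written out here as a supplementary step, is:

```latex
\[
\begin{aligned}
\bar{d} &= \frac{\sum f_{i}d_{i}}{\sum f_{i}}
         = \frac{\sum f_{i}(x_{i}-a)}{\sum f_{i}} \\
        &= \frac{\sum f_{i}x_{i}}{\sum f_{i}} - a\,\frac{\sum f_{i}}{\sum f_{i}}
         = \bar{x} - a \\
\Rightarrow\quad \bar{x} &= a + \bar{d} = a + \frac{\sum f_{i}d_{i}}{\sum f_{i}}
\end{aligned}
\]
```

Since the derivation holds for any value of a, the choice of assumed mean affects only the size of the numbers in the working, not the final result.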

Now, substitute the values of a, Σf_i, and Σf_id_i in the above formula to get the mean.

Therefore, x̄ = 47.5 + (435/30)

x̄ = 47.5 + 14.5

x̄ = 62.

Therefore, the mean of the marks obtained by the class 10 students is 62.

Hence, the result obtained from the direct method and assumed mean method is the same.
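
The assumed mean method can be sketched in Python as below. The function name `assumed_mean_method` is hypothetical, and the class marks and frequencies are taken from Table 3. The sketch is run with both central class marks mentioned earlier (47.5 and 62.5) to illustrate that the result does not depend on which one is chosen.

```python
# Class marks (x_i) and frequencies (f_i) from Table 3.
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]


def assumed_mean_method(class_marks, freqs, a):
    """Return a + sum(f_i * d_i) / sum(f_i), where d_i = x_i - a."""
    deviations = [x - a for x in class_marks]
    total_fd = sum(f * d for f, d in zip(freqs, deviations))
    return a + total_fd / sum(freqs)


mean_a1 = assumed_mean_method(class_marks, freqs, 47.5)
mean_a2 = assumed_mean_method(class_marks, freqs, 62.5)
print(mean_a1)  # 62.0
print(mean_a2)  # 62.0
```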

Step Deviation Method

Consider the same example as given above. In the step deviation method, we add one more column to the table to find the mean, namely u_i = (x_i – a)/h,

where “a” is the assumed mean and “h” is the class size, i.e., the width of the class interval, which is equal to 15 here.

Class Interval	Number of students (f_i)	Class Mark (x_i)	d_i = x_i – a (a = 47.5)	u_i = (x_i – a)/h (h = 15)	f_iu_i
10-25	2	17.5	-30	-2	-4
25-40	3	32.5	-15	-1	-3
40-55	7	47.5	0	0	0
55-70	6	62.5	15	1	6
70-85	6	77.5	30	2	12
85-100	6	92.5	45	3	18
Total	Σf_i = 30				Σf_iu_i = 29

Table 4

Therefore, we obtain

\(\begin{array}{l}\bar{u}=\frac{\sum f_{i}u_{i}}{\sum f_{i}}\end{array} \)

The relation between \(\bar{u}\) and \(\bar{x}\) is:

\(\begin{array}{l}\bar{x}=a+h\frac{\sum f_{i}u_{i}}{\sum f_{i}}\end{array} \)

Now, substitute the values of a, h, Σf_i, and Σf_iu_i in the above formula to get the mean.

x̄ = 47.5 + 15(29/30)

x̄ = 47.5 + (435/30)

x̄ = 47.5 + 14.5

x̄ = 62

Hence, the mean of the marks scored by the students = 62.

Therefore, the mean obtained by all three methods is the same.

Thus, we can say that the assumed mean method and the step deviation method are the simplified forms of the direct method.

(Note: In this example, the mean of the grouped data slightly differs from the mean of the ungrouped data because of the midpoint assumption).
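
The step deviation method can likewise be sketched in Python. The function name `step_deviation_mean` is an illustrative choice; the class marks, frequencies, assumed mean a = 47.5 and class size h = 15 are taken from Table 4.

```python
# Class marks (x_i) and frequencies (f_i) from Table 4.
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]


def step_deviation_mean(class_marks, freqs, a, h):
    """Return a + h * sum(f_i * u_i) / sum(f_i), where u_i = (x_i - a) / h."""
    u = [(x - a) / h for x in class_marks]
    total_fu = sum(f * ui for f, ui in zip(freqs, u))
    return a + h * total_fu / sum(freqs)


step_mean = step_deviation_mean(class_marks, freqs, a=47.5, h=15)
print(step_mean)  # 62.0
```

Multiplying by h before dividing by Σf_i keeps the intermediate value at 435/30 = 14.5, so no rounding of 29/30 is needed.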

Practice Problems

Consider the distribution of the daily wages of 50 employees of a factory as given below. Determine the mean of the daily wages of the workers of the factory using an appropriate method.

Daily wages (in Rs)	500-520	520-540	540-560	560-580	580-600
Number of workers	12	14	8	6	10

The given table shows the daily expenditure on food of 25 households in a locality. Find the mean daily expenditure on food using a suitable method.

Daily expenditure (in Rs)	100-150	150-200	200-250	250-300	300-350
Number of households	4	5	12	2	2

Frequently Asked Questions on Mean of Grouped Data

What are the three methods used to find the mean of grouped data?

The three methods used to find the mean of the grouped data are:
Direct method
Assumed mean method
Step deviation method

What is meant by class mark?

The class mark is the midpoint of a class interval, found by taking the average of its upper class limit and lower class limit.
Class Mark = (upper class limit + lower class limit)/2

Can we get the same mean value for both the grouped and ungrouped data?

Not in general. The mean of grouped data usually differs slightly from the mean of the ungrouped data because of the midpoint assumption.

What does “h” mean in the step-deviation method?

In the step-deviation method, “h” represents the class size, i.e., the width of each class interval.

How to choose the value of “a” in the assumed mean method?

In the assumed mean method, “a” can be chosen as the value among x₁, x₂, . . ., x_n that lies in the centre.