# Mean and Variance

The mean is a measure of central tendency, and the variance is a measure of dispersion. The mean is the average of a given set of numbers. The variance is the average of the squared differences from the mean.

Measures of dispersion tell us how the observed data are scattered and distributed. Before studying other properties of a data distribution, we need to know some of its basic features, such as its mean, median and variance.

If we multiply the observed values of a random variable by a constant $t$, its sample mean, sample standard deviation and sample variance are multiplied by $t$, $|t|$ and $t^2$, respectively. If we add a constant $m$ to the observed values, that constant is added to the sample mean, but the sample standard deviation and sample variance remain unchanged. The same rules apply to the theoretical mean and variance of random variables.

If $Y = tX + m$, then

$\mu_y = t \mu_x + m$

$\sigma^2_y = t^2 \sigma^2_x$

$\sigma_y = |t| \sigma_x$

Here, $\mu_x$, $\sigma^2_x$ and $\sigma_x$ are the mean, variance and standard deviation of the random variable X, and $\mu_y$, $\sigma^2_y$ and $\sigma_y$ are the mean, variance and standard deviation of the random variable Y.
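The rule can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the data `x` and the constants `t` and `m` are hypothetical values chosen only for illustration. It uses the population (divide by n) versions, but the same relations hold for the sample (divide by n - 1) versions.

```python
from statistics import fmean, pstdev, pvariance

# Hypothetical observations and constants, chosen only for illustration.
x = [2.0, 4.0, 4.0, 5.0, 7.0]
t, m = -3.0, 10.0

# Apply the linear transformation Y = tX + m to every observation.
y = [t * xi + m for xi in x]

mean_x, var_x, sd_x = fmean(x), pvariance(x), pstdev(x)
mean_y, var_y, sd_y = fmean(y), pvariance(y), pstdev(y)

# Each pair printed below should agree up to floating-point rounding.
print(mean_y, t * mean_x + m)   # mean: scaled by t and shifted by m
print(var_y, t ** 2 * var_x)    # variance: scaled by t^2, shift has no effect
print(sd_y, abs(t) * sd_x)      # standard deviation: scaled by |t|
```

Note that `t` is negative here on purpose: the standard deviation is scaled by $|t|$, not by $t$, so it stays non-negative.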

## Mean in Statistics

In probability and statistics, the average of a random variable is called its mean or expected value. If we know the probability distribution of a random variable, we can find its expected value. The mean of a random variable shows its location, or central tendency.

The definition of ‘mean’ differs between branches of mathematics. Usually, the mean denotes the average of the discrete data in a set of numbers. For ungrouped data, the arithmetic mean is given by

$\bar{x} = \frac{x_1+x_2+x_3+\dots+x_n}{n}$

or

$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$

where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms, and $n$ is the number of terms.

Now consider data in which each value is given together with its frequency.

The formula for the mean in this case (called discrete frequency data) is

$\bar{x} = \frac{f_1x_1+ f_2x_2+ f_3x_3+\dots+ f_nx_n}{N}= \frac{1}{N}\sum_{i=1}^n f_ix_i$

where $x_1, x_2, x_3, \dots, x_n$ denote the values of the respective terms, $f_1, f_2, f_3, \dots, f_n$ denote their respective frequencies, $n$ is the number of distinct terms, and $N = \sum_{i=1}^n f_i$ is the total number of observations.
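As an illustration, here is a minimal sketch of the mean of discrete frequency data, dividing the weighted total by the sum of the frequencies. The values and frequencies are hypothetical. The result is cross-checked against the plain arithmetic mean of the same data written out in full.

```python
# Hypothetical distinct values x_i and their frequencies f_i.
values = [10, 20, 30]
freqs = [2, 5, 3]

# Total number of observations: the sum of all frequencies.
total = sum(freqs)

# Frequency-weighted mean: sum of f_i * x_i divided by the total frequency.
weighted_mean = sum(f * x for f, x in zip(freqs, values)) / total

# Cross-check: write every observation out f_i times and average directly.
expanded = [x for x, f in zip(values, freqs) for _ in range(f)]
plain_mean = sum(expanded) / len(expanded)

print(weighted_mean, plain_mean)  # both are 21.0
```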

The formula is the same for a sample and for a population, but the notation differs: the sample mean is denoted by $\bar{x}$, and the population mean by $\mu$.

## Properties of Mean

Some properties of the mean are given by:
1. If we increase individual units by k, then the mean will increase by k.
2. If we decrease individual units by k, then the mean will decrease by k.
3. If we multiply each unit by k, then the mean will be multiplied by k.
4. If we divide each unit by k (where k is non-zero), then the mean will be divided by k.

## What is Variance in Statistics?

One way to measure spread is the mean deviation, which averages the absolute values of the deviations from the mean. Absolute values are taken because otherwise the positive and negative deviations would cancel each other out.

Another way to remove the sign of each deviation is to square it, which leads to the variance of the data set. As squares are never negative, the variance is always a non-negative number.

Let us take $n$ observations $a_1, a_2, a_3, \dots, a_n$, with their mean represented by $\bar{a}$.

Then the sum of the squared deviations from the mean is

$(a_1- \bar{a})^2 + (a_2-\bar{a})^2 + (a_3-\bar{a})^2 + \dots + (a_n-\bar{a})^2=\sum_{i=1}^n (a_i-\bar{a})^2$.

Dividing this sum by $n$ gives the variance $\sigma^2$, as discussed under Properties of Variance below.

## Variance of Random Variables in Probability and Statistics

The variance of a random variable shows its variability. It is the expected squared distance of the random variable from its mean.

It can be calculated using the formula below:

$\sigma_x^2 = \mathrm{Var}(X) = \sum_i (x_i - \mu)^2 p(x_i) = E[(X - \mu)^2]$

Equivalently,

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$

where $[E(X)]^2 = \left[\sum_i x_i p(x_i)\right]^2 = \mu^2$ and $E(X^2) = \sum_i x_i^2 p(x_i)$.
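To see that the two expressions for the variance agree, the sketch below evaluates both for a fair six-sided die. The die is an assumed example that does not come from the text above. Exact fractions are used so that the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

# Mean: E(X) = sum of x_i * p(x_i).
mu = sum(x * p for x in outcomes)

# Second moment: E(X^2) = sum of x_i^2 * p(x_i).
ex2 = sum(x * x * p for x in outcomes)

# Variance from the definition: sum of (x_i - mu)^2 * p(x_i).
var_definition = sum((x - mu) ** 2 * p for x in outcomes)

# Variance from the shortcut: E(X^2) - [E(X)]^2.
var_shortcut = ex2 - mu ** 2

print(mu, ex2)                       # 7/2 91/6
print(var_definition, var_shortcut)  # 35/12 35/12
```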

## Properties of Variance

(1) If the variance is zero, then $(a_i - \bar{a})$ is equal to zero for every $i$, which means that each value in the set is equal to the mean value $\bar{a}$.

(2) If the variance is small, the observations are close to the mean value $\bar{a}$; if the variance is large, the observations lie far from the mean value $\bar{a}$.

(3) If each observation is increased by ‘a’, where $a \in \mathbb{R}$, then the variance will remain unchanged.

(4) If each observation is multiplied by ‘a’, where $a \in \mathbb{R}$, then the variance will be multiplied by $a^2$.

The sum of squared deviations $\sum_{i=1}^n (a_i - \bar{a})^2$ on its own is not a suitable measure of dispersion, because it grows with the number of observations as well as with how widely they are scattered about the mean. To overcome this difficulty, we take the mean of the squared deviations.

So, the variance is given by:

$\sigma^2 = \frac{1}{n} \sum_{i=1}^n (a_i - \bar{a})^2$.

As a result of squaring, the unit of the variance is not the same as that of the data; it is the square of that unit.
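The sketch below illustrates why the squared deviations are averaged rather than only summed. The data are hypothetical, and the helper names `sum_sq_dev` and `variance` are made up for this example. Repeating the same data set twice doubles the sum of squared deviations, although the spread is no different, while the mean of the squared deviations stays the same.

```python
def sum_sq_dev(data):
    """Sum of squared deviations of the observations from their mean."""
    mean = sum(data) / len(data)
    return sum((a - mean) ** 2 for a in data)


def variance(data):
    """Mean of the squared deviations (divides by n)."""
    return sum_sq_dev(data) / len(data)


# Hypothetical data, and the same data repeated twice.
small = [2, 4, 6]
doubled = small * 2

print(sum_sq_dev(small), sum_sq_dev(doubled))  # 8.0 16.0
print(variance(small), variance(doubled))      # the two values agree
```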

## Standard Deviation


Because the variance is expressed in squared units, we take its square root to obtain the standard deviation, which is a measure of dispersion in the same units as the data. Squaring the deviations before averaging prevents the above-average deviations from cancelling those below the average, which would otherwise always give a total of zero. The greater the variance, the greater the standard deviation, and vice versa.

The formula of standard deviation is given by:

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{n} \sum_{i=1}^n (a_i- \bar{a})^2}$

Standard Deviation of distribution with discrete frequency:

It is given by:

$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^n f_i(a_i- \bar{a})^2}$

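where the values are $a_1, a_2, a_3, \dots, a_n$, their respective frequencies are $f_1, f_2, f_3, \dots, f_n$, the total frequency is $N = \sum_{i=1}^n f_i$, and $\bar{a} = \frac{1}{N}\sum_{i=1}^n f_i a_i$ is the frequency-weighted mean.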
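A minimal sketch of this calculation follows, using hypothetical values and frequencies. The result is cross-checked against `statistics.pstdev` from the Python standard library, applied to the same data written out in full.

```python
import math
from statistics import pstdev

# Hypothetical distinct values a_i and their frequencies f_i.
values = [1, 2, 3]
freqs = [1, 2, 1]

# Total frequency N and frequency-weighted mean.
N = sum(freqs)
mean = sum(f * a for f, a in zip(freqs, values)) / N

# Frequency-weighted variance, then its square root.
var = sum(f * (a - mean) ** 2 for f, a in zip(freqs, values)) / N
sd = math.sqrt(var)

# Cross-check against the fully written-out data: [1, 2, 2, 3].
expanded = [a for a, f in zip(values, freqs) for _ in range(f)]
print(sd, pstdev(expanded))  # both approximately 0.7071
```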

## Solved Examples

Question 1: An experiment is conducted with 16 values of b, and the following results are obtained: $\sum b^2 = 2560$ and $\sum b = 180$. On checking the data again, it is found that one observation recorded as 30 should be replaced with 20. What is the corrected variance?

Solution:

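Given $\sum b^2 = 2560$ and $\sum b = 180$.

Replacing the value 30 with 20, the corrected sum is $\sum b' = 180 - 30 + 20 = 170$.

The sum of squares decreases by $30^2 - 20^2 = 900 - 400 = 500$, so the corrected sum of squares is $\sum (b')^2 = 2560 - 900 + 400 = 2060$.

So, the corrected variance is $\frac{1}{n} \sum (b')^2 - \left[\frac{1}{n} \sum b'\right]^2 = \frac{1}{16} \times 2060 - \left(\frac{1}{16} \times 170\right)^2 = 128.75 - 112.890625 = 15.859375$.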
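The arithmetic can be reproduced with a short script. This is only a sketch that restates the numbers given in the question; the variable names are chosen for illustration.

```python
# Figures given in the question.
n = 16
sum_b = 180
sum_b_sq = 2560
wrong_value, right_value = 30, 20

# Correct the running totals by swapping the wrong observation for the right one.
corrected_sum = sum_b - wrong_value + right_value
corrected_sum_sq = sum_b_sq - wrong_value ** 2 + right_value ** 2

# Variance = mean of squares minus square of the mean.
corrected_variance = corrected_sum_sq / n - (corrected_sum / n) ** 2

print(corrected_sum, corrected_sum_sq)  # 170 2060
print(corrected_variance)               # 15.859375
```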

Question 2:

Let us take two sets of values, where one set represents the scores of 100 Indian batsmen and the other represents the scores of 100 Australian batsmen. The Indian batsmen have scored 550, 551, 552, …, 649 runs, and the Australian batsmen have scored 900, 901, 902, …, 999 runs. If the variances of the two sets are represented by $\sigma_A^2$ and $\sigma_B^2$, then what is $\sigma_A^2 / \sigma_B^2$?

Solution:

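We know that $\sigma^2 = \frac{\sum d_i^2}{n}$, where $d_i$ is the deviation of the $i$-th value from the mean of its set.

Here, both the Australian and the Indian batsmen's sets consist of 100 consecutive positive integers, so $n = 100$ for both. The deviations from the mean depend only on the spacing of the values, not on where the set starts. Thus, $\sum d_i^2$ is the same for both sets.

So, $\sigma_A^2 / \sigma_B^2 = 1$.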
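This result can be confirmed numerically. The sketch below uses `statistics.pvariance` from the Python standard library on the two score lists given in the question. The closed form $(n^2 - 1)/12$ for the variance of $n$ consecutive integers is a standard result that is not stated in the text above and is included only as an extra check.

```python
from statistics import pvariance

# The two sets of scores from the question.
indian = list(range(550, 650))       # 550, 551, ..., 649
australian = list(range(900, 1000))  # 900, 901, ..., 999

var_indian = pvariance(indian)
var_australian = pvariance(australian)

# Extra check: variance of n consecutive integers is (n^2 - 1) / 12.
n = len(indian)
closed_form = (n ** 2 - 1) / 12

print(var_indian, var_australian, closed_form)  # 833.25 833.25 833.25
print(var_indian / var_australian)              # 1.0
```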

Question 3: The mean and variance of a random variable X are 125 and 225 respectively. Find the mean and variance of the new random variables obtained by first subtracting the mean from X and then dividing the result by the standard deviation.

Solution: The first new random variable is the original random variable minus its mean.

Let Y be this new random variable. Then

$Y = X - 125$

$\mu_Y = \mu_X - 125 = 125 - 125 = 0$

$\sigma^2_Y = \sigma^2_X = 225$

$\sigma_Y = \sigma_X = 15$

Next, we create a new variable $Z = Y/15$, which is obtained by dividing the random variable Y by its standard deviation. The mean, variance and standard deviation of this new variable are

$\mu_Z = \mu_Y/15 = 0/15 = 0$

$\sigma^2_Z = \sigma^2_Y/15^2 = 225/225 = 1$

$\sigma_Z = \sigma_Y/15 = 15/15 = 1$

The new random variable Z has mean 0 and variance 1.
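The same two steps, subtracting the mean and dividing by the standard deviation, can be applied to observed data. The sketch below uses a hypothetical two-point data set chosen so that its mean is 125 and its variance is 225; the question itself does not specify any data.

```python
from statistics import fmean, pstdev, pvariance

# Hypothetical data chosen to have mean 125 and (population) variance 225.
x = [110.0, 140.0]

mu_x = fmean(x)
sd_x = pstdev(x)

# Step 1: subtract the mean (Y = X - 125).
y = [xi - mu_x for xi in x]

# Step 2: divide by the standard deviation (Z = Y / 15).
z = [yi / sd_x for yi in y]

print(mu_x, pvariance(x), sd_x)  # 125.0 225.0 15.0
print(fmean(y), pvariance(y))    # 0.0 225.0
print(fmean(z), pvariance(z))    # 0.0 1.0
```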