Difference between Covariance and Correlation

Covariance is a measure of the extent to which two random variables change in tandem. Correlation indicates how strongly, and in which direction, the variables are related. This article covers the difference between covariance and correlation, along with their definitions and formulas.

Covariance and correlation are two important concepts commonly used in statistics. Both measure the linear relationship between variables. Correlation can be positive, negative or zero. If the correlation is

  • Positive: An increase in one of the variables tends to accompany an increase in the other.
  • Negative: The variables tend to move in opposite directions.
  • Zero: No linear relationship exists between the variables.

Figure: Positive, negative and zero correlation

Covariance indicates the direction of linear relationships.
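The three cases can be checked numerically. The sketch below is only an illustration: the data values are made up, and the helper names `covariance` and `correlation` are hypothetical rather than taken from any library. It assumes the population convention (dividing by the number of values), which does not affect the sign of either measure.

```python
from math import sqrt


def covariance(xs, ys):
    """Population covariance: mean of the products of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # tends to increase as x increases
falling = [6, 4, 5, 4, 2]   # tends to decrease as x increases

x_sym = [-2, -1, 0, 1, 2]
parabola = [v * v for v in x_sym]  # depends on x_sym, but not linearly

print("positive:", covariance(x, rising), correlation(x, rising))
print("negative:", covariance(x, falling), correlation(x, falling))
print("zero:    ", covariance(x_sym, parabola), correlation(x_sym, parabola))
```

The last dataset has zero covariance and zero correlation even though `parabola` is fully determined by `x_sym`, which is why a zero value rules out only a linear relationship.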

Covariance and Correlation – Definition and Formula

A subset of the population is called a sample. When correlation and covariance are calculated on a sample rather than on the whole population, they are termed sample correlation and sample covariance. Both terms describe the relationship and dependency between the variables.

Correlation measures the association between the variables.

Correlation formula

Covariance explains the joint variability of the variables.

Covariance formula

Where

x_i = Data value of x

y_i = Data value of y

x̄ = Mean of x

ȳ = Mean of y

N = Number of data values
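Written out with the symbols listed above, the two formulas are commonly given in the following form. The divisor N in the covariance is an assumption, chosen to match the population formula used later in this article; the sample version divides by N − 1 instead, and the choice cancels out in the correlation coefficient.

\(\begin{array}{l}r=\frac{\sum_{i=1}^{N}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_{i}-\bar{x}\right)^{2} \sum_{i=1}^{N}\left(y_{i}-\bar{y}\right)^{2}}}\\\end{array} \)

\(\begin{array}{l}cov(x, y)=\frac{1}{N} \sum_{i=1}^{N}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)\\\end{array} \)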

Correlation versus Covariance

Correlation is a function of covariance. Correlation values are standardised, whereas covariance values are not. The correlation coefficient is obtained by dividing the covariance of the variables by the product of their standard deviations. Standard deviation measures the absolute variability of a dataset. When the covariance is divided by the product of the standard deviations, the result falls in the range of -1 to +1, which is the range of correlation values. In other words, correlation is the normalised form of covariance.

The units of covariance are the product of the units of the two variables. Correlation is dimensionless. It is a measure of the relationship between the variables. The covariance value is affected by a change of scale in the variables. If all the values of one variable are multiplied by a constant and all the values of the other variable are multiplied by the same or a different constant, the covariance value changes. The correlation value, by contrast, is not affected by such a change of scale, apart from a change of sign when the two constants have opposite signs.
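The effect of a change of scale can be seen in a small numerical sketch. The height and weight values are invented for illustration, the helpers `covariance` and `correlation` are hypothetical names, and the population convention (dividing by the number of values) is assumed.

```python
from math import isclose, sqrt


def covariance(xs, ys):
    """Population covariance: mean of the products of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


height_m = [1.50, 1.60, 1.70, 1.80, 1.90]    # hypothetical heights in metres
weight_kg = [52.0, 58.0, 61.0, 72.0, 80.0]   # hypothetical weights in kilograms

# Change of scale: metres to centimetres, kilograms to grams.
height_cm = [h * 100 for h in height_m]
weight_g = [w * 1000 for w in weight_kg]

cov_original = covariance(height_m, weight_kg)
cov_rescaled = covariance(height_cm, weight_g)
r_original = correlation(height_m, weight_kg)
r_rescaled = correlation(height_cm, weight_g)

# The covariance is multiplied by the product of the two constants.
print("covariance ratio:", cov_rescaled / cov_original)
# The correlation is the same up to floating-point rounding.
print("correlation unchanged:", isclose(r_original, r_rescaled))
```

Both constants here are positive, so the sign of the correlation is preserved as well as its magnitude.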

Correlation vs Covariance Comparison

| Basis | Covariance | Correlation |
| --- | --- | --- |
| Meaning | Covariance indicates the extent to which the variables change together. On a given scale, a larger magnitude denotes stronger joint variability. | Correlation signifies the strength of association between the variables when other things are constant. |
| Relationship | Correlation can be derived from covariance. | Correlation gives the value of covariance on a standard scale. |
| Values | Lie between -∞ and +∞. | Lie between -1 and +1. |
| Change of scale | Affects covariance. | Does not affect correlation. |
| Units | Covariance has a definite unit, the product of the units of the two variables. | Correlation is a dimensionless number. |

Correlation and Covariance for Standardised Attributes

It can be shown that the correlation between attributes is equal to the covariance of two standardised attributes. The first step to this is to standardise the two attributes, x and y, and obtain their z-scores [x’ and y’]. 

\(\begin{array}{l}x^{\prime}=\frac{x-\mu_{x}}{\sigma_{x}}, \quad y^{\prime}=\frac{y-\mu_{y}}{\sigma_{y}}\\\end{array} \)

The value of population covariance between the attributes is calculated using the formula

\(\begin{array}{l}\sigma_{x y}=\frac{1}{n} \sum_{i}^{n}\left(x^{(i)}-\mu_{x}\right)\left(y^{(i)}-\mu_{y}\right)\\\end{array} \)

Since standardisation mean-centres the attributes (the standardised means are 0), the covariance of the standardised attributes can be written as

\(\begin{array}{l}\sigma_{x y}^{\prime}=\frac{1}{n} \sum_{i}^{n}\left(x^{\prime (i)}-0\right)\left(y^{\prime (i)}-0\right)\\\end{array} \)

If these terms are substituted back using the concepts of standardised attributes, then 

\(\begin{array}{l}\begin{aligned} \sigma_{x y}^{\prime} &= \frac{1}{n} \sum_{i}^{n}\left(\frac{x^{(i)}-\mu_{x}}{\sigma_{x}}\right)\left(\frac{y^{(i)}-\mu_{y}}{\sigma_{y}}\right) \\ &= \frac{1}{n \cdot \sigma_{x} \sigma_{y}} \sum_{i}^{n}\left(x^{(i)}-\mu_{x}\right)\left(y^{(i)}-\mu_{y}\right) \end{aligned}\\\end{array} \)

On simplification,

\(\begin{array}{l}\sigma_{x y}^{\prime}=\frac{\sigma_{x y}}{\sigma_{x} \sigma_{y}}\end{array} \)

The right-hand side is the correlation coefficient. Hence, the covariance of the standardised attributes is equal to the correlation of the original attributes.
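This result can be checked numerically. The following is a minimal sketch with made-up data: the helpers `population_covariance` and `z_scores` are hypothetical names, while `fmean` and `pstdev` (population standard deviation) come from Python's standard `statistics` module.

```python
from statistics import fmean, pstdev


def population_covariance(xs, ys):
    """(1/n) * sum of (x - mean_x) * (y - mean_y)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


def z_scores(values):
    """Standardise: subtract the mean, divide by the population SD."""
    mu, sigma = fmean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


x = [2.0, 4.0, 6.0, 9.0]   # illustrative data only
y = [1.0, 3.0, 2.0, 7.0]

# Correlation of the original attributes.
r = population_covariance(x, y) / (pstdev(x) * pstdev(y))

# Covariance of the standardised attributes.
cov_standardised = population_covariance(z_scores(x), z_scores(y))

print(r, cov_standardised)  # the two values agree up to rounding
```

The population standard deviation is used in both places so that the divisor matches the 1/n in the covariance formula above.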

Solved Examples on Covariance and Correlation

Example 1: If the coefficient of correlation between x and y is 0.5 and their covariance is 16, and the SD of x is 4, then what is the SD of y?

Solution:

Given r = 0.5

Cov (x,y) = 16

σx = 4

σy = cov (x,y) / (r σx)

= 16 / (0.5 × 4)

= 16 / 2

= 8

Example 2: If σx = σy and x, y are related by u = x + y; v = x − y, what is the cov(u,v)?

Solution:

\(\begin{array}{l}u=x+y\\v=x-y\\\Rightarrow \bar{u}=\bar{x}+\bar{y}\\\bar{v}=\bar{x}-\bar{y}\\u-\bar{u}=(x-\bar{x})+(y-\bar{y})\\v-\bar{v}=(x-\bar{x})-(y-\bar{y})\\(u-\bar{u})\cdot (v-\bar{v})=(x-\bar{x})^2-(y-\bar{y})^2\\\Rightarrow \frac{1}{n} \sum (u-\bar{u})(v-\bar{v})=\frac{1}{n}\sum (x-\bar{x})^2-\frac{1}{n}\sum (y-\bar{y})^2\\\end{array} \)

= σx² − σy² = 0 (since σx = σy)

⇒ cov(u,v)=0
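A quick numerical check of this result, as a sketch with invented data: `y` is a rearrangement of the values of `x`, which guarantees equal standard deviations. The helper `covariance` is a hypothetical name and uses the 1/n form from the solution above.

```python
def covariance(xs, ys):
    """(1/n) * sum of (x - mean_x) * (y - mean_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


x = [1, 2, 3, 4, 5]
y = [3, 5, 1, 2, 4]   # same values in a different order, so var(y) == var(x)

u = [a + b for a, b in zip(x, y)]
v = [a - b for a, b in zip(x, y)]

var_x = covariance(x, x)
var_y = covariance(y, y)

print("var(x) - var(y):", var_x - var_y)
print("cov(u, v):      ", covariance(u, v))
```

Both printed values are zero here. With unequal variances, the same code gives cov(u, v) equal to var(x) − var(y), as in the derivation.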

Example 3: What is the correlation between x and a−x?

Solution:

Let u = a − x, and therefore,

Var (u) = Var (a − x)

= (−1)² Var (x)

= Var (x)

= σ²

cov (x, u) = cov (x, a − x) = −cov (x, x) = −σ²

\(\begin{array}{l}r(x,u)=\frac{cov(x,u)}{\sqrt{var(x)var(u)}}\end{array} \)
\(\begin{array}{l}=-\frac{\sigma^2}{ \sqrt{\sigma^2 \cdot \sigma^2}}\end{array} \)

= -1
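The same value can be confirmed on concrete numbers. In this sketch the data and the constant `a = 10` are arbitrary choices, and `covariance` and `correlation` are hypothetical helper names using the population (1/n) convention.

```python
from math import sqrt


def covariance(xs, ys):
    """(1/n) * sum of (x - mean_x) * (y - mean_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


a = 10                       # arbitrary constant
x = [1, 2, 3, 4, 5]          # arbitrary data
u = [a - value for value in x]

print("var(x):   ", covariance(x, x))
print("var(u):   ", covariance(u, u))
print("cov(x, u):", covariance(x, u))
print("r(x, u):  ", correlation(x, u))
```

Changing `a` shifts every value of `u` by the same amount, so the variances, the covariance and the correlation stay the same.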

Example 4: If the correlation coefficient between x and y is 0.6, the covariance is 27, and the variance of y is 25, what is the variance of x?

Solution:

r = 0.6

cov (x, y) = 27

\(\begin{array}{l}\sigma^{2}(y)=25 \Rightarrow \sigma(y) = 5\\ r = \frac{\text{covariance}(x, y)}{\sigma(x)\cdot \sigma(y)}\\ \Rightarrow \sigma(x) = \frac{\text{covariance}(x, y)}{r \cdot \sigma(y)} \\ = \frac{27}{\frac{6}{10} \times 5}\\ =\frac{27 \times 2}{6}=9\\ \sigma^{2}(x)=81\\\end{array} \)

Example 5: If the covariance between x and y is 30, the variance of x is 25, and the variance of y is 144, find the correlation coefficient.

Solution:

\(\begin{array}{l}cov(x,y)=30 \\ var(x)=25, \quad var(y)=144 \\ r(x,y)=\frac{cov(x,y)}{\sqrt{var(x) \cdot var(y)}}\\ r(x,y)=\frac{30}{\sqrt{25 \times 144}}\\ =\frac{30}{5 \times 12}\\ =0.5\\\end{array} \)

Example 6: Let the correlation coefficient between X and Y be 0.6. Random variables Z and W are defined as Z = X + 5 and W = Y / 3. What is the correlation coefficient between Z and W?

Solution: 

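Given

\(\begin{array}{l}r_{xy}=0.6\\ Z=X+5, \quad W=\frac{Y}{3}\\ cov(Z,W)=\frac{1}{3}\,cov(X,Y)\\ \sigma_{Z}=\sigma_{X}, \quad \sigma_{W}=\frac{\sigma_{Y}}{3}\\ r_{zw}=\frac{cov(Z,W)}{\sigma_{Z}\,\sigma_{W}}=\frac{\frac{1}{3}\,cov(X,Y)}{\sigma_{X}\cdot \frac{\sigma_{Y}}{3}}=r_{xy}=0.6\end{array} \)

Adding a constant to a variable or dividing it by a positive constant does not change the correlation coefficient, so the correlation between Z and W is 0.6.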

Frequently Asked Questions

Q1

What do you mean by correlation?

Correlation indicates how variables are related. It is the degree to which two or more variables are linearly related.

Q2

What do you mean by positive correlation?

A positive correlation denotes the relation between two variables that tend to move in the same direction. When one variable tends to increase with the increase in another variable, a positive correlation exists.

Q3

Give the formula for the correlation between X and Y.

The correlation between X and Y is given by Corr(X, Y) = cov(X, Y) / (σX σY). Here, σX is the standard deviation of X, and σY is the standard deviation of Y.

Q4

What are the 3 types of correlation?

The three types of correlation are positive correlation, negative correlation and zero or no correlation.

Q5

Can correlation be negative?

Yes. Correlation can be negative, positive or zero.
