Covariance is a measure of the extent to which two random variables change in tandem. Correlation indicates how strongly, and in which direction, two variables are related. This article covers the difference between covariance and correlation, along with their definitions and formulas.
Covariance and correlation are two important concepts commonly used in statistics. Both describe the linear relationship between variables. Correlation can be positive, negative or zero. If the correlation is
- Positive: An increase in one of the variables tends to be accompanied by an increase in the other.
- Negative: The variables tend to move in opposite directions.
- Zero: No linear relationship exists between the variables.
Covariance indicates the direction of linear relationships.
Covariance and Correlation – Definition and Formula
A subset of the population is called a sample. In practice, covariance and correlation are usually calculated on samples rather than on whole populations, and the results are termed the sample covariance and the sample correlation. Both terms describe the relationship and dependency between the variables.
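Correlation measures the association between the variables.
r = Cov(x, y) / (σx σy)
Covariance explains the joint variability of the variables.
Cov(x, y) = Σ (xi − x̄)(yi − ȳ) / N
Where
xi = Data value of x
yi = Data value of y
x̄ = Mean of x
ȳ = Mean of y
N = Number of data values
σx, σy = Standard deviations of x and y
Dividing by N gives the population form. The unbiased sample covariance divides by N − 1 instead.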
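The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the population form (divisor N). The function names and the height/weight data are hypothetical and not part of the original article.

```python
from math import sqrt


def covariance(xs, ys):
    """Population covariance: average product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # Var(x) = Cov(x, x)
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical data, chosen only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 69.0, 77.0, 84.0]

cov_hw = covariance(heights_cm, weights_kg)
corr_hw = correlation(heights_cm, weights_kg)

print(f"covariance  = {cov_hw:.2f} (units: cm x kg)")
print(f"correlation = {corr_hw:.4f} (no units)")
```

The covariance carries the units of both variables, while the correlation is a pure number between -1 and +1.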
Correlation versus Covariance
Correlation is a function of covariance. Correlation values are standardised, whereas covariance values are not. The correlation coefficient is obtained by dividing the covariance of the variables by the product of their standard deviations. Standard deviation measures the absolute variability of a dataset. When the covariance is divided by the product of the standard deviations, the result falls in the range of -1 to +1, which is the range of correlation values. In other words, correlation is the normalised form of covariance.
In the formula of covariance, the units are the product of the units of the two variables. Correlation is dimensionless. It is a measure of the relationship between the variables. The covariance value is affected by a change of scale in the variables. If all the values of one variable are multiplied by a constant and all the values of the other variable are multiplied by the same or a different constant, the covariance is multiplied by the product of the two constants. Under the same change, the correlation value is not affected, provided both constants have the same sign. If the constants have opposite signs, only the sign of the correlation flips.
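The effect of a change of scale can be checked numerically. The sketch below uses hypothetical measurements and helper names (`cov`, `corr`) that are assumptions for illustration only. It converts metres to centimetres and kilograms to grams, then compares the two measures before and after.

```python
from math import sqrt


def cov(xs, ys):
    """Population covariance (divisor N)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def corr(xs, ys):
    """Correlation as covariance normalised by both standard deviations."""
    return cov(xs, ys) / sqrt(cov(xs, xs) * cov(ys, ys))


# Hypothetical measurements: lengths in metres, masses in kilograms.
length_m = [1.0, 2.0, 3.0, 4.0]
mass_kg = [2.0, 3.0, 5.0, 6.0]

# Change of scale: metres -> centimetres (x100), kilograms -> grams (x1000).
length_cm = [100.0 * v for v in length_m]
mass_g = [1000.0 * v for v in mass_kg]

cov_before, cov_after = cov(length_m, mass_kg), cov(length_cm, mass_g)
corr_before, corr_after = corr(length_m, mass_kg), corr(length_cm, mass_g)

# The covariance is multiplied by the product of the two constants.
print("covariance ratio:", cov_after / cov_before)
# The correlation is the same up to floating-point rounding.
print("correlation before/after:", corr_before, corr_after)
```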
Covariance vs Correlation: Comparison
Basis | Covariance | Correlation
Meaning | Covariance indicates the direction in which two variables vary together. Its magnitude depends on the units of the variables. | Correlation signifies both the strength and the direction of the linear association between the variables.
Relationship | Correlation can be derived from covariance. | Correlation gives the value of covariance on a standard scale.
Values | Lie between -∞ and +∞. | Lie between -1 and +1.
Change of scale | Covariance is affected by a change in scale. | Correlation isn't affected by a change in scale.
Units | Covariance has a definite unit, the product of the units of the two variables. | Correlation is a dimensionless number.
Correlation and Covariance for Standardised Attributes
It can be shown that the correlation between two attributes is equal to the covariance of the two standardised attributes. The first step is to standardise the two attributes, x and y, and obtain their z-scores, x′ and y′:
x′i = (xi − x̄) / σx and y′i = (yi − ȳ) / σy
The value of population covariance between the standardised attributes is calculated using the formula
Cov(x′, y′) = (1/N) Σ (x′i − mean(x′)) (y′i − mean(y′))
As standardisation performs mean-centring, the means of x′ and y′ are both zero, and the above equation can be written as
Cov(x′, y′) = (1/N) Σ x′i y′i
If these terms are substituted back using the definitions of the standardised attributes, then
Cov(x′, y′) = (1/N) Σ [(xi − x̄) / σx] [(yi − ȳ) / σy]
On simplification,
Cov(x′, y′) = Cov(x, y) / (σx σy) = r
Hence, correlation and covariance are the same if the attributes are standardised.
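The result for standardised attributes can also be verified numerically. The following is a small sketch under the population convention (divisor N). The data and the helper names `pcov` and `zscores` are hypothetical.

```python
from math import sqrt


def pcov(xs, ys):
    """Population covariance (divisor N)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def zscores(vs):
    """Standardise: subtract the mean, divide by the population standard deviation."""
    m = sum(vs) / len(vs)
    sd = sqrt(pcov(vs, vs))
    return [(v - m) / sd for v in vs]


# Hypothetical attribute values.
x = [2.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 7.0]

r = pcov(x, y) / sqrt(pcov(x, x) * pcov(y, y))  # correlation of the raw attributes
cov_z = pcov(zscores(x), zscores(y))            # covariance of the z-scores

print("correlation of x and y: ", r)
print("covariance of z-scores: ", cov_z)
```

The two printed values agree up to floating-point rounding.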
Solved Examples on Covariance and Correlation
Example 1: If the coefficient of correlation between x and y is 0.5 and their covariance is 16, and the SD of x is 4, then what is the SD of y?
Solution:
Given r = 0.5
Cov (x,y) = 16
σx = 4
σy = cov(x, y) / (r σx)
= 16 / (0.5 × 4)
= 16 / 2
= 8
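The rearrangement used in Example 1 can be written as a small helper. This is only a sketch. The function name `sd_from_corr` is hypothetical, and it assumes a non-zero correlation and a non-zero known standard deviation.

```python
def sd_from_corr(cov_xy, r, sd_known):
    """Solve r = cov / (sd_x * sd_y) for the unknown standard deviation."""
    if r == 0 or sd_known == 0:
        raise ValueError("r and the known standard deviation must be non-zero")
    return cov_xy / (r * sd_known)


# Values from Example 1: cov(x, y) = 16, r = 0.5, SD of x = 4.
sd_y = sd_from_corr(cov_xy=16, r=0.5, sd_known=4)
print("SD of y:", sd_y)
```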
Example 2: If σx = σy and x, y are related by u = x + y; v = x − y, what is the cov(u,v)?
Solution:
cov(u, v) = cov(x + y, x − y)
= var(x) − cov(x, y) + cov(y, x) − var(y)
= σx² − σy²
= 0, since σx = σy
⇒ cov(u, v) = 0
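The result of Example 2 can be checked on a small dataset. In this sketch the values are hypothetical. y is a rearrangement of x, so the two variables have equal variance, as the example requires.

```python
def pcov(xs, ys):
    """Population covariance (divisor N)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n


# Hypothetical data with equal variances: y is a permutation of x.
x = [1.0, 2.0, 3.0, 4.0]
y = [4.0, 2.0, 1.0, 3.0]

u = [a + b for a, b in zip(x, y)]  # u = x + y
v = [a - b for a, b in zip(x, y)]  # v = x - y

cov_uv = pcov(u, v)
var_diff = pcov(x, x) - pcov(y, y)  # var(x) - var(y)

print("cov(u, v):      ", cov_uv)
print("var(x) - var(y):", var_diff)
```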
Example 3: What is the correlation between x and a−x?
Solution:
Let u = a − x, and therefore,
Var(u) = Var(a − x)
= (−1)² Var(x)
= Var(x)
= σ²
Cov(x, a − x) = Cov(x, u)
= −Var(x) = −σ²
r = Cov(x, u) / (σx σu) = −σ² / (σ × σ)
= −1
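The correlation between x and a − x can be confirmed numerically. This sketch uses hypothetical values for x and an arbitrary constant a = 10. Any other constant gives the same result.

```python
from math import sqrt


def pcov(xs, ys):
    """Population covariance (divisor N)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((p - mx) * (q - my) for p, q in zip(xs, ys)) / n


def corr(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return pcov(xs, ys) / sqrt(pcov(xs, xs) * pcov(ys, ys))


# Hypothetical data and an arbitrary constant a.
x = [1.0, 3.0, 4.0, 8.0]
a = 10.0
u = [a - value for value in x]  # u = a - x

r_xu = corr(x, u)
print("Var(x):", pcov(x, x), " Var(a - x):", pcov(u, u))
print("correlation between x and a - x:", r_xu)
```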
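Example 4: If the correlation coefficient between x and y is 0.6, the covariance is 27, and the variance of y is 25, what is the variance of x?
Solution:
r = 0.6
cov(x, y) = 27
σy² = 25 ⇒ σy = 5
σx = cov(x, y) / (r σy)
= 27 / (0.6 × 5)
= 9
σx² = 81
Example 5: If the covariance between x and y is 30, the variance of x is 25, and the variance of y is 144, find the correlation coefficient.
Solution:
σx = √25 = 5, σy = √144 = 12
r = cov(x, y) / (σx σy)
= 30 / (5 × 12)
= 0.5
Example 6: Let the correlation coefficient between X and Y be 0.6. Random variables Z and W are defined as Z = X + 5 and W = Y / 3. What is the correlation coefficient between Z and W?
Solution:
Given r(X, Y) = 0.6
Adding a constant does not change the covariance or the standard deviation, so σZ = σX.
Dividing by 3 scales both the covariance and the standard deviation by 1/3, so cov(Z, W) = cov(X, Y) / 3 and σW = σY / 3.
r(Z, W) = [cov(X, Y) / 3] / [σX × (σY / 3)]
= cov(X, Y) / (σX σY)
= 0.6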
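The invariance behind the example with Z = X + 5 and W = Y / 3 can be illustrated numerically. The data below is hypothetical, so its correlation is not 0.6. The point of the sketch is only that the shift and the positive rescaling leave the correlation unchanged.

```python
from math import sqrt


def corr(xs, ys):
    """Pearson correlation computed from population moments."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov_xy = sum((p - mx) * (q - my) for p, q in zip(xs, ys)) / n
    var_x = sum((p - mx) ** 2 for p in xs) / n
    var_y = sum((q - my) ** 2 for q in ys) / n
    return cov_xy / sqrt(var_x * var_y)


# Hypothetical observations of X and Y.
x_vals = [3.0, 5.0, 6.0, 9.0, 12.0]
y_vals = [9.0, 6.0, 15.0, 12.0, 21.0]

z_vals = [x + 5 for x in x_vals]  # Z = X + 5 (shift only)
w_vals = [y / 3 for y in y_vals]  # W = Y / 3 (positive rescaling)

r_xy = corr(x_vals, y_vals)
r_zw = corr(z_vals, w_vals)

print("corr(X, Y):", r_xy)
print("corr(Z, W):", r_zw)
```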
Frequently Asked Questions
What do you mean by correlation?
Correlation indicates how variables are related. It is the degree to which two or more variables are linearly related.
What do you mean by positive correlation?
A positive correlation denotes the relation between two variables that tend to move in the same direction. When one variable tends to increase with the increase in another variable, a positive correlation exists.
Give the formula for the correlation between X and Y.
Correlation between X and Y is given by Corr(X, Y) = cov(X, Y) / (σX σY). Here, σX is the standard deviation of X, and σY is the standard deviation of Y.
What are the 3 types of correlation?
The three types of correlation are positive correlation, negative correlation and zero or no correlation.
Can correlation be negative?
Yes. Correlation can be negative, positive or zero.