In statistical analysis, the coefficient of determination measures how well a model explains and predicts outcomes. It is also known as R squared (R2), and it serves as a guideline for measuring the model's accuracy. In this article, let us discuss the definition, formula, and properties of the coefficient of determination in detail.
Coefficient of Determination Definition
The coefficient of determination, or R squared, is the proportion of the variance in the dependent variable that is predictable from the independent variable. It indicates how much of the variation in the given data set the model explains.
- The coefficient of determination is the square of the correlation (r), thus it ranges from 0 to 1.
- With linear regression, the coefficient of determination is equal to the square of the correlation between the x and y variables.
- If R2 is equal to 0, then the dependent variable cannot be predicted from the independent variable.
- If R2 is equal to 1, then the dependent variable can be predicted from the independent variable without any error.
- If R2 is between 0 and 1, it indicates the extent to which the dependent variable is predictable. An R2 of 0.10 means that 10 per cent of the variance in the y variable is predicted from the x variable, an R2 of 0.20 means that 20 per cent is predicted, and so on.
The value of R2 shows whether the model is a good fit for the given data set. What counts as a good fit depends on the context of the analysis. For instance, in a few fields like rocket science, R2 is expected to be close to 100%. The minimum theoretical value, R2 = 0, is rarely observed exactly in practice, and for a least-squares linear regression with an intercept term R2 is never negative.
The value of R2 never decreases when a new predictor variable is added, even if that predictor is not associated with the outcome. The adjusted R2 carries the same information as the original R2 but penalizes the number of predictor variables in the model. When new predictors are added to a multiple linear regression model, R2 increases. The adjusted R2 increases only if the increase in R2 is greater than would be expected by chance alone.
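The penalty described above can be sketched with the commonly used adjusted R2 formula, 1 - (1 - R2)(n - 1)/(n - k - 1). The formula, the function name `adjusted_r2`, the symbols n (number of observations) and k (number of predictors), and the sample numbers are all assumptions for illustration and do not come from this article.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n observations.

    Uses the common formula 1 - (1 - R^2) * (n - 1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more than k + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical numbers: 30 observations, a model with 2 predictors,
# then a third predictor that raises R^2 only slightly.
before = adjusted_r2(0.700, n=30, k=2)
after = adjusted_r2(0.702, n=30, k=3)

print(f"adjusted R^2 with 2 predictors: {before:.4f}")
print(f"adjusted R^2 with 3 predictors: {after:.4f}")
```

In this made-up case R2 rises from 0.700 to 0.702, yet the adjusted value falls, because the small gain does not outweigh the penalty for the extra predictor.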
Following is the regression line equation:
p̂ = aq + r
where p̂ is the predicted value of p for a given q, a is the slope, and r is the intercept (not the correlation coefficient used elsewhere in this article). The coefficient of determination is a method of checking how well the least-squares equation p̂ = aq + r predicts p.
Coefficient of Determination Formula
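One standard way to write the formula is given below as a sketch under stated assumptions rather than as this article's own notation. Here x is the independent variable, y is the dependent variable (q and p in the regression line equation), ŷ is the value predicted by the regression line, ȳ is the mean of y, and n is the number of data points.

$$
R^2 = \frac{\text{Explained variation}}{\text{Total variation}}
    = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

For a least-squares line with an intercept, this equals the square of the correlation coefficient:

$$
r = \frac{n \sum xy - \sum x \sum y}
         {\sqrt{\left[ n \sum x^2 - \left( \sum x \right)^2 \right]
                \left[ n \sum y^2 - \left( \sum y \right)^2 \right]}},
\qquad R^2 = r^2
$$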
Properties of Coefficient of Determination
- It gives the proportion of the variation in one variable that can be predicted from the other variable.
- It measures how reliably predictions can be made from the given data.
- It is the ratio Explained variation / Total variation.
- It indicates the strength of the linear association between the variables.
- If the value of r2 is close to 1, the values of y lie close to the regression line, and if it is close to 0, the values lie far from the regression line.
- It helps in determining the strength of association between different variables.
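The "Explained variation / Total variation" property can be illustrated with a minimal sketch that fits the least-squares line p̂ = aq + r and then takes the ratio directly. The function names `fit_line` and `r_squared` and the sample data are hypothetical, chosen only for this example.

```python
def fit_line(q, p):
    """Return least-squares slope a and intercept r for p_hat = a*q + r."""
    n = len(q)
    q_mean = sum(q) / n
    p_mean = sum(p) / n
    s_qp = sum((qi - q_mean) * (pi - p_mean) for qi, pi in zip(q, p))
    s_qq = sum((qi - q_mean) ** 2 for qi in q)
    a = s_qp / s_qq
    r = p_mean - a * q_mean
    return a, r


def r_squared(q, p):
    """Explained variation divided by total variation for the fitted line."""
    a, r = fit_line(q, p)
    p_mean = sum(p) / len(p)
    p_hat = [a * qi + r for qi in q]
    explained = sum((ph - p_mean) ** 2 for ph in p_hat)
    total = sum((pi - p_mean) ** 2 for pi in p)
    return explained / total


q = [1, 2, 3, 4, 5]
p_exact = [3, 5, 7, 9, 11]   # lies exactly on p = 2q + 1
p_noisy = [2, 6, 5, 10, 9]   # scattered around a rising line

print(r_squared(q, p_exact))
print(r_squared(q, p_noisy))
```

The points that lie exactly on a line give a ratio of 1, while the scattered points give a value between 0 and 1, matching the property about how close the values are to the regression line.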
Steps to Find the Coefficient of Determination
- Find the correlation coefficient, r.
- Square r to get r2.
- Convert r2 to a percentage.
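The steps above can be sketched in a few lines of Python. The `correlation` helper uses the usual sums-based formula for the Pearson correlation coefficient, and both the helper name and the sample data are assumptions made for this example.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)   # Step 1: find the correlation coefficient
r2 = r ** 2             # Step 2: square it
percent = 100 * r2      # Step 3: express the squared value as a percentage

print(f"r = {r:.4f}, r2 = {r2:.4f}, {percent:.1f}% of the variance in y")
```

For this sample, about 60 per cent of the variance in y is predicted from x.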