How do I calculate the Adjusted R-squared score using scikit-learn?
Question:
I’m already using the r2_score function but don’t understand how I can get the “adjusted” R^2 score from this. The description at this page doesn’t mention it – maybe it’s the adjusted score by default?
Answers:
No, r2_score returns the ordinary (unadjusted) R2. Adjusted R2 also requires the number of independent variables, which a standalone metric function cannot know: it only receives the true and predicted values, not how y_pred was calculated.
However, you can calculate the adjusted R2 from R2 with a simple formula:
Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
where n is the number of observations in the sample and p is the number of independent variables in the model.
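As a minimal sketch of that formula, 1 - (1 - R2) * (n - 1) / (n - p - 1), here is a dependency-free helper. It assumes p counts the predictors only (the same convention as the X.shape[1] snippets in this thread). The name adjusted_r2 and the sample numbers are made up for illustration.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 from plain R^2, n observations and p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        # The residual degrees of freedom must be positive for the formula to make sense.
        raise ValueError("need n > p + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical numbers: R^2 = 0.80 from 50 observations and 5 predictors.
print(adjusted_r2(0.80, n=50, p=5))
```

For any R2 below 1 and at least one predictor, the result is smaller than the plain R2, which is the penalty for model size that the adjustment is meant to apply.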
alternatively…
# adjusted R-squared
1 - ( 1-model.score(X, y) ) * ( len(y) - 1 ) / ( len(y) - X.shape[1] - 1 )
Simple calculation of Adj. R2
Adj_r2 = 1 - (1-r2_score(y, y_pred)) * (len(y)-1)/(len(y)-X.shape[1]-1)
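To show where X, y and y_pred in the one-liners above come from, here is a small end-to-end sketch. The synthetic data and the choice of LinearRegression are assumptions for illustration only; any fitted regressor would do.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data, made up for illustration: 100 rows, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

n, p = X.shape  # observations, predictors (intercept not counted)
r2 = r2_score(y, y_pred)  # same value as model.score(X, y)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

Both one-liners give the same number here, because model.score(X, y) on a scikit-learn regressor is the R2 of its predictions on X.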
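The Wikipedia page has revised this formula over time. To match its state at the time of writing, this would be the appropriate formula:
Adj R2 = 1-(1-R2)*(n-1)/(n-p)
Note that this agrees with the n-p-1 form only if p here counts every fitted parameter, including the intercept. If p counts the predictors alone, keep n-p-1 in the denominator.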
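The difference between the n-p and n-p-1 denominators comes down to whether the intercept is counted. A quick numeric check makes this concrete (the function names and the numbers are hypothetical):

```python
def adj_r2_predictors(r2: float, n: int, p: int) -> float:
    # p = number of predictors, intercept NOT counted
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def adj_r2_parameters(r2: float, n: int, k: int) -> float:
    # k = number of fitted parameters, intercept counted
    return 1 - (1 - r2) * (n - 1) / (n - k)


r2, n, p = 0.75, 40, 3
a = adj_r2_predictors(r2, n, p)
b = adj_r2_parameters(r2, n, p + 1)  # same model, intercept included in the count
c = adj_r2_parameters(r2, n, p)      # intercept forgotten: penalty is too small

print(a, b, c)
```

Here a and b are identical, while c comes out slightly higher, so passing X.shape[1] into the n-p form overstates the adjusted score for a model with an intercept.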
With sklearn you could write some reusable code such as:
import numpy as np
from sklearn.metrics import r2_score

def r2(actual: np.ndarray, predicted: np.ndarray) -> float:
    """R2 score."""
    return r2_score(actual, predicted)

def adjr2(actual: np.ndarray, predicted: np.ndarray, rowcount: int, featurecount: int) -> float:
    """Adjusted R2 score."""
    # The n-p denominator expects featurecount to include the intercept;
    # with predictors only, use rowcount - featurecount - 1 instead.
    return 1 - (1 - r2(actual, predicted)) * (rowcount - 1) / (rowcount - featurecount)