GridSearchCV: Scoring does not use the chosen XGBRegressor score method

Geir Inge :

Scikit-learn's GridSearchCV is used for hyperparameter tuning of XGBRegressor models. Regardless of the eval_metric specified in XGBRegressor().fit(), GridSearchCV produces the same score values. On https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html the scoring parameter is documented as: "If None, the estimator's score method is used." This does not seem to happen; I always get the same value. How can I get results corresponding to the XGBRegressor eval_metric?

This sample code:

import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.datasets import load_boston
import xgboost as xgb

rng = np.random.RandomState(31337)

boston = load_boston()  # note: load_boston was removed in scikit-learn 1.2
y = boston['target']
X = boston['data']

kf = KFold(n_splits=2)  # random_state has no effect unless shuffle=True; recent scikit-learn raises an error if it is set without shuffling
folds = list(kf.split(X))

xgb_model = xgb.XGBRegressor(objective='reg:squarederror', verbose=False)
reg = GridSearchCV(estimator=xgb_model, 
                   param_grid= {'max_depth': [2], 'n_estimators': [50]}, 
                   cv=folds,
                   verbose=False)

reg.fit(X, y, eval_metric='mae', verbose=False)
print('GridSearchCV mean(mae)?:  ', reg.cv_results_['mean_test_score'])
# -----------------------------------------------
reg.fit(X, y, eval_metric='rmse', verbose=False)
print('GridSearchCV mean(rmse)?: ', reg.cv_results_['mean_test_score'])
print("----------------------------------------------------")

xgb_model.set_params(max_depth=2, n_estimators=50)

# Note: each fit below trains and evaluates on the same rows
# (eval_set equals the training data), so these are in-sample errors.
xgb_model.fit(X[folds[0][0],:], y[folds[0][0]], eval_metric='mae',
              eval_set=[(X[folds[0][0],:], y[folds[0][0]])], verbose=False)
print('XGBRegressor 0-mae:', xgb_model.evals_result()['validation_0']['mae'][-1])
xgb_model.fit(X[folds[0][1],:],y[folds[0][1]], eval_metric='mae', 
              eval_set = [(X[folds[0][1],:],y[folds[0][1]])], verbose=False)
print('XGBRegressor 1-mae:', xgb_model.evals_result()['validation_0']['mae'][-1])

xgb_model.fit(X[folds[0][0],:],y[folds[0][0]], eval_metric='rmse', 
              eval_set = [(X[folds[0][0],:],y[folds[0][0]])], verbose=False)
print('XGBRegressor 0-rmse:', xgb_model.evals_result()['validation_0']['rmse'][-1])
xgb_model.fit(X[folds[0][1],:],y[folds[0][1]], eval_metric='rmse', 
              eval_set = [(X[folds[0][1],:],y[folds[0][1]])], verbose=False)
print('XGBRegressor 1-rmse:', xgb_model.evals_result()['validation_0']['rmse'][-1])

returns the following (I expected the numbers above the line to be the average of the corresponding numbers below it):

GridSearchCV mean(mae)?:   [0.70941007]
GridSearchCV mean(rmse)?:  [0.70941007]
----------------------------------------------------
XGBRegressor 0-mae: 1.273626
XGBRegressor 1-mae: 1.004947
XGBRegressor 0-rmse: 1.647694
XGBRegressor 1-rmse: 1.290872
Sergey Bushmanov :

TL;DR: What you are getting back is the so-called R^2, or coefficient of determination. It is the default metric of XGBRegressor's score method, which GridSearchCV picks up when scoring=None.
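
You can verify this directly (a minimal sketch, reusing X, y, and folds from the question with the same max_depth=2, n_estimators=50 parameters): average the per-fold R^2 computed by hand, and it should reproduce the mean_test_score printed above, whatever eval_metric was passed to fit.

from sklearn.metrics import r2_score

# Verification sketch: the per-fold R^2, averaged, should match
# GridSearchCV's mean_test_score when scoring=None.
model = xgb.XGBRegressor(objective='reg:squarederror',
                         max_depth=2, n_estimators=50)
fold_scores = []
for train_idx, test_idx in folds:
    model.fit(X[train_idx], y[train_idx])
    fold_scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))
print(np.mean(fold_scores))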

Compare the results when scoring is supplied explicitly:

from sklearn.metrics import make_scorer, r2_score, mean_squared_error
xgb_model = xgb.XGBRegressor(objective='reg:squarederror', verbose=False)

reg = GridSearchCV(estimator=xgb_model, scoring=make_scorer(r2_score),
                   param_grid= {'max_depth': [2], 'n_estimators': [50]}, 
                   cv=folds,
                   verbose=False)

reg.fit(X, y)
reg.best_score_
0.7333542105472226

with the results when scoring=None:

reg = GridSearchCV(estimator=xgb_model, scoring=None,
                   param_grid= {'max_depth': [2], 'n_estimators': [50]}, 
                   cv=folds,
                   verbose=False)

reg.fit(X, y)
reg.best_score_
0.7333542105472226

If you read the GridSearchCV docstring:

estimator : estimator object. This is assumed to implement the scikit-learn estimator interface. Either estimator needs to provide a score function, or scoring must be passed.

At this point you would want to check the docs for xgb_model.score (e.g. via xgb_model.score? in IPython/Jupyter):

Signature: xgb_model.score(X, y, sample_weight=None)
Docstring:
Return the coefficient of determination R^2 of the prediction.
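
As a quick check (a small sketch using the first fold from the question), score returns exactly the R^2 you would compute by hand:

from sklearn.metrics import r2_score

# The two printed values should be identical.
train_idx, test_idx = folds[0]
xgb_model.fit(X[train_idx], y[train_idx])
print(xgb_model.score(X[test_idx], y[test_idx]))
print(r2_score(y[test_idx], xgb_model.predict(X[test_idx])))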

So, with the help of those documents: if you do not like XGBRegressor's default R^2 score method, provide your scoring function explicitly to GridSearchCV.

E.g. if you want RMSE you may do:

reg = GridSearchCV(estimator=xgb_model,  
                   scoring=make_scorer(mean_squared_error, squared=False),
                   param_grid= {'max_depth': [2], 'n_estimators': [50]}, 
                   cv=folds,
                   verbose=False)

reg.fit(X, y)
reg.best_score_
4.618242594168436
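
Two caveats on that snippet. First, make_scorer defaults to greater_is_better=True, so with several parameter candidates GridSearchCV would pick the one with the largest RMSE; for real tuning pass greater_is_better=False, which makes the stored scores come back negated. (In newer scikit-learn versions, root_mean_squared_error replaces the squared=False flag.) Second, to get the MAE the question originally asked about, a sketch along the same lines:

from sklearn.metrics import make_scorer, mean_absolute_error

reg = GridSearchCV(estimator=xgb_model,
                   # lower MAE is better, so scores are stored negated
                   scoring=make_scorer(mean_absolute_error, greater_is_better=False),
                   param_grid={'max_depth': [2], 'n_estimators': [50]},
                   cv=folds,
                   verbose=False)

reg.fit(X, y)
print(reg.cv_results_['mean_test_score'])  # mean of the negated per-fold MAEs

The built-in string scorers work too, e.g. scoring='neg_mean_absolute_error'.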
