`GridSearchCV`, as @Gauthier Feuillen said, is used to search for the best parameters of an estimator for the given data.
Description of GridSearchCV:

1. `gcv = GridSearchCV(pipe, clf_params, cv=cv)`
2. `gcv.fit(features, labels)`
3. `clf_params` will be expanded into all possible parameter combinations using `ParameterGrid`.
4. `features` will now be split into `features_train` and `features_test` using `cv`. Same for `labels`.
5. Now the grid search estimator (`pipe`) will be trained using `features_train` and `labels_train`, and scored using `features_test` and `labels_test`.
6. For each possible combination of parameters in step 3, steps 4 and 5 will be repeated for `cv_iterations`. The average of the scores across the cv iterations will be calculated and assigned to that parameter combination. This can be accessed using the `cv_results_` attribute of the grid search.
7. For the parameters which give the best score, the internal estimator will be re-initialized using those parameters and refit on the whole data supplied to it (`features` and `labels`).
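The steps above can be sketched by hand and compared with what `GridSearchCV` reports. This is only a rough illustration, not the library's actual implementation. The iris data, the `StandardScaler` + `LogisticRegression` pipeline, the `clf__C` grid and the `KFold` splitter are assumptions made for the example; only the names `pipe`, `clf_params`, `cv`, `features` and `labels` come from the description.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data, pipeline and grid (not taken from the question).
features, labels = load_iris(return_X_y=True)
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])
clf_params = {"clf__C": [0.1, 1.0, 10.0]}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Expand the grid into individual parameter combinations.
candidates = list(ParameterGrid(clf_params))

mean_scores = []
for params in candidates:
    fold_scores = []
    # Split the data with cv, train on the training part, score on the test part.
    for train_idx, test_idx in cv.split(features, labels):
        est = clone(pipe).set_params(**params)
        est.fit(features[train_idx], labels[train_idx])
        fold_scores.append(est.score(features[test_idx], labels[test_idx]))
    # The average over the cv iterations is the score of this combination.
    mean_scores.append(np.mean(fold_scores))

# Re-initialize with the best combination and refit on ALL the data.
best_params = candidates[int(np.argmax(mean_scores))]
final_estimator = clone(pipe).set_params(**best_params).fit(features, labels)

# The same search done by GridSearchCV, for comparison.
gcv = GridSearchCV(pipe, clf_params, cv=cv)
gcv.fit(features, labels)
print(mean_scores)
print(gcv.cv_results_["mean_test_score"])
```

With the same splitter and a deterministic estimator, the hand-computed averages should agree with `gcv.cv_results_["mean_test_score"]`, and `final_estimator` plays the role of `gcv.best_estimator_`.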
Because of the last step, you get different scores in the first and second approaches. In the first approach, all the data is used for training and you predict on that same data, whereas the second approach predicts on previously unseen data.
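The effect can be reproduced with a small experiment. The two approaches from the question are not shown here, so this is only an assumed analogue: the synthetic data from `make_classification`, the `DecisionTreeClassifier` and the `max_depth` grid are all illustrative choices. It scores the refit estimator once on the data it was refit on and once on data held out before the search.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed synthetic data with some label noise (not from the question).
features, labels = make_classification(
    n_samples=300, n_features=10, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=0
)

gcv = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [3, 5, None]},
    cv=5,
)
# After the search, the best estimator is refit on all of X_train.
gcv.fit(X_train, y_train)

# Score on data the refit estimator has already seen.
score_seen = gcv.score(X_train, y_train)
# Score on data that was never passed to fit.
score_unseen = gcv.score(X_test, y_test)

print("seen data:  ", score_seen)
print("unseen data:", score_unseen)
```

The score on the seen data is typically the more optimistic of the two, which is why it should not be read as an estimate of performance on new data.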