GridSearch for an estimator inside a OneVsRestClassifier

Posted 2021-01-29 17:07:06

I want to run GridSearchCV on an SVC model, but using the one-vs-rest strategy. For the latter part, I can just do this:

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

My problem is with the parameters. Say I want to try the following values:

parameters = {"C":[1,2,4,8], "kernel":["poly","rbf"],"degree":[1,2,3,4]}

To run GridSearchCV, I should do something like this:

 cv_generator = StratifiedKFold(y, k=10)
 model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score, n_jobs=1, cv=cv_generator)

However, when I run it I get:

Traceback (most recent call last):
  File "/.../main.py", line 66, in <module>
    argclass_sys.set_model_parameters(model_name="SVC", verbose=3, file_path=PATH_ROOT_MODELS)
  File "/.../base.py", line 187, in set_model_parameters
    model_tunning.fit(self.feature_encoder.transform(self.train_feats), self.label_encoder.transform(self.train_labels))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 354, in fit
    return self._fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 392, in _fit
    for clf_params in grid for train, test in cv)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 473, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 296, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 124, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 85, in fit_grid_point
    clf.set_params(**clf_params)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 241, in set_params
    % (key, self.__class__.__name__))
ValueError: Invalid parameter kernel for estimator OneVsRestClassifier

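Basically, since the SVC is inside a OneVsRestClassifier, and that wrapper is the estimator I pass to GridSearchCV, the SVC's parameters cannot be reached under their plain names.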
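
As a minimal sketch of that diagnosis, assuming a recent scikit-learn release, the wrapper's accepted parameter names can be listed with get_params(), and the rejection can be reproduced without running a grid search. The variable names valid_names and rejected are illustrative only.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

# Every name that set_params (and therefore a parameter grid) may use
# on the wrapper, including the nested names of the inner SVC.
valid_names = sorted(model_to_set.get_params())
print(valid_names)

# A bare "kernel" is not one of the wrapper's own parameters,
# so set_params rejects it with a ValueError.
try:
    model_to_set.set_params(kernel="rbf")
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```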

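To accomplish what I want, I see two solutions:

  1. When creating the SVC, somehow tell it to use the one-vs-rest strategy instead of the one-vs-one strategy.
  2. Somehow tell GridSearchCV that the parameters belong to the estimator inside the OneVsRestClassifier.

I have not yet found a way to do either of these. Do you know of a way to do one of them? Or could you suggest another way to get the same result?

Thanks!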

1 Answer
  • 面试哥 2021-01-29

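    When you use nested estimators with grid search, you can scope the parameters using __ as a separator. In this case, the SVC model is stored as an attribute named estimator inside the OneVsRestClassifier model: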

    from sklearn.datasets import load_iris
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV
    from sklearn.metrics import f1_score
    
    iris = load_iris()
    
    model_to_set = OneVsRestClassifier(SVC(kernel="poly"))
    
    parameters = {
        "estimator__C": [1,2,4,8],
        "estimator__kernel": ["poly","rbf"],
        "estimator__degree":[1, 2, 3, 4],
    }
    
    model_tunning = GridSearchCV(model_to_set, param_grid=parameters,
                                 score_func=f1_score)
    
    model_tunning.fit(iris.data, iris.target)
    
    print model_tunning.best_score_
    print model_tunning.best_params_
    

    Which yields:

    0.973290762737
    {'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 2}
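
The listing above imports sklearn.grid_search, passes score_func, and uses the Python 2 print statement, all of which belong to older releases. The following is a hedged sketch of the same estimator__ approach for a recent scikit-learn on Python 3. The scoring string "f1_macro", the 5 shuffled stratified folds, and the name model_tuning are assumptions chosen for illustration, not part of the original answer, so the printed score need not match the output shown above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

# The "estimator__" prefix routes each key to the SVC stored in the
# wrapper's `estimator` attribute.
parameters = {
    "estimator__C": [1, 2, 4, 8],
    "estimator__kernel": ["poly", "rbf"],
    "estimator__degree": [1, 2, 3, 4],
}

# Assumptions: macro-averaged F1 as the metric, 5 shuffled stratified folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
model_tuning = GridSearchCV(
    model_to_set,
    param_grid=parameters,
    scoring="f1_macro",
    cv=cv,
)
model_tuning.fit(X, y)

print(model_tuning.best_score_)
print(model_tuning.best_params_)
```

After fitting, model_tuning.best_estimator_ is a OneVsRestClassifier whose inner SVC carries the selected values, so it can be used directly for prediction.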
    

