scikit-learn TypeError: If no scoring is specified, the estimator passed should have a 'score' method

Posted on 2021-01-29 17:08:38

I have created a custom model in Python using scikit-learn, and I want to evaluate it with cross-validation.

The class for the model is defined as follows:

import copy

import numpy as np


class MultiLabelEnsemble:
    ''' MultiLabelEnsemble(predictorInstance, balance=False)
        Like OneVsRestClassifier: Wrapping class to train multiple models when
        several objectives are given as target values. Its predictor may be an ensemble.
        This class can be used to create a one-vs-rest classifier from multiple 0/1 labels
        to treat a multi-label problem or to create a one-vs-rest classifier from
        a categorical target variable.
        Arguments:
            predictorInstance -- A predictor instance is passed as argument (be careful, you must
            instantiate the predictor class before passing the argument, i.e. end with (),
            e.g. LogisticRegression().
            balance -- True/False. If True, attempts to re-balance classes in training data
            by including a random sample (without replacement) s.t. the largest class has at most
            2 times the number of elements of the smallest one.
        Example Usage: mymodel = MultiLabelEnsemble(GradientBoostingClassifier(), True)'''

    def __init__(self, predictorInstance, balance=False):
        self.predictors = [predictorInstance]
        self.n_label = 1
        self.n_target = 1
        self.n_estimators = 1  # for predictors that are ensembles of estimators
        self.balance = balance

    def __repr__(self):
        return "MultiLabelEnsemble"

    def __str__(self):
        return ("MultiLabelEnsemble : \n"
                + "\tn_label={}\n".format(self.n_label)
                + "\tn_target={}\n".format(self.n_target)
                + "\tn_estimators={}\n".format(self.n_estimators)
                + str(self.predictors[0]))

    def fit(self, Xtrain, Ytrain):
        if len(Ytrain.shape) == 1:
            Ytrain = np.array([Ytrain]).transpose()  # Transform vector into column matrix
            # This is NOT what we want: Y = Y.reshape( -1, 1 ), because Y.shape[1] out of range
        self.n_target = Ytrain.shape[1]            # Num target values = num col of Y
        self.n_label = len(set(Ytrain.ravel()))    # Num labels = num classes (categories of categorical var if n_target=1 or n_target if labels are binary)
        # Create the right number of copies of the predictor instance
        if len(self.predictors) != self.n_target:
            predictorInstance = self.predictors[0]
            self.predictors = [predictorInstance]
            for i in range(1, self.n_target):
                self.predictors.append(copy.copy(predictorInstance))
        # Fit all predictors
        for i in range(self.n_target):
            # Update the number of desired predictors
            if hasattr(self.predictors[i], 'n_estimators'):
                self.predictors[i].n_estimators = self.n_estimators
            # Subsample if desired
            if self.balance:
                pos = Ytrain[:, i] > 0
                neg = Ytrain[:, i] <= 0
                if sum(pos) < sum(neg):
                    chosen = pos
                    not_chosen = neg
                else:
                    chosen = neg
                    not_chosen = pos
                num = sum(chosen)
                # Indices of the majority-class samples (replaces the Python 2-only
                # filter(lambda(x): ...) / zip(*idx)[0] idiom, which fails on Python 3)
                idx = np.flatnonzero(not_chosen)
                np.random.shuffle(idx)
                chosen[idx[0:min(num, len(idx))]] = True
                # Train with chosen samples
                self.predictors[i].fit(Xtrain[chosen, :], Ytrain[chosen, i])
            else:
                self.predictors[i].fit(Xtrain, Ytrain[:, i])
        return self  # scikit-learn convention: fit returns the estimator

    def predict_proba(self, Xtrain):
        if len(Xtrain.shape) == 1:  # IG modif Feb3 2015
            # Reshape in place; the original assigned the result to an unused variable X
            Xtrain = np.reshape(Xtrain, (-1, 1))
        prediction = self.predictors[0].predict_proba(Xtrain)
        if self.n_label == 2:            # Keep only 1 prediction, 1st column = (1 - 2nd column)
            prediction = prediction[:, 1]
        for i in range(1, self.n_target):  # More than 1 target, we assume that labels are binary
            new_prediction = self.predictors[i].predict_proba(Xtrain)[:, 1]
            prediction = np.column_stack((prediction, new_prediction))
        return prediction

When I call this class for cross-validation like this:

kf = cross_validation.KFold(len(Xtrain), n_folds=10)
score = cross_val_score(self.model, Xtrain, Ytrain, cv=kf, n_jobs=-1).mean()

I get the following error:

TypeError: If no scoring is specified, the estimator passed should have a 'score' method. The estimator MultiLabelEnsemble does not.

How do I create a score method?

1 Answer
  • 面试哥 2021-01-29

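    The easiest way to make the error go away is to pass scoring="accuracy" or scoring="hamming" to cross_val_score. The cross_val_score function itself does not know what kind of problem you are trying to solve, so it cannot know which metric is appropriate. It looks like you are doing multi-label classification, so perhaps you want to use the Hamming loss?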
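A minimal sketch of passing an explicit scorer, under a few assumptions: it uses the newer sklearn.model_selection module rather than the cross_validation module from the question, the MajorityLabel class and the random data are hypothetical stand-ins, and the Hamming scorer is built with make_scorer from sklearn.metrics.hamming_loss in case a given scikit-learn version does not accept "hamming" as a scorer name. Note that predict-based scorers such as "accuracy" call the estimator's predict method, which the class in the question does not define.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.metrics import hamming_loss, make_scorer
from sklearn.model_selection import KFold, cross_val_score


class MajorityLabel(BaseEstimator):
    """Hypothetical estimator with fit/predict but no score method."""

    def fit(self, X, y):
        # Remember the most frequent value (0 or 1) of each label column.
        self.majority_ = (np.asarray(y).mean(axis=0) >= 0.5).astype(int)
        return self

    def predict(self, X):
        # Predict the same majority label vector for every sample.
        return np.tile(self.majority_, (len(X), 1))


# Illustrative random multi-label data (not from the question).
rng = np.random.RandomState(0)
X = rng.rand(40, 3)
Y = (rng.rand(40, 2) > 0.5).astype(int)

# With greater_is_better=False the scorer returns the negated loss,
# so higher values still mean better models.
hamming_scorer = make_scorer(hamming_loss, greater_is_better=False)
scores = cross_val_score(MajorityLabel(), X, Y,
                         cv=KFold(n_splits=5), scoring=hamming_scorer)
mean_hamming_loss = -scores.mean()
```

Because the scorer is passed explicitly, cross_val_score never looks for a score method on the estimator.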

    You can also implement a score method, as described in the "Rolling your own estimator" documentation; it has the signature def score(self, X, y_true). See http://scikit-learn.org/stable/developers/#different-objects
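As a rough illustration of such a score method, the sketch below thresholds predict_proba at 0.5 and returns 1 minus the Hamming loss, so that larger values are better. The LabelFrequencyModel class, the 0.5 threshold, and the choice of metric are assumptions made for the example, not something the question or the scikit-learn documentation prescribes.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.model_selection import cross_val_score


class LabelFrequencyModel(BaseEstimator):
    """Hypothetical multi-label model exposing predict_proba and score."""

    def fit(self, X, y):
        # Per-label positive rate, used as a constant probability estimate.
        self.rates_ = np.asarray(y, dtype=float).mean(axis=0)
        return self

    def predict_proba(self, X):
        # One probability per label, repeated for every sample.
        return np.tile(self.rates_, (len(X), 1))

    def score(self, X, y_true):
        # Threshold probabilities at 0.5 (an assumption), then return
        # 1 - Hamming loss so that larger values mean better predictions.
        y_pred = (self.predict_proba(X) >= 0.5).astype(int)
        return 1.0 - np.mean(y_pred != np.asarray(y_true))


# Illustrative random multi-label data (not from the question).
rng = np.random.RandomState(0)
X = rng.rand(30, 4)
Y = (rng.rand(30, 3) > 0.5).astype(int)

# No scoring argument: cross_val_score falls back to the estimator's score method.
scores = cross_val_score(LabelFrequencyModel(), X, Y, cv=3)
```

A method of the same shape could be added to MultiLabelEnsemble; for cross_val_score to clone the estimator, the class would also need get_params, which inheriting from BaseEstimator provides when the constructor stores its arguments under the same names.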

    By the way, you do know about OneVsRestClassifier, right? It looks a bit like you are re-implementing it.
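For comparison, a short sketch of the built-in route: OneVsRestClassifier wraps one copy of a base classifier per label and already provides a score method, so cross_val_score works without a scoring argument. The synthetic dataset, the LogisticRegression base estimator, and the parameter values are illustrative assumptions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier

# Synthetic multi-label data: Y is a binary indicator matrix with 3 label columns.
X, Y = make_multilabel_classification(
    n_samples=100, n_features=10, n_classes=3, random_state=0
)

# One LogisticRegression is fitted per label column.
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))

# The default score for classifiers is accuracy; for multi-label targets this is
# subset accuracy (a sample counts only if all of its labels are predicted correctly).
scores = cross_val_score(model, X, Y, cv=5)
```

Subset accuracy is strict, so passing a different scorer may be preferable when partial matches should count.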


