nolearn for multi-label classification
I am trying to use the DBN class imported from the nolearn package. Here is my code:
from nolearn.dbn import DBN
import numpy as np
from sklearn import cross_validation
fileName = 'data.csv'
fileName_1 = 'label.csv'
data = np.genfromtxt(fileName, dtype=float, delimiter = ',')
label = np.genfromtxt(fileName_1, dtype=int, delimiter = ',')
clf = DBN(
    [data, 300, 10],
    learn_rates=0.3,
    learn_rate_decays=0.9,
    epochs=10,
    verbose=1,
)
clf.fit(data,label)
score = cross_validation.cross_val_score(clf, data, label,scoring='f1', cv=10)
print score
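My data has shape (1231, 229) and my labels have shape (1231, 13), so the label set looks like ([0 0 1 0 1 0 0 0 0 0 0 1 1 0], ..., [...]). When I run the code, I get this error message: bad input shape (1231, 13). I suspect one of two problems:
- DBN does not support multi-label classification.
- My labels are not in a format that the DBN fit function accepts.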
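As Francisco Vargas mentioned, nolearn.dbn is deprecated, and you should use nolearn.lasagne instead (if you can). If you want to do multi-label classification with lasagne, you should set the regression parameter to True and define a validation score and a custom loss. Here is an example: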
import numpy as np
import theano.tensor as T
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import BatchIterator
from lasagne import nonlinearities

# custom loss: multi-label cross entropy
def multilabel_objective(predictions, targets):
    epsilon = np.float32(1.0e-6)
    one = np.float32(1.0)
    # clip predictions away from 0 and 1 so the logs stay finite
    pred = T.clip(predictions, epsilon, one - epsilon)
    return -T.sum(targets * T.log(pred) + (one - targets) * T.log(one - pred), axis=1)

net = NeuralNet(
    # customize "layers" to represent the architecture you want
    # here I took a dummy architecture
    # the input must match this shape, e.g. data.reshape(-1, 1, 229, 1);
    # for a plain 2-D feature matrix, (None, 229) would be the shape to use
    layers=[
        (layers.InputLayer, {"name": 'input', 'shape': (None, 1, 229, 1)}),
        (layers.DenseLayer, {"name": 'hidden1', 'num_units': 20}),
        (layers.DenseLayer, {"name": 'output', 'nonlinearity': nonlinearities.sigmoid,
                             'num_units': 13}),  # because you have 13 outputs
    ],
    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=5*10**(-3),
    update_momentum=0.9,
    max_epochs=500,  # we want to train this many epochs
    verbose=1,
    # Here are the important parameters for multi labels
    regression=True,
    objective_loss_function=multilabel_objective,
    custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y))),
)

# X_train: your feature array; labels_train: (n_samples, 13) array of 0/1 targets
# (both typically cast to float32)
net.fit(X_train, labels_train)
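With regression=True the network's outputs are, as far as I can tell, real-valued sigmoid activations rather than 0/1 labels, so you still have to threshold them before computing something like an F1 score. Below is a minimal NumPy-only sketch of that step, together with a NumPy mirror of the loss above so you can check it on small arrays. The helper names (multilabel_cross_entropy, to_labels, micro_f1), the 0.5 threshold and the toy arrays are my own illustrative assumptions, not part of nolearn or lasagne.

```python
import numpy as np

def multilabel_cross_entropy(predictions, targets, epsilon=1e-6):
    """NumPy mirror of the Theano loss above: one loss value per sample."""
    pred = np.clip(predictions, epsilon, 1.0 - epsilon)
    return -np.sum(targets * np.log(pred)
                   + (1.0 - targets) * np.log(1.0 - pred), axis=1)

def to_labels(probabilities, threshold=0.5):
    """Turn per-label probabilities into a 0/1 indicator matrix.

    The 0.5 threshold is an assumption; tune it on validation data.
    """
    return (np.asarray(probabilities) >= threshold).astype(np.int32)

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: counts are pooled over all samples and labels."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0

# Toy stand-ins for the true labels and the network's outputs
# (2 samples, 3 labels; made-up numbers for illustration only).
y_true = np.array([[1, 0, 1], [0, 1, 0]])
probs = np.array([[0.9, 0.2, 0.4], [0.1, 0.8, 0.3]])

y_pred = to_labels(probs)
print(multilabel_cross_entropy(probs, y_true))
print(micro_f1(y_true, y_pred))
```

In practice you would replace probs with the output of net.predict on your held-out data. Micro-averaging is only one choice; per-label or per-sample averaging may suit your problem better.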