TextClf.py source code (Python code snippet)

Project: Chinese_text_classifier  Author: swordLong

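import time

# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it.
from sklearn.model_selection import train_test_split


def split_dataset(data_set, split=0.5):
    '''
    Split the dataset into a training set and a test set according to 'split'.
    :param data_set: a Bunch object with 'data' and 'target' attributes
    :param split: float in (0, 1), the fraction of samples held out as the test set
    :return: x_train, x_test, y_train, y_test: training/test data and target values
    '''
    print('splitting dataset......')
    start_time = time.time()
    # random_state=0 fixes the shuffle, so the split is reproducible between runs
    x_train, x_test, y_train, y_test = train_test_split(data_set.data, data_set.target,
                                                        test_size=split, random_state=0)
    print('splitting took %.2f s' % (time.time() - start_time))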
    # train_set=(x_train,y_train)
    # test_set=(x_test,y_test)
    # return train_set,test_set
    return x_train, x_test, y_train, y_test
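
To make the role of the 'split' fraction concrete, here is a minimal, dependency-free sketch of a seeded holdout split. It is not scikit-learn's implementation and may round the test-set size differently. The name simple_holdout_split, the seed parameter, and the toy corpus are hypothetical, introduced only for illustration. A SimpleNamespace stands in for the Bunch object.

```python
import random
from types import SimpleNamespace


def simple_holdout_split(data, target, test_size=0.5, seed=0):
    """Shuffle indices with a fixed seed, then hold out a fraction for testing."""
    if len(data) != len(target):
        raise ValueError('data and target must have the same length')
    if not 0.0 < test_size < 1.0:
        raise ValueError('test_size must be a fraction strictly between 0 and 1')
    indices = list(range(len(data)))
    # A private Random instance keeps the shuffle reproducible and leaves
    # the global random state untouched.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(data) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [data[i] for i in train_idx]
    x_test = [data[i] for i in test_idx]
    y_train = [target[i] for i in train_idx]
    y_test = [target[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical toy corpus standing in for the Bunch object.
toy_set = SimpleNamespace(
    data=['doc a', 'doc b', 'doc c', 'doc d', 'doc e', 'doc f', 'doc g', 'doc h'],
    target=[0, 1, 0, 1, 0, 1, 0, 1],
)
x_train, x_test, y_train, y_test = simple_holdout_split(
    toy_set.data, toy_set.target, test_size=0.25)
print(len(x_train), len(x_test))  # prints: 6 2
```

With 8 samples and test_size=0.25, two samples are held out and six remain for training. Each document stays paired with its label because the same shuffled indices select both lists.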