personal_db.py source code

python

Project: kaggle-review Author: daxiongshu
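def get_split(self):
        # Nothing to do if the split is already cached on this instance.
        if self.split is not None:
            return
        name = "{}/split.p".format(self.flags.data_path)
        # Try to load a saved split; an empty list is treated below as "nothing saved yet".
        split = load_pickle(None,name,[])

        if len(split) == 0:
            #data = self.data["training_variants"].append(self.data["test_variants_filter"])
            data = self.data["training_variants"]
            # Shift class labels down by one (0-based if the originals start at 1).
            y = data['Class']-1
            # Only row positions are needed; the stratification is driven by y.
            X = np.arange(y.shape[0])
            from sklearn.model_selection import StratifiedKFold
            # Fixed random_state so a regenerated split is reproducible.
            skf = StratifiedKFold(n_splits=self.flags.folds,shuffle=True,random_state=99)
            split = [(train_index, test_index) for train_index, test_index in skf.split(X, y)]
            # Persist the folds so later runs reuse the same split.
            save_pickle(split,name)
            print("new shuffle")
        self.split = split
        #print("split va",split[0][1][:10])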
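
The method above depends on project helpers (`load_pickle`, `save_pickle`) and on instance state that are not shown here. As a rough, self-contained sketch of the same idea (build stratified folds once, cache them on disk, reuse them afterwards), the following uses the standard `pickle` module in place of the project helpers. The function name `get_or_make_split`, its parameters, and the toy labels are hypothetical and only serve the illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.model_selection import StratifiedKFold


def get_or_make_split(path, y, folds=5, seed=99):
    """Return cached (train_idx, valid_idx) pairs, creating them on first use.

    Hypothetical stand-in for the method above; plain pickle replaces the
    project's load_pickle/save_pickle helpers.
    """
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    X = np.arange(len(y))  # placeholder features; only y drives stratification
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    split = list(skf.split(X, y))
    with open(path, "wb") as f:
        pickle.dump(split, f)
    return split


# Toy labels: 9 classes with 10 samples each (made-up data for illustration).
y = np.repeat(np.arange(9), 10)
cache = os.path.join(tempfile.mkdtemp(), "split.p")
first = get_or_make_split(cache, y)   # computed and written to disk
second = get_or_make_split(cache, y)  # read back from disk
```

Because the folds are read back from the file on the second call, every later run trains and validates on the same indices, which keeps fold-level scores comparable across experiments.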