features_generator.py source code

python

Project: JData-algorithm-competition  Author: wrzto
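import os
import pickle

import pandas as pd

# NOTE: get_action_data is not defined in this snippet; it is presumably
# provided elsewhere in the project.


def load_UCPair_action_cnt(start_date='2016-02-01 00:00:00', end_date='2016-04-16 00:00:00', actions=(1, 2, 3, 4, 5, 6)):
    '''
    Count the actions of each type for every user-category pair (UCPair)
    between start_date and end_date.
    '''
    prefix = 'UCPair_action_cnt_{0}_{1}'.format(start_date[:10], end_date[:10])
    dump_path = './cache/{0}.pkl'.format(prefix)
    if os.path.exists(dump_path):
        # Reuse the cached aggregate for this date window.
        with open(dump_path, 'rb') as f:
            df = pickle.load(f)
    else:
        df = get_action_data(start_date=start_date, end_date=end_date, field=['user_id', 'type', 'cate'])
        # One indicator column per action type; summing them per pair gives counts.
        type_dummies = pd.get_dummies(df['type'], prefix=prefix, dtype=int)
        df = pd.concat([df[['user_id', 'cate']], type_dummies], axis=1)
        df = df.groupby(['user_id', 'cate'], as_index=False).sum()
        os.makedirs(os.path.dirname(dump_path), exist_ok=True)
        with open(dump_path, 'wb') as f:
            pickle.dump(df, f)

    # sorted() returns a new list, so the caller's argument (and the default) is never mutated.
    actions = sorted(actions)
    rt_cols = ['user_id', 'cate']
    rt_cols.extend('{0}_{1}'.format(prefix, i) for i in actions)
    # Action types absent from the window become zero columns instead of raising KeyError.
    df = df.reindex(columns=rt_cols, fill_value=0)

    return df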
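
The counting step in the function above (one-hot encode `type`, then sum per `user_id`/`cate` pair) can be tried in isolation on a toy log. This is only a minimal sketch: the helper name `count_actions_per_pair`, the short `cnt` prefix, and the sample rows are hypothetical and not part of the project. Filling absent action types with zeros is a choice made for this sketch.

```python
import pandas as pd

# Toy action log (made-up values): one row per user action.
log = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'cate':    [8, 8, 5, 8, 8],
    'type':    [1, 1, 6, 1, 4],
})


def count_actions_per_pair(df, prefix='cnt', actions=(1, 2, 3, 4, 5, 6)):
    """Hypothetical helper: count actions of each type per (user_id, cate)."""
    # One 0/1 indicator column per action type that occurs in the log.
    dummies = pd.get_dummies(df['type'], prefix=prefix, dtype=int)
    wide = pd.concat([df[['user_id', 'cate']], dummies], axis=1)
    # Summing the indicators within each pair yields per-type counts.
    counts = wide.groupby(['user_id', 'cate'], as_index=False).sum()
    cols = ['user_id', 'cate'] + ['{0}_{1}'.format(prefix, a) for a in sorted(actions)]
    # Action types that never occur in the log become all-zero columns.
    return counts.reindex(columns=cols, fill_value=0)


counts = count_actions_per_pair(log)
print(counts)
```

Each output row is one user-category pair, and each `cnt_<type>` column holds how many actions of that type the pair produced, which is the shape the returned DataFrame of `load_UCPair_action_cnt` has after column selection.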