page_mapreduce.py source code

python

Project: python-spider  Author: naginoasukara
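import itertools


def map_reduce(i, mapper, reducer):
    """
    Run a simple in-memory MapReduce.

    :param i: dict of input (key, value) pairs to run MapReduce over
    :param mapper: user-supplied mapper function; called as mapper(key, value) and
        expected to return an iterable of (intermediate_key, intermediate_value) tuples
    :param reducer: user-supplied reducer function; called as
        reducer(intermediate_key, list_of_intermediate_values)
    :return: list with one reducer result per distinct intermediate_key
    """
    intermediate = []  # collects every (intermediate_key, intermediate_value) tuple
    for key, value in i.items():
        intermediate.extend(mapper(key, value))

    # sorted() returns a new list; its elements are tuples, so key= selects the
    # tuple element to sort on (here the intermediate_key).
    # groupby() only groups consecutive elements that share a key, which is why
    # the list is sorted first. For each group it yields the intermediate_key and
    # an iterator over the one or more (intermediate_key, intermediate_value)
    # tuples that carry that key.
    groups = {}
    for key, group in itertools.groupby(sorted(intermediate, key=lambda im: im[0]), key=lambda x: x[0]):
        groups[key] = [value for _, value in group]
    # groups now maps each intermediate_key to the list of its intermediate_values.
    # Apply the reducer to every group and collect the results.
    return [reducer(intermediate_key, groups[intermediate_key]) for intermediate_key in groups]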
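
A minimal usage sketch, assuming a word-count job: the page itself shows no caller, so `word_mapper`, `count_reducer`, and the sample `documents` dict are hypothetical names introduced only for illustration. The body of `map_reduce` is repeated in condensed form so the example runs on its own.

```python
import itertools


def map_reduce(i, mapper, reducer):
    """Same logic as the function above, repeated so this sketch is self-contained."""
    intermediate = []
    for key, value in i.items():
        intermediate.extend(mapper(key, value))
    groups = {}
    ordered = sorted(intermediate, key=lambda pair: pair[0])
    for key, group in itertools.groupby(ordered, key=lambda pair: pair[0]):
        groups[key] = [value for _, value in group]
    return [reducer(key, groups[key]) for key in groups]


def word_mapper(filename, text):
    # Hypothetical mapper: emit (word, 1) for every whitespace-separated word.
    return [(word.lower(), 1) for word in text.split()]


def count_reducer(word, counts):
    # Hypothetical reducer: sum the occurrences collected for one word.
    return (word, sum(counts))


# Hypothetical input: file name -> file contents.
documents = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog",
}

result = map_reduce(documents, word_mapper, count_reducer)
print(result)
# [('brown', 1), ('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]
```

The results come back ordered by intermediate key because the pairs are sorted before grouping and, on Python 3.7+, the `groups` dict preserves insertion order.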