Python multiprocessing: pool.map for multiple arguments

Posted 2021-02-02 23:22:51

Is there a variant of pool.map in Python's multiprocessing library that supports multiple arguments?

text = "test"
def harvester(text, case):
    X = case[0]
    text+ str(X)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=6)
    case = RAW_DATASET
    pool.map(harvester(text,case),case, 1)
    pool.close()
    pool.join()
2 answers
  • 面试哥
    面试哥 2021-02-02

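    The answer depends on the Python version and the situation. The most general answer for recent versions of Python (3.3 and later) was first described by J.F. Sebastian. It uses the Pool.starmap method, which accepts a sequence of argument tuples. It unpacks the arguments from each tuple automatically and passes them to the given function: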

    import multiprocessing
    from itertools import product
    
    def merge_names(a, b):
        return '{} & {}'.format(a, b)
    
    if __name__ == '__main__':
        names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
        with multiprocessing.Pool(processes=3) as pool:
            results = pool.starmap(merge_names, product(names, repeat=2))
        print(results)
    
    # Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...
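
To make the unpacking step concrete, here is a small single-process sketch. It uses itertools.starmap, which performs the same func(*args) unpacking one tuple at a time. This is only an analogy for the result Pool.starmap computes, not a description of how the pool works internally. The helper name sequential_starmap is hypothetical, and the name list is shortened.

```python
from itertools import product, starmap


def merge_names(a, b):
    return '{} & {}'.format(a, b)


def sequential_starmap(func, iterable):
    # Hypothetical helper: produces the same list Pool.starmap would return,
    # but computes it in this process, one argument tuple at a time.
    return list(starmap(func, iterable))


names = ['Brown', 'Wilson', 'Bartlett']
pairs = list(product(names, repeat=2))

# Each tuple in `pairs` is unpacked into the two parameters of merge_names.
unpacked = sequential_starmap(merge_names, pairs)
by_hand = [merge_names(a, b) for a, b in pairs]

print(unpacked == by_hand)
print(unpacked[:3])
```

Swapping sequential_starmap for pool.starmap inside a `with multiprocessing.Pool(...) as pool:` block should give the same list, with the calls spread across worker processes.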
    

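    For earlier versions of Python, you need to write a helper function that unpacks the arguments explicitly. If you want to use with, you also need to write a wrapper that turns Pool into a context manager. (Thanks to muon for pointing this out.)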

    import multiprocessing
    from itertools import product
    from contextlib import contextmanager
    
    def merge_names(a, b):
        return '{} & {}'.format(a, b)
    
    def merge_names_unpack(args):
        return merge_names(*args)
    
    @contextmanager
    def poolcontext(*args, **kwargs):
        pool = multiprocessing.Pool(*args, **kwargs)
        try:
            yield pool
        finally:
            # Shut the workers down even if the with-body raises.
            pool.terminate()
    
    if __name__ == '__main__':
        names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
        with poolcontext(processes=3) as pool:
            results = pool.map(merge_names_unpack, product(names, repeat=2))
        print(results)
    
    # Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...
    

    In simpler cases, where the second argument is fixed, you can also use partial, but only in Python 2.7+.

    import multiprocessing
    from functools import partial
    from contextlib import contextmanager
    
    @contextmanager
    def poolcontext(*args, **kwargs):
        pool = multiprocessing.Pool(*args, **kwargs)
        try:
            yield pool
        finally:
            # Shut the workers down even if the with-body raises.
            pool.terminate()
    
    def merge_names(a, b):
        return '{} & {}'.format(a, b)
    
    if __name__ == '__main__':
        names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
        with poolcontext(processes=3) as pool:
            results = pool.map(partial(merge_names, b='Sons'), names)
        print(results)
    
    # Output: ['Brown & Sons', 'Wilson & Sons', 'Bartlett & Sons', ...
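
Applied to the question, where text is fixed and only case varies, the same idea might look like the sketch below. It assumes Python 3.3+ (where Pool is itself a context manager) and that harvester should return the combined string; the question's version returns nothing. RAW_DATASET here is made-up sample data, since the original dataset is not shown, and run is a hypothetical helper name. Because the fixed argument comes first, it is bound positionally rather than by keyword.

```python
import multiprocessing
from functools import partial

# Assumption: made-up sample data standing in for the question's RAW_DATASET.
RAW_DATASET = [(1, 'a'), (2, 'b'), (3, 'c')]


def harvester(text, case):
    # Assumption: return the combined string so the result is visible.
    return text + str(case[0])


def run(text, cases, processes=2):
    # Bind the fixed first argument; pool.map then supplies each case.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(partial(harvester, text), cases)


if __name__ == '__main__':
    print(run("test", RAW_DATASET))
```

With this sample data the script should print ['test1', 'test2', 'test3'].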
    


  • 面试哥
    面试哥 2021-02-02

    Is there a variant of pool.map that supports multiple arguments?

    Python 3.3 includes the pool.starmap() method:
    
    #!/usr/bin/env python3
    from functools import partial
    from itertools import repeat
    from multiprocessing import Pool, freeze_support
    
    def func(a, b):
        return a + b
    
    def main():
        a_args = [1,2,3]
        second_arg = 1
        with Pool() as pool:
            L = pool.starmap(func, [(1, 1), (2, 1), (3, 1)])
            M = pool.starmap(func, zip(a_args, repeat(second_arg)))
            N = pool.map(partial(func, b=second_arg), a_args)
            assert L == M == N
    
    if __name__=="__main__":
        freeze_support()
        main()
    

    For older versions:

    #!/usr/bin/env python2
    import itertools
    from multiprocessing import Pool, freeze_support
    
    def func(a, b):
        print a, b
    
    def func_star(a_b):
        """Convert `f([1,2])` to `f(1,2)` call."""
        return func(*a_b)
    
    def main():
        pool = Pool()
        a_args = [1,2,3]
        second_arg = 1
        pool.map(func_star, itertools.izip(a_args, itertools.repeat(second_arg)))
    
    if __name__=="__main__":
        freeze_support()
        main()
    

    Output:

    1 1
    2 1
    3 1
    

    Notice how itertools.izip() and itertools.repeat() are used here.
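
On Python 3 the built-in zip() is already lazy, so it takes the place of itertools.izip(). A minimal sketch of the argument pairs this combination produces, reusing the variable names from the example above:

```python
from itertools import repeat

a_args = [1, 2, 3]
second_arg = 1

# repeat() never ends on its own; zip() stops as soon as the shorter
# input, a_args, is exhausted.
pairs = list(zip(a_args, repeat(second_arg)))
print(pairs)  # [(1, 1), (2, 1), (3, 1)]
```

Each of these pairs is what func_star receives as a_b and then unpacks into func(a, b).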

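    Because of the bug mentioned by @unutbu, you cannot use functools.partial() or similar capabilities on Python 2.6, so the simple wrapper function func_star() has to be defined explicitly. See also the workaround suggested by uptimebox.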


