extractor.py source code

python
Project: hyperbolic-caching  Author: kantai
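import multiprocessing
import sys

def load_articles(worker, num_procs=64):
    # Wikipedia dump to read. `process` (presumably defined elsewhere in
    # extractor.py) parses it and calls `cb` once per article.
    input_file = "enwiki-20080103-pages-articles.xml.bz2"

    # Bounded queue, so the parser blocks instead of outrunning the workers.
    q = multiprocessing.JoinableQueue(25000)
    procs = []
    for i in range(num_procs):
        # Note: worker(...) is called here, in the parent, so it must return
        # the callable that each child runs. If `worker` is itself the
        # consumer loop, pass target=worker, args=(q,),
        # kwargs={"talker": i == 0} instead.
        procs.append(multiprocessing.Process(
            target=worker(q, talker=(i == 0))))
        procs[-1].daemon = True
        procs[-1].start()

    def make_article_callback(aid, t, pc):
        q.put((aid, t, pc))

    sys.stderr.write("starting...\n")
    process(input_file, cb=make_article_callback, lim=None)
    q.join()  # wait until every queued article is marked task_done()
    # One None sentinel per worker. The second join() returns only if the
    # workers also call task_done() for the sentinel.
    for p in procs:
        q.put(None)
    q.join()
    sys.stderr.write("\n")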
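
The snippet does not show what `worker` looks like. Because `load_articles` calls `worker(q, talker=...)` and hands the result to `multiprocessing.Process`, one plausible shape is a factory that returns the consumer loop. The sketch below is an assumption, not the project's actual code: `make_worker`, its `handle` parameter, and the progress interval are all hypothetical names chosen for illustration.

```python
import sys

def make_worker(q, talker=False, handle=None):
    """Hypothetical worker factory: returns the callable a child process runs."""
    def run():
        count = 0
        while True:
            item = q.get()
            try:
                if item is None:
                    # Sentinel from load_articles: stop consuming.
                    return count
                aid, title, page_content = item
                if handle is not None:
                    handle(aid, title, page_content)
                count += 1
                # Assumed meaning of "talker": the one worker that
                # reports progress on stderr.
                if talker and count % 1000 == 0:
                    sys.stderr.write("\r%d articles" % count)
            finally:
                # Mark every item done, including the sentinel, so that
                # both q.join() calls in load_articles can return.
                q.task_done()
    return run
```

With this shape, `load_articles(make_worker)` would start 64 consumers. Note that a nested function like `run` can only be used as a `Process` target under the `fork` start method; under `spawn` the target must be picklable, so a module-level function passed via `target=` and `args=` would be needed instead.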