imdb_crawl.py source code

python

Project: holcrawl  Author: shaypal5
from tqdm import tqdm  # progress bar used below


def crawl_by_file(file_path, verbose, year=None):
    """Crawls IMDB and builds movie profiles for all movies in the given file."""
    # One counter per result type, so types that never occur still print as 0.
    results = {res_type: 0 for res_type in _result.ALL_TYPES}
    titles = _titles_from_file(file_path)
    if verbose:
        print("Crawling over all {} IMDB movies in {}...".format(
            len(titles), file_path))
    # Tiny intervals make the bar refresh on every iteration.
    movie_pbar = tqdm(titles, miniters=1, maxinterval=0.0001,
                      mininterval=0.00000000001, total=len(titles))
    for title in movie_pbar:
        # crawl_by_title returns one of the result types in _result.ALL_TYPES.
        res = crawl_by_title(title, verbose, year, movie_pbar)
        results[res] += 1
    print("{} IMDB movie profiles crawled.".format(len(titles)))
    for res_type in _result.ALL_TYPES:
        print('{} {}.'.format(results[res_type], res_type))
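
The per-type tally in crawl_by_file can be tried without any network access by swapping in a stub for the per-title crawl. The following is only a minimal sketch: ALL_TYPES, fake_crawl_by_title and tally_results are hypothetical names invented for illustration, and the three result labels are assumed, since the real ones are defined in _result.ALL_TYPES elsewhere in the project.

```python
from collections import Counter

# Hypothetical result labels; the real ones live in _result.ALL_TYPES.
ALL_TYPES = ("success", "failure", "exist")


def fake_crawl_by_title(title):
    """Stand-in for crawl_by_title: returns one label from ALL_TYPES."""
    if not title.strip():
        return "failure"
    return "success"


def tally_results(titles, crawl=fake_crawl_by_title):
    """Counts outcomes per result type, keeping zero counts for unseen types."""
    results = {res_type: 0 for res_type in ALL_TYPES}
    results.update(Counter(crawl(title) for title in titles))
    return results


summary = tally_results(["Alien", "Heat", "  "])
for res_type in ALL_TYPES:
    print("{} {}.".format(summary[res_type], res_type))
```

Pre-filling the dictionary with zeros mirrors the original function, so every result type appears in the final report even when no title produced it.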

