combined.py source code

python

Project: ebay  Author: fgscivittaro
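import time

import schedule

# The helper functions called below (collect_all_featured_links,
# remove_old_links, collect_bad_links, remove_bad_links_from_link_list and
# scrape_combined_data_from_all_featured_products) are defined elsewhere in
# the project.


def dynamically_scrape_combined_data(data_filename,
                                     sales_filename,
                                     interval,
                                     num_retries=10):
    """
    Dynamically scrapes a continuously updated list of unique clean links and
    appends the data to their respective files.
    """

    # Links collected on the previous run, shared with job() via the closure.
    old_list = []

    def job():
        new_list = collect_all_featured_links()
        new_links = remove_old_links(old_list, new_list)
        bad_links = collect_bad_links(new_links)
        clean_links = remove_bad_links_from_link_list(bad_links, new_links)

        scrape_combined_data_from_all_featured_products(data_filename,
                                                        sales_filename,
                                                        clean_links,
                                                        num_retries)

        # Update the shared list in place so the next run sees these links.
        old_list[:] = new_list

    # Run once immediately, then every `interval` hours.
    job()
    schedule.every(interval).hours.do(job)

    while True:
        schedule.run_pending()
        time.sleep(30)

    # Not reached as written: the loop above runs until the process is
    # interrupted.
    print("Dynamic scraping finished")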
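
The job above has to remember the links from the previous run so that only new ones are scraped. Below is a minimal sketch of that bookkeeping on its own, without the scheduler. The name `make_new_link_filter` is hypothetical, and the filtering rule (keep links that were absent from the previous list) is an assumption about what `remove_old_links` does, not the project's confirmed behaviour.

```python
def make_new_link_filter():
    """Return a function that reports links not seen on the previous call."""
    previous = []  # state shared across calls through the closure

    def filter_new(current):
        # Assumed rule: keep only links that were absent from the last list.
        seen = set(previous)
        fresh = [link for link in current if link not in seen]
        # Mutate in place; a plain assignment would make `previous` a local
        # name instead of updating the shared list.
        previous[:] = current
        return fresh

    return filter_new


filter_new = make_new_link_filter()
first = filter_new(["a", "b"])
second = filter_new(["b", "c"])
print(first)   # ['a', 'b']
print(second)  # ['c']
```

Because the state lives in the closure, the function can be handed to a scheduler as a zero-state callable while still comparing each run against the one before it.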