urllist.py source code

python

Project: icrawler  Author: hellock
import queue
import threading


# Method of a worker class that provides self.signal, self.in_queue,
# self.logger and self.output.
def worker_exec(self, queue_timeout=2, **kwargs):
    while True:
        # Stop as soon as the download limit has been reached.
        if self.signal.get('reach_max_num'):
            self.logger.info('downloaded image reached max num, thread %s'
                             ' exit', threading.current_thread().name)
            break
        try:
            # Block for at most queue_timeout seconds waiting for a URL.
            url = self.in_queue.get(timeout=queue_timeout)
        except queue.Empty:
            # An empty queue is final only once the feeder has exited.
            if self.signal.get('feeder_exited'):
                self.logger.info('no more page urls to parse, thread %s'
                                 ' exit', threading.current_thread().name)
                break
            else:
                self.logger.info('%s is waiting for new page urls',
                                 threading.current_thread().name)
                continue
        except Exception as e:
            self.logger.error('exception caught in thread %s: %s',
                              threading.current_thread().name, e)
            continue
        else:
            self.logger.debug('start downloading page %s', url)
        # Pass the URL on to the next stage as a download task.
        self.output({'file_url': url})
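The loop above depends on attributes supplied by its owning class, so it cannot run on its own. The following is a minimal, self-contained sketch of the same pattern, not icrawler's actual implementation. The class `UrlListWorker`, the `results` list, and the helper `run_demo` are hypothetical names introduced for illustration; `signal` is assumed to be a plain dict, and `output` is assumed to simply collect tasks.

```python
import logging
import queue
import threading


class UrlListWorker:
    """Hypothetical stand-in for the class that owns worker_exec."""

    def __init__(self):
        self.in_queue = queue.Queue()   # URLs waiting to be processed
        self.results = []               # assumption: output() collects tasks here
        # Assumption: a plain dict replaces the shared signal object.
        self.signal = {'reach_max_num': False, 'feeder_exited': False}
        self.logger = logging.getLogger('urllist_sketch')

    def output(self, task):
        # Stand-in for handing the task to the next pipeline stage.
        self.results.append(task)

    def worker_exec(self, queue_timeout=2):
        while True:
            if self.signal.get('reach_max_num'):
                break
            try:
                url = self.in_queue.get(timeout=queue_timeout)
            except queue.Empty:
                # Exit only when the producer has finished; otherwise retry.
                if self.signal.get('feeder_exited'):
                    break
                self.logger.info('%s is waiting for new page urls',
                                 threading.current_thread().name)
                continue
            self.output({'file_url': url})


def run_demo(urls):
    """Feed the URLs, mark the feeder as finished, and run one worker thread."""
    worker = UrlListWorker()
    for url in urls:
        worker.in_queue.put(url)
    worker.signal['feeder_exited'] = True
    thread = threading.Thread(target=worker.worker_exec,
                              kwargs={'queue_timeout': 0.1})
    thread.start()
    thread.join()
    return worker.results
```

Calling `run_demo(['http://example.com/a.jpg'])` returns one `{'file_url': ...}` dict per queued URL. The short timeout keeps the worker from blocking forever on an empty queue, which is what allows it to notice the `feeder_exited` flag and shut down.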