middleware.py source code

python

Project: ahmia-crawler  Author: ahmia
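import logging
from urllib.parse import urlparse

from scrapy.exceptions import IgnoreRequest

def process_request(self, request, spider): # pylint:disable=unused-argument
        """Drop requests whose hostname has more than four dot-separated labels."""
        # hostname is None for URLs without a network location
        hostname = urlparse(request.url).hostname or ""
        if len(hostname.split(".")) > 4:
            # Do not execute this request
            request.meta['proxy'] = ""
            msg = "Ignoring request {}, too many sub domains." \
                  .format(request.url)
            logging.info(msg)
            raise IgnoreRequest()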
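
The hostname check in this method can be tried on its own, without Scrapy. The following is a minimal sketch, not part of ahmia-crawler: the helper name `has_too_many_subdomains` and the constant `MAX_LABELS` are hypothetical, and the threshold of 4 is assumed to mirror the `> 4` comparison above.

```python
from urllib.parse import urlparse

MAX_LABELS = 4  # assumption: mirrors the `> 4` threshold in process_request


def has_too_many_subdomains(url, max_labels=MAX_LABELS):
    """Return True if the URL's hostname has more than max_labels dot-separated labels."""
    hostname = urlparse(url).hostname
    if hostname is None:
        # No network location in the URL, so there is nothing to count
        return False
    return len(hostname.split(".")) > max_labels


print(has_too_many_subdomains("http://a.b.c.d.example.onion/"))  # 6 labels -> True
print(has_too_many_subdomains("http://www.example.onion/"))      # 3 labels -> False
```

Note that the count includes the top-level domain, so a hostname such as `a.b.example.onion` (four labels) still passes.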