spider.py source code

python

Project: my_zhihu_spider    Author: MicroCountry

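from lxml import html  # assumption: `html` in the original refers to lxml.html

# `red` is assumed to be a Redis client (e.g. redis.Redis) created elsewhere in spider.py.

def analy_following_profile(self, html_text):
    # Extract the profile links of followed users from a Zhihu "following" page.
    tree = html.fromstring(html_text)
    url_list = tree.xpath("//h2[@class='ContentItem-title']//span[@class='UserLink UserItem-name']//a[@class='UserLink-link']/@href")
    for target_url in url_list:
        # The hrefs are site-relative: build an absolute URL, then switch the scheme to http.
        target_url = "https://www.zhihu.com" + target_url
        target_url = target_url.replace("https", "http", 1)  # count=1: only touch the scheme
        # sadd returns 1 only for a new set member, so each URL is queued at most once.
        if red.sadd('red_had_spider', target_url):
            red.lpush('red_to_spider', target_url)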
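The last two lines of the function above rely on a common "seen set plus work queue" pattern: a URL is pushed onto the queue only when adding it to the set reports it as new. Below is a minimal sketch of that pattern that runs without a Redis server. `InMemoryStore` and `enqueue_once` are hypothetical names introduced only for illustration; they imitate just the `sadd`, `lpush`, and `rpop` calls and are not part of my_zhihu_spider. Draining the queue with `rpop` is also an assumption, since the snippet does not show how the queue is consumed.

```python
from collections import defaultdict, deque


class InMemoryStore:
    """Hypothetical stand-in for the `red` client; mimics only sadd/lpush/rpop."""

    def __init__(self):
        self._sets = defaultdict(set)
        self._lists = defaultdict(deque)

    def sadd(self, key, value):
        # Like Redis SADD with one value: 1 if the value is new, 0 if already present.
        if value in self._sets[key]:
            return 0
        self._sets[key].add(value)
        return 1

    def lpush(self, key, value):
        # Like Redis LPUSH: insert at the head and return the new list length.
        self._lists[key].appendleft(value)
        return len(self._lists[key])

    def rpop(self, key):
        # Like Redis RPOP: remove from the tail, or return None when the list is empty.
        return self._lists[key].pop() if self._lists[key] else None


def enqueue_once(store, url):
    """Queue `url` only the first time it is seen; return True if it was queued."""
    if store.sadd('red_had_spider', url):
        store.lpush('red_to_spider', url)
        return True
    return False


store = InMemoryStore()
results = [enqueue_once(store, u) for u in (
    "http://www.zhihu.com/people/a",
    "http://www.zhihu.com/people/b",
    "http://www.zhihu.com/people/a",  # duplicate: already in the seen set
)]
print(results)
```

Pairing `lpush` with `rpop` gives first-in, first-out order, so profiles would be crawled in the order they were discovered; popping from the head instead would give depth-first behaviour.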
