zhihuspider0.py source code

python

Project: ZhihuSpider Author: AlexTan-b-z
import json
import re

from scrapy import Request

# AnswerItem comes from the project's items module (import not shown in this excerpt).


def parse_answer(self, response):
    # Decode the JSON body directly; json.loads maps true/false to True/False,
    # so no string replacement or eval() on remote data is needed.
    dict_result = json.loads(response.body.decode("utf8"))
    for one in dict_result['data']:
        item = AnswerItem()
        item['answer_user_id'] = response.meta['answer_user_id']
        item['answer_id'] = one['id']
        item['question_id'] = one['question']['id']
        # Key spelling 'cretated_time' kept as in the source; it must match the AnswerItem field.
        item['cretated_time'] = one['created_time']
        item['updated_time'] = one['updated_time']
        item['voteup_count'] = one['voteup_count']
        item['comment_count'] = one['comment_count']
        item['content'] = one['content']
        yield item
    # Keep requesting the next page (offset advances by 20) until the API reports the end.
    if not dict_result['paging']['is_end']:
        offset = response.meta['offset'] + 20
        next_page = re.findall(r'(.*offset=)\d+', response.url)[0]
        yield Request(
            next_page + str(offset),
            callback=self.parse_answer,
            meta={'answer_user_id': response.meta['answer_user_id'], 'offset': offset},
        )
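
The JSON decoding and offset-based pagination in parse_answer can be tried outside Scrapy. The following is a minimal sketch, assuming the response body is UTF-8 JSON containing the `data` and `paging.is_end` fields read above. The helper names `decode_body` and `next_page_url`, the sample payload, and the example URL are hypothetical and not part of the ZhihuSpider project.

```python
import json
import re

PAGE_SIZE = 20  # step by which parse_answer advances the offset


def decode_body(body: bytes) -> dict:
    """Decode a UTF-8 JSON response body into a dict."""
    return json.loads(body.decode("utf8"))


def next_page_url(url: str, offset: int) -> str:
    """Return url with the value of its last offset= parameter replaced."""
    prefix = re.findall(r'(.*offset=)\d+', url)[0]
    return prefix + str(offset)


# Hypothetical sample payload and URL, for illustration only.
sample = b'{"data": [{"id": 1, "voteup_count": 3}], "paging": {"is_end": false}}'
url = "https://example.com/answers?limit=20&offset=0"

result = decode_body(sample)
if not result["paging"]["is_end"]:
    print(next_page_url(url, 0 + PAGE_SIZE))
```

Because the pattern captures only the text up to and including `offset=`, any query parameters that follow the offset value are dropped from the rebuilt URL, so this approach assumes `offset` is the last parameter.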