36kr.py source file

python

Project: lichking Author: melonrun
import json

from bs4 import BeautifulSoup


def article_detail(aitem, response):
    # The article data is embedded as JSON inside an inline <script> tag.
    for a_content in response.xpath('//script').extract():
        if "detailArticle|post" not in a_content:
            continue
        # Keep only the JSON object between "props=" and ",location".
        a_content = a_content.split("props=")[1]
        a_content = a_content.split(",location")[0]
        a_content = json.loads(a_content).get("detailArticle|post")
        # Strip the HTML markup from the article body.
        aitem.content = BeautifulSoup(a_content.get("content"), 'lxml').get_text()
        aitem.time = a_content.get('published_at')
        aitem.last_reply_time = aitem.time
        aitem.views = a_content.get('counters').get('view_count')
        aitem.replies = a_content.get('counters').get('comment')
        aitem.author = a_content.get('user').get('name')
        aitem.title = a_content.get('title')
        # extraction_tags is itself a JSON-encoded string; the first
        # element of each entry is used as a category name.
        category_tags = json.loads(a_content.get('extraction_tags'))
        category = ''
        for category_tag in category_tags:
            category += category_tag[0] + ' '
        aitem.category = category

    return aitem
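
The extraction step in the function above can be tried in isolation. The following is a minimal sketch using only the standard library; the helper name `extract_props` and the sample script string are hypothetical and only imitate the `props=... ,location` layout the function expects, they are not real 36kr markup.

```python
import json


def extract_props(script_text, key="detailArticle|post"):
    """Return the dict stored under `key` in an inline `props=` JSON blob, or None."""
    if key not in script_text or "props=" not in script_text:
        return None
    # Keep what sits between "props=" and the first ",location" marker.
    raw = script_text.split("props=", 1)[1].split(",location", 1)[0]
    return json.loads(raw).get(key)


# Hypothetical sample input, made up for illustration.
sample = (
    '<script>var props={"detailArticle|post": '
    '{"title": "Hello", "counters": {"view_count": 7}}},location={}</script>'
)
post = extract_props(sample)
print(post["title"], post["counters"]["view_count"])
```

Splitting on fixed markers is fragile: if the page changes its script layout, or the JSON body itself contains the text `,location`, `json.loads` will raise an error, so a caller may want to wrap the call in a `try`/`except ValueError`.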