sightcrawl.py source code

python

Project: scrapy_sight  Author: wankaiss
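def parse(self, response):
        # foreigh_7, SightItem, baidu_geo_api, lazy_pinyin, log and scrapy
        # are defined or imported elsewhere in sightcrawl.py.
        for build in foreigh_7:
            item = SightItem()
            log.msg('build: ' + build, level=log.INFO)
            # Geocode once and reuse the result; (1, 1) marks "not found".
            coords = baidu_geo_api(build.encode('utf-8'))
            if coords is not None:
                lng, lat = coords
            else:
                lng, lat = 1, 1
            item['lng'] = lng
            item['lat'] = lat
            item['id_num'] = self.id_num
            self.id_num += 1
            # The category text is garbled in this listing; left unchanged.
            item['category'] = u'??????'
            item['title'] = build.encode('utf-8')
            # Upper-case pinyin of the name, syllables joined without spaces.
            pinyin = lazy_pinyin(build)
            item['pinyin'] = ''.join(pinyin).upper()
            if lng == 1 or lat == 1:
                # Skip sights without coordinates (the id is still consumed).
                log.msg('no landmark found: ' + 'at line 36,' + build, level=log.INFO)
                continue
            # Fetch the Baidu Baike page and pass the partial item along.
            baike_url = 'https://baike.baidu.com/item/%s' % build
            yield scrapy.Request(baike_url, meta={'item': item}, callback=self.content_parse)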
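
The coordinate lookup in `parse` can be pulled out into a small helper that calls the geocoder once and falls back to the `(1, 1)` sentinel. This is a minimal sketch, assuming the geocoder is any callable that returns a `(lng, lat)` tuple or `None`, as `baidu_geo_api` appears to. `resolve_coords`, `NOT_FOUND` and `fake_geocode` are hypothetical names for illustration, not part of scrapy_sight, and the coordinates in the table are made up.

```python
# Hypothetical helper, not from the scrapy_sight project.
NOT_FOUND = (1, 1)  # sentinel that parse() uses for "no coordinates"


def resolve_coords(geocode, name):
    """Call geocode(name) once and return (lng, lat, found)."""
    result = geocode(name)
    if result is None:
        lng, lat = NOT_FOUND
        return lng, lat, False
    lng, lat = result
    return lng, lat, True


def fake_geocode(name):
    # Stand-in for a real geocoding call; the values are illustrative only.
    table = {'Louvre': (2.3376, 48.8606)}
    return table.get(name)


lng, lat, found = resolve_coords(fake_geocode, 'Louvre')
```

Returning an explicit `found` flag is one possible design: it avoids the `lng == 1 or lat == 1` test, which would also reject a genuine coordinate whose longitude or latitude happens to equal 1.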