douban.py source file

python
Project: douban-movie  Author: chishui
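import requests
from bs4 import BeautifulSoup, NavigableString

# PAGE_URL is defined elsewhere in douban.py; it is a format string
# that takes the movie id.


def parse(movie):
    """Fetch the Douban page for `movie` and fill in score, director and actor."""
    url = PAGE_URL % movie.id
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')
    movie.score = soup.find('strong', 'rating_num').text
    info = soup.find('div', {'id': 'info'})
    # Drop <br> tags so only text nodes and the labelled rows remain.
    for linebreak in info.find_all('br'):
        linebreak.extract()
    for span in info.contents:
        # Skip bare text nodes and rows too short to hold label, separator, value.
        if isinstance(span, NavigableString) or len(span.contents) < 3:
            continue
        # The label must be followed by a plain-text separator such as ': '.
        if not isinstance(span.contents[1], NavigableString):
            continue
        label = span.contents[0].string
        # The label literals are assumed from Douban's page layout.
        if label == u'导演':  # "director"
            movie.director = span.contents[2].text
        elif label == u'主演':  # "starring"
            movie.actor = span.contents[2].text
    print(movie)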
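
The row-matching logic in parse() can be tried without a network request. The sketch below is only an illustration: SAMPLE_HTML is hand-written markup modelled on the structure parse() appears to expect (an outer span holding a label span, a ': ' text node, and a value span), and extract_credits is a hypothetical helper that is not part of douban-movie. It uses the standard 'html.parser' backend instead of 'lxml' so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup, NavigableString

# Hand-written sample, assumed to mirror the layout of the real page.
SAMPLE_HTML = (
    u'<strong class="rating_num">9.0</strong>'
    u'<div id="info">'
    u'<span><span class="pl">导演</span>: '
    u'<span class="attrs"><a>Director A</a></span></span><br/>'
    u'<span><span class="pl">主演</span>: '
    u'<span class="attrs"><a>Actor B</a> / <a>Actor C</a></span></span><br/>'
    u'<span class="pl">类型:</span> <span>剧情</span><br/>'
    u'</div>'
)

# Maps the page label to the field name used in the result.
LABELS = {u'导演': 'director', u'主演': 'actor'}


def extract_credits(html):
    """Hypothetical helper: return score, director and actor as a dict."""
    soup = BeautifulSoup(html, 'html.parser')
    result = {'score': soup.find('strong', 'rating_num').text}
    info = soup.find('div', {'id': 'info'})
    # Remove <br> tags, as parse() does.
    for linebreak in info.find_all('br'):
        linebreak.extract()
    for row in info.contents:
        # Skip text nodes and rows without label, separator and value.
        if isinstance(row, NavigableString) or len(row.contents) < 3:
            continue
        if not isinstance(row.contents[1], NavigableString):
            continue
        field = LABELS.get(row.contents[0].string)
        if field:
            result[field] = row.contents[2].text
    return result


if __name__ == '__main__':
    print(extract_credits(SAMPLE_HTML))
```

Returning a dict rather than mutating a movie object keeps the sketch independent of the project's own Movie class, whose definition is not shown here.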