main.py source code

python

Project: Spider_Hub  Author: WiseDoge
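import time

import requests
from bs4 import BeautifulSoup


def crawl(url):
    """
    Fetch the page at `url` and return the text of each <div class="content">
    element as a list of strings, or None if the request fails.
    """
    try:
        html = requests.get(url)
    except requests.RequestException:
        # Log the failure with a timestamp, then back off for a minute.
        with open("log.log", "a") as file:
            file.write("Http error on " + time.ctime() + "\n")
        time.sleep(60)
        return None
    soup = BeautifulSoup(html.text, 'lxml')
    data_list = []
    for cont in soup.find_all("div", {"class": "content"}):
        raw_data = cont.get_text()
        # Strip newlines so each entry is a single line of text.
        data = raw_data.replace("\n", "")
        data_list.append(data)
    return data_list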
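
The extraction step of crawl can be tried offline with a rough sketch like the one below. It is not the project's code: it uses the standard library's html.parser as a stand-in for BeautifulSoup so that it runs without third-party packages or network access. The names ContentDivParser, extract_content, and SAMPLE_HTML are hypothetical, and unlike find_all, a content div nested inside another content div is folded into the outer one instead of being returned separately.

```python
from html.parser import HTMLParser


class ContentDivParser(HTMLParser):
    """Hypothetical helper: collects the text of every <div class="content">."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # div nesting depth inside the current content div
        self.buffer = []     # text fragments of the div currently being read
        self.data_list = []  # finished entries, one per content div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            # A div nested inside a content div: track it so the matching
            # end tag does not close the outer div too early.
            self.depth += 1
        elif "content" in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Same clean-up as crawl: drop newlines from the text.
                self.data_list.append("".join(self.buffer).replace("\n", ""))

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_content(html_text):
    """Return the newline-free text of each content div in html_text."""
    parser = ContentDivParser()
    parser.feed(html_text)
    parser.close()
    return parser.data_list


# Made-up sample page, used only to illustrate the behaviour.
SAMPLE_HTML = """
<html><body>
<div class="content">first
entry</div>
<div class="sidebar">ignored</div>
<div class="content"><span>second</span> entry</div>
</body></html>
"""

print(extract_content(SAMPLE_HTML))  # ['firstentry', 'second entry']
```

As in crawl, newlines are removed rather than replaced with spaces, so words that were separated only by a line break end up joined.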