
How to get JavaScript-generated HTML content with Selenium and Python [duplicate]

Posted on 2021-01-29 14:10:10

This question already has answers here:

Get HTML source of WebElement in Selenium WebDriver using Python (15 answers)

Closed 6 years ago.

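I am using Selenium for web crawling, and I want to get the elements (such as links) that JavaScript writes into the page after Selenium simulates a click on a fake link.

I tried get_html_source(), but the result does not include the content written by JavaScript.

The code I wrote: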

    def test_comment_url_fetch(self):
        sel = self.selenium 
        sel.open("/rmrb")
        url = sel.get_location()
        #print url
        if url.startswith('http://login'):
            sel.open("/rmrb")
        i = 1
        while True:
            try:
                if i == 1:
                    sel.click("//div[@class='WB_feed_type SW_fun S_line2']/div/div/div[3]/div/a[4]") 
                    print "click"
                else:
                    XPath = "//div[@class='WB_feed_type SW_fun S_line2'][%d]/div/div/div[3]/div/a[4]"%i
                    sel.click(XPath)
                    print "click"
            except Exception, e:
                print e
                break
            i += 1
        html = sel.get_html_source()
        html_file = open("tmp\\foo.html", 'w')
        html_file.write(html.encode('utf-8'))
        html_file.close()

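I use a while loop to click a series of fake links. Each of them triggers a JavaScript action that reveals extra content, and that content is what I want. But sel.get_html_source() does not return it.

Can anyone help? Many thanks.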


1 answer

面试哥 2021-01-29

Because I usually post-process the nodes I fetch, I run JavaScript directly in the browser with execute_script. For example, to get all a tags:
```
js_code = "return document.getElementsByTagName('a')"
your_elements = sel.execute_script(js_code)
```
Edit: execute_script and get_eval are equivalent, except that get_eval returns the result implicitly, while execute_script needs an explicit return statement.
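
As a rough sketch of the same idea, the helpers below read the live DOM through execute_script instead of the originally served source. They assume `driver` is a Selenium WebDriver instance (not the older `selenium` RC object used in the question). The function names `rendered_html`, `link_targets`, and `wait_for_count`, as well as the selector argument, are hypothetical and chosen only for illustration. The polling helper also assumes that the clicked links load their content asynchronously, which the question does not confirm.

```python
import time


def rendered_html(driver):
    """Return the current DOM serialized as HTML, including nodes added by JavaScript."""
    # execute_script needs an explicit `return`; without it the call yields None.
    return driver.execute_script("return document.documentElement.outerHTML")


def link_targets(driver):
    """Return the href of every <a> element currently in the DOM."""
    js = (
        "return Array.prototype.map.call("
        "document.getElementsByTagName('a'), "
        "function (a) { return a.href; });"
    )
    return driver.execute_script(js)


def wait_for_count(driver, css_selector, minimum, timeout=10.0, poll=0.5):
    """Poll until at least `minimum` elements match `css_selector`, else raise."""
    # Extra arguments to execute_script are exposed to the script as arguments[i].
    js = "return document.querySelectorAll(arguments[0]).length"
    deadline = time.monotonic() + timeout
    while True:
        count = driver.execute_script(js, css_selector)
        if count >= minimum:
            return count
        if time.monotonic() >= deadline:
            raise RuntimeError(
                "only %d element(s) matched %r before the timeout" % (count, css_selector)
            )
        time.sleep(poll)
```

After each simulated click, one possible pattern is to call `wait_for_count(driver, "a", expected)` with an `expected` value of your own choosing and then save `rendered_html(driver)`, so that the file is written only once the extra content has appeared.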
