Python

使用Selenium Python ChromeDriver从弹出窗口/表单下载PDF

发布于 2021-01-29 15:01:40

无法确定下一步，尝试从网站下载pdf文件并被卡住。

“
https://www.southtechhosting.com/SanJoseCity/CampaignDocsWebRetrieval/Search/SearchByElection.aspx
”

带有PDF文件链接的页面

PDF文件下载

我可以使用Selenium和ChromeDriver从“带有链接的页面”中单击pdf链接，但是随后我得到了一个弹出表单而不是下载。

我尝试禁用Chrome PDF查看器（“ plugins.plugins_list”：[{“ enabled”：False，“ name”：“ Chrome
PDF Viewer”}]），但这无法正常工作。

弹出表单（在“要下载的PDF文件”中查看）具有一个悬停链接，用于下载pdf文件。我已经尝试过ActionChains（），但是在运行以下代码后却遇到了这个异常：

from selenium.webdriver.common.action_chains import ActionChains

element_to_hover = driver.find_element_by_xpath("//paper-icon-button[@id='download']")
hover = ActionChains(driver).move_to_element(element_to_hover)
hover.perform()

寻找在这种情况下下载pdf文件的最有效方法。谢谢！

关注者

被浏览

331

1 个回答

面试哥 2021-01-29

为面试而生，有面试问题，就找面试哥。

请尝试以下方法：

chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.always_open_pdf_externally": True}
chromeOptions.add_experimental_option("prefs",prefs)
driver = webdriver.Chrome(chrome_options=chromeOptions)
driver.get('https://www.southtechhosting.com/SanJoseCity/CampaignDocsWebRetrieval/Search/SearchByElection.aspx')

#Code to open the pop-up
driver.find_element_by_xpath('//*[@id="ctl00_DefaultContent_ASPxRoundPanel1_btnFindFilers_CD"]').click()
driver.find_element_by_xpath('//*[@id="ctl00_GridContent_gridFilers_DXCBtn0"]').click()
driver.find_element_by_xpath('//*[@id="ctl00_DefaultContent_gridFilingForms_DXCBtn0"]').click()

driver.switch_to.frame(driver.find_element_by_tag_name('iframe'))
a = driver.find_element_by_link_text("Click here")
ActionChains(driver).key_down(Keys.CONTROL).click(a).key_up(Keys.CONTROL).perform()

更新：要退出弹出窗口，可以尝试以下操作：

driver.switch_to.default_content()
driver.find_element_by_xpath('//*[@id="ctl00_GenericPopupSizeable_InnerPopupControl_HCB-1"]/img').click()

知识点

Python

面圈网VIP题库全新上线，海量真题题库资源。 90大类考试，超10万份考试真题开放下载啦

去下载看看