Python

Scraping Yahoo Finance with BeautifulSoup

Posted 2021-01-29 14:09:45

I am trying to pull information about a ticker from Yahoo's "Key Statistics" page (since this is not supported by the Pandas library).

AAPL example:

from bs4 import BeautifulSoup
import requests

url = 'http://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')

enterpriseValue = soup.findAll('$ENTERPRISE_VALUE', attrs={'class': 'yfnc_tablehead1'}) #HTML tag for where enterprise value is located

print(enterpriseValue)

Edit: Thanks, Andy!

Question: This prints an empty list. How can I change my findAll call so that it returns 598.56B?

1 Answer

面试哥 2021-01-29

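Well, the reason find_all returns an empty list is that this data is generated by a separate call, which is not made when you simply send a GET request to that URL. If you open the Network tab in Chrome/Firefox and filter by XHR, you can inspect the request and response of each network action to find the URL you should be sending the GET request to.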

In this case, it is https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com.
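The query string in that URL is easier to read when it is assembled from its parts. The following is only a sketch: `build_quote_summary_url` is a hypothetical helper of mine, not part of any library, and the parameter names and the crumb value are simply the ones that appear in the URL captured above. I am assuming the endpoint still accepts them in this form.

```python
from urllib.parse import quote, urlencode

# Base path taken from the URL captured in the browser's Network tab.
BASE = "https://query2.finance.yahoo.com/v10/finance/quoteSummary/"


def build_quote_summary_url(symbol, crumb, modules):
    """Assemble the quoteSummary URL from its parts (hypothetical helper)."""
    params = {
        "formatted": "true",
        "crumb": crumb,                 # value copied from the captured request
        "lang": "en-US",
        "region": "US",
        "modules": ",".join(modules),   # urlencode turns "," into %2C
        "corsDomain": "finance.yahoo.com",
    }
    # Percent-encode the ticker for the path, then append the encoded query.
    return BASE + quote(symbol, safe="") + "?" + urlencode(params)


url = build_quote_summary_url(
    "AAPL",
    "8ldhetOu7RJ",
    ["defaultKeyStatistics", "financialData", "calendarEvents"],
)
print(url)
```

Building the URL this way makes it straightforward to swap in a different ticker or a different list of modules without editing a long string by hand.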

So, how do we recreate it? Simple:
```
import requests

# URL copied from the browser's Network tab (XHR filter), crumb included as captured
url = 'https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=8ldhetOu7RJ&lang=en-US&region=US&modules=defaultKeyStatistics%2CfinancialData%2CcalendarEvents&corsDomain=finance.yahoo.com'
r = requests.get(url)
data = r.json()  # parse the JSON response body into a dict
```
This returns the JSON response as a dict. From there, walk through the dict until you find the data you need:
```
financial_data = data['quoteSummary']['result'][0]['defaultKeyStatistics']
enterprise_value_dict = financial_data['enterpriseValue']
print(enterprise_value_dict)
# {'fmt': '598.56B', 'raw': 598563094528, 'longFmt': '598,563,094,528'}
print(enterprise_value_dict['fmt'])
# 598.56B
```
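Chained indexing like the above raises an exception as soon as one key is missing, for example if the response contains no result for a ticker. A more defensive lookup could be sketched as below. `get_stat` is a hypothetical helper name, and the `sample` dict only mirrors the part of the response shown above so the snippet runs without a network call.

```python
def get_stat(data, module, field):
    """Return data['quoteSummary']['result'][0][module][field], or None if absent."""
    try:
        result = data["quoteSummary"]["result"][0]
        return result[module][field]
    except (KeyError, IndexError, TypeError):
        # Missing key, empty result list, or a None where a dict/list was expected.
        return None


# Stand-in for r.json(), limited to the fields shown in the answer above.
sample = {
    "quoteSummary": {
        "result": [
            {
                "defaultKeyStatistics": {
                    "enterpriseValue": {
                        "fmt": "598.56B",
                        "raw": 598563094528,
                        "longFmt": "598,563,094,528",
                    }
                }
            }
        ]
    }
}

value = get_stat(sample, "defaultKeyStatistics", "enterpriseValue")
print(value["fmt"] if value else "not available")

# A response with no result yields None instead of raising.
missing = get_stat({"quoteSummary": {"result": None}}, "defaultKeyStatistics", "enterpriseValue")
print(missing)
```

The 'raw' field is the one to use for arithmetic, while 'fmt' is the display string the question asked for.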
