Python

How to filter a pandas DataFrame by multiple columns

Posted on 2021-01-29 19:35:02

To filter a DataFrame (df) by a single column, for example when the data covers males and females, we can write:

males = df[df['Gender'] == 'Male']

Question 1: But what if the data spans multiple years and I only want to see males for 2014?

In other languages I might do something like:

if A = "Male" and if B = "2014" then

(except that I want to do this and get a subset of the original DataFrame as a new DataFrame object)

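Question 2: How do I do this in a loop and create a DataFrame object for each unique combination of year and gender (i.e. a df for 2013-Male, 2013-Female, 2014-Male, and 2014-Female)?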

for y in year:
    for g in gender:
        df = .....


1 answer

面试哥 2021-01-29


When using the & operator, don't forget to wrap each sub-condition in ():

males = df[(df['Gender'] == 'Male') & (df['Year'] == 2014)]
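A minimal self-contained sketch of the combined filter follows. It assumes a small made-up DataFrame with the `Gender` and `Year` columns from the question plus a hypothetical `Score` column; the sample values are invented for illustration. The parentheses matter because `&` binds more tightly than `==` in Python, so without them the expression is not evaluated as two separate comparisons.

```python
import pandas as pd

# Made-up sample data; column names follow the question, values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Year": [2013, 2013, 2014, 2014, 2014],
    "Score": [10, 20, 30, 40, 50],
})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = (df["Gender"] == "Male") & (df["Year"] == 2014)
males_2014 = df[mask]

# The same filter expressed with DataFrame.query, which some find easier to read.
males_2014_q = df.query("Gender == 'Male' and Year == 2014")
```

Both forms return a new DataFrame holding only the matching rows and keep the original row index.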

To store the DataFrames in a dict using a for loop:

dic = {}
for g in ['Male', 'Female']:
    dic[g] = {}
    for y in [2013, 2014]:
        # Store each filtered DataFrame in a dict of dicts, keyed by gender and then year
        dic[g][y] = df[(df['Gender'] == g) & (df['Year'] == y)]

Edit:

A getDF demo for your case:

def getDF(dic, gender, year):
    # Look up the stored subset for the given gender and year
    return dic[gender][year]

print(getDF(dic, 'Male', 2014))
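As an alternative to the nested loop, and not part of the original answer, the per-group subsets can also be built with `groupby`. This is a sketch under the assumption that the columns are named `Gender` and `Year`; the sample data and the `Score` column are made up for illustration.

```python
import pandas as pd

# Made-up sample data; column names follow the question, values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Year": [2013, 2013, 2014, 2014, 2014],
    "Score": [10, 20, 30, 40, 50],
})

# Iterating over a groupby yields (key, sub-DataFrame) pairs;
# with two grouping columns each key is a (year, gender) tuple.
subsets = {key: group for key, group in df.groupby(["Year", "Gender"])}

# Look up one subset by its (year, gender) key.
males_2014 = subsets[(2014, "Male")]
```

This avoids hard-coding the lists of years and genders, since only the combinations that actually occur in the data become keys.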

