pandas: filter rows using multiple fields together
Posted on 2021-01-29 15:03:56
I have a pandas DataFrame like this:
In [34]: people = pandas.DataFrame({'name' : ['John', 'John', 'Mike', 'Sarah', 'Julie'], 'age' : [28, 18, 18, 2, 69]})
people = people[['name', 'age']]
people
Out[34]:
name age
0 John 28
1 John 18
2 Mike 18
3 Sarah 2
4 Julie 69
I want to filter this DataFrame using the following tuples:
In [35]: filter = [('John', 28), ('Mike', 18)]
The output should look like this:
Out[35]:
name age
0 John 28
2 Mike 18
I tried this:
In [34]: mask = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]}).all(axis=1)
people[mask]
But it returns both Johns, because it filters each column independently (both Johns' ages are present in the age list).
Out[34]:
name age
0 John 28
1 John 18
2 Mike 18
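To make the per-column behaviour concrete, here is a minimal sketch of the attempt above. The names `per_column` and `mask` are only illustrative. With a dict argument, `DataFrame.isin` checks each column against its own list of values, so a row passes as soon as its name is any listed name and its age is any listed age, regardless of how the values were paired.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
    'age': [28, 18, 18, 2, 69],
})

# Each column is tested against its own list, independently of the other column.
per_column = people.isin({'name': ['John', 'Mike'], 'age': [28, 18]})

# A row is kept when every column matched something in its own list.
mask = per_column.all(axis=1)

print(per_column)
print(people[mask])  # row 1 (John, 18) passes even though that pair was never requested
```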
How can I filter rows based on several fields combined?
1 Answer
This should work:
people.set_index(people.columns.tolist(), drop=False).loc[filter].reset_index(drop=True)
Cleaned up, with explanation:
# set_index with the columns you want to reference in tuples
cols = ['name', 'age']
people = people.set_index(cols, drop=False)
#                                   ^
#                                   |
#            ensure the cols stay in dataframe

#   does what you
#   want but now has
#   index that was
#   not there
# /--------------\
people.loc[filter].reset_index(drop=True)
#                 \---------------------/
#                  Gets rid of that index
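
As a possible alternative to the `set_index` / `.loc` approach above, the pairs can also be matched with a boolean mask, which leaves the original row labels (0 and 2) in place as in the desired output. This is only a sketch, not the answer's method: the names `wanted`, `pair_index` and `result` are made up for illustration, and `wanted` is used instead of `filter` merely to avoid shadowing the Python built-in.

```python
import pandas as pd

people = pd.DataFrame({
    'name': ['John', 'John', 'Mike', 'Sarah', 'Julie'],
    'age': [28, 18, 18, 2, 69],
})
wanted = [('John', 28), ('Mike', 18)]  # the (name, age) pairs to keep

# Build a MultiIndex whose entries are the (name, age) pair of each row.
pair_index = pd.MultiIndex.from_frame(people[['name', 'age']])

# isin on a MultiIndex compares whole tuples, so each pair is matched as a unit.
mask = pair_index.isin(wanted)

result = people[mask]
print(result)
```

One difference in behaviour: a pair in `wanted` that does not occur in the data simply matches nothing here, whereas a label lookup with `.loc` may raise a `KeyError` for missing keys in recent pandas versions.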