Why doesn't itertools.groupby() work?
I've looked through several related questions about groupby(),
but I can't find what is wrong with my example:
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'},
            {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'},
            {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
This prints every student separately. Why don't I get just 3 groups: @gmail.com, @yahoo.com and @something.com?
-
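For starters, some of the mails are gmail.com and some are @gmail.com, which is why they are treated as separate groups. groupby also expects the data to be pre-sorted by the same key function: it only groups consecutive items with equal keys, which explains why you get @something.com twice. From the docs:
…Generally, the iterable needs to already be sorted on the same key function.…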
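As a small illustration of that rule, here is a minimal sketch using a made-up list of strings (not the data from the question). It shows that groupby starts a new group every time the key changes, so an unsorted input can yield the same key more than once:

```python
from itertools import groupby

# Hypothetical toy data: "a" appears in two separate runs.
data = ["a", "a", "b", "a"]

# Without sorting, each run of equal items becomes its own group.
unsorted_keys = [key for key, _ in groupby(data)]

# After sorting, equal items are adjacent, so each key appears once.
sorted_keys = [key for key, _ in groupby(sorted(data))]

print(unsorted_keys)  # ['a', 'b', 'a']
print(sorted_keys)    # ['a', 'b']
```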
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': 'gmail.com'},
            {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'},
            {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
students.sort(key=key_func)  # sort by the same key function we later use with groupby
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
# @gmail.com
# [{'name': 'Paul', 'mail': '@gmail.com'}, {'name': 'Gregory', 'mail': '@gmail.com'}]
# @something.com
# [{'name': 'Jules', 'mail': '@something.com'}, {'name': 'Kathrin', 'mail': '@something.com'}]
# @yahoo.com
# [{'name': 'Tom', 'mail': '@yahoo.com'}]
# gmail.com
# [{'name': 'Jim', 'mail': 'gmail.com'}]
After fixing both the sorting and the gmail.com / @gmail.com mismatch, we get the expected output:
import itertools

students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': '@gmail.com'},
            {'name': 'Jules', 'mail': '@something.com'},
            {'name': 'Gregory', 'mail': '@gmail.com'},
            {'name': 'Kathrin', 'mail': '@something.com'}]
key_func = lambda student: student['mail']
students.sort(key=key_func)  # sort by the same key function used with groupby
for key, group in itertools.groupby(students, key=key_func):
    print(key)
    print(list(group))
# @gmail.com
# [{'name': 'Paul', 'mail': '@gmail.com'},
#  {'name': 'Jim', 'mail': '@gmail.com'},
#  {'name': 'Gregory', 'mail': '@gmail.com'}]
# @something.com
# [{'name': 'Jules', 'mail': '@something.com'},
#  {'name': 'Kathrin', 'mail': '@something.com'}]
# @yahoo.com
# [{'name': 'Tom', 'mail': '@yahoo.com'}]
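If you need this in more than one place, the sort-then-group pattern could be wrapped in a small helper. This is only a sketch: the function name group_students_by_mail and the shortened student list are made up for illustration, and it assumes every record has a 'mail' and a 'name' key.

```python
import itertools

def group_students_by_mail(students):
    """Hypothetical helper: return {mail: [names]} for a list of student dicts."""
    key_func = lambda student: student['mail']
    # sorted() returns a new list, so the caller's list is left untouched.
    ordered = sorted(students, key=key_func)
    return {mail: [student['name'] for student in group]
            for mail, group in itertools.groupby(ordered, key=key_func)}

# Shortened sample data, assumed for this example only.
students = [{'name': 'Paul', 'mail': '@gmail.com'},
            {'name': 'Tom', 'mail': '@yahoo.com'},
            {'name': 'Jim', 'mail': '@gmail.com'}]

groups = group_students_by_mail(students)
print(groups)  # {'@gmail.com': ['Paul', 'Jim'], '@yahoo.com': ['Tom']}
```

Using sorted() instead of students.sort() is a design choice here: it keeps the original order of the input list intact, at the cost of one extra copy.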