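How to count the top 10 most common values in a Python dictionary
I'm new to Python and to programming, so please be gentle. I'm trying to analyze a CSV file of music listening data and return the top n most-listened-to bands. In the code below, each song listen is a dict entry in a list, formatted like this: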
[{'album': 'Exile on Main Street', 'song': 'Happy', 'datetime': '3 Dec 2014 14:08', 'artist': 'The Rolling Stones'}, {'album': 'II', 'song': 'Black Dog', 'datetime': '1 Dec 2014 08:08', 'artist': 'Led Zepplin'}]
from collections import Counter

def count_artist_plays(filename):
    with open(filename, 'r') as data:
        header = data.readline().strip().split(',')
        entries = []
        for line in data:
            entry = line.strip().split(',')
            listens = {}
            for info, type in enumerate(header):
                listens[type] = entry[info]
            entries.append(listens)
    for d in entries:
        arts = d['artist']
        c = Counter(arts)
    print(c.most_common(10))
How do I get the most common strings (the bands) instead of the character breakdown shown below?
[('s', 2), ('a', 1), (' ', 1), ('E', 1), ('l', 1), ('o', 1), ('n', 1), ('S', 1), ('v', 1), ('y', 1)]
-
Initialize the Counter once, use the artists as its keys, and increment the count for the key (the artist) on each pass through the loop:
c = Counter()
for d in entries:
    arts = d['artist']
    c[arts] += 1
print(c.most_common(10))
When arts is a string, c = Counter(arts) counts the characters in arts:
In [522]: collections.Counter('Led Zepplin')
Out[522]: Counter({'e': 2, 'p': 2, ' ': 1, 'd': 1, 'i': 1, 'L': 1, 'l': 1, 'n': 1, 'Z': 1})
Incrementing the count for the whole string gives the intended result instead:
In [523]: c = collections.Counter()
In [524]: c['Led Zepplin'] += 1
In [525]: c['The Rolling Stones'] += 1
In [526]: c.most_common()
Out[526]: [('Led Zepplin', 1), ('The Rolling Stones', 1)]
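The difference comes from how Counter treats its argument: it counts the items of whatever iterable it receives, and iterating over a string yields its characters. A minimal sketch of that behavior follows; the plays list is made-up sample data, not taken from the question's file.

```python
from collections import Counter

# Iterating over a string yields characters, so each letter is counted.
by_char = Counter('Led Zepplin')

# Hypothetical sample data: iterating over a list yields whole strings,
# so each artist name is counted as a single item.
plays = ['Led Zepplin', 'The Rolling Stones', 'Led Zepplin']
by_artist = Counter(plays)

print(by_char['p'])              # 2
print(by_artist.most_common(1))  # [('Led Zepplin', 2)]
```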
Alternatively, as Jon Clements pointed out, you can hand all the artists to Counter at once and let it do the counting:
c = Counter(d['artist'] for d in entries)
print(c.most_common(10))
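Note that the code above uses a generator expression, which avoids building a (potentially) large temporary list and is also more concise and readable.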
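Putting the pieces together, one possible end-to-end version of the question's function is sketched below. It swaps the manual split(',') parsing for the standard library's csv.DictReader, which is a substitution rather than something the original answer uses, and it assumes the file has a header row containing an artist column. The n parameter is an addition for illustration.

```python
import csv
from collections import Counter


def count_artist_plays(filename, n=10):
    """Return the n most-played artists as (artist, plays) pairs."""
    with open(filename, newline='') as data:
        # DictReader takes the keys from the header row, so each row is a
        # dict like the entries shown in the question.
        reader = csv.DictReader(data)
        # Assumption: the header contains a column named 'artist'.
        counts = Counter(row['artist'] for row in reader)
    return counts.most_common(n)
```

Returning the list instead of printing it lets the caller decide how to display the result. Unlike a plain split(','), csv.DictReader also copes with quoted fields that contain commas.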