Counting unique words in Python

Posted on 2021-01-29 14:10:39

So far, my code looks like this:

from glob import glob
pattern = "D:\\report\\shakeall\\*.txt"
filelist = glob(pattern)
def countwords(fp):
    with open(fp) as fh:
        return len(fh.read().split())
print("There are", sum(map(countwords, filelist)), "words in the files. From directory", pattern)

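I want to add code that counts the unique words across the files matched by pattern (there are 42 txt files in this path), but I don't know how to do it. Can anyone help?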

1 Answer

面试哥 2021-01-29

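The best way to count objects in Python is the collections.Counter class, which was created for exactly this purpose. It behaves like a Python dictionary but is a little easier to use for counting. You just pass it a list of objects and it counts them for you automatically.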
```
>>> from collections import Counter
>>> c = Counter(['hello', 'hello', 1])
>>> print(c)
Counter({'hello': 2, 1: 1})
```
Counter also has some useful methods, such as most_common; see the documentation to learn more.
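As a small illustration of most_common, here is a sketch on a made-up sentence (the sample text and the variable names are only for illustration, not from the question). It also shows that len() of a Counter gives the number of distinct items, which is the "unique words" figure being asked about:

```python
from collections import Counter

# Hypothetical sample text, split on whitespace into a list of words
words = "the cat and the hat and the bat".split()
counts = Counter(words)

# most_common(n) returns the n highest-count items as (item, count) pairs
top_two = counts.most_common(2)
print(top_two)        # [('the', 3), ('and', 2)]

# A Counter has one key per distinct item, so len() is the unique-word count
print(len(counts))    # 5
```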

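Another part of the Counter class that can be very useful is the update method. After instantiating a Counter from a list of objects, you can pass further lists to update and it will keep counting without discarding the counts it already holds: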
```
>>> from collections import Counter
>>> c = Counter(['hello', 'hello', 1])
>>> print(c)
Counter({'hello': 2, 1: 1})
>>> c.update(['hello'])
>>> print(c)
Counter({'hello': 3, 1: 1})
```
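Putting this together for the files in the question, one possible sketch is below. It is written for Python 3 and makes a few assumptions that are not in the original post: the files are UTF-8 text, words are separated by whitespace, and "The" and "the" should count as the same word. The function name count_words is made up for this example. Punctuation is not stripped, so "word" and "word," would still be counted separately.

```python
from collections import Counter
from glob import glob


def count_words(paths):
    """Return a Counter mapping each word to its total count across all files."""
    counts = Counter()
    for path in paths:
        # Assumption: the files are UTF-8 encoded text
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                # split() breaks on whitespace; lower() merges "The" and "the"
                counts.update(line.lower().split())
    return counts


if __name__ == "__main__":
    pattern = "D:\\report\\shakeall\\*.txt"  # path taken from the question
    counts = count_words(glob(pattern))
    # Sum of all counts is the total word count; number of keys is the unique count
    print("There are", sum(counts.values()), "words in the files.")
    print("Of these,", len(counts), "are unique.")
```

Reading line by line and calling update keeps one running Counter for all 42 files, so the total and the unique count both come out of a single pass.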
