normalize.py source file

python


Project: minke  Author: DistrictDataLabs

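import nltk  # assumed to be imported at module level in the original file

def normalize(self, words):
    """
    Normalizes a list of words, yielding each token that survives filtering.
    """
    # Add part of speech tags to the words (Penn Treebank tags such as 'NN', 'VBD')
    words = nltk.pos_tag(words)

    for word, tag in words:
        if self.lower:
            word = word.lower()
        if self.strip:
            word = word.strip()

        # Skip stopwords and tokens made up entirely of punctuation
        if word in self.stopwords or all(c in self.punct for c in word):
            continue

        if self.lemmatize:
            # Assuming self.lemmatizer is NLTK's WordNetLemmatizer, it expects
            # 'n', 'v', 'a' or 'r' rather than a Treebank tag, so map the tag first.
            pos = {'J': 'a', 'V': 'v', 'R': 'r'}.get(tag[:1], 'n')
            word = self.lemmatizer.lemmatize(word, pos)

        yield word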
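
The method above depends on several attributes of its host class (`lower`, `strip`, `stopwords`, `punct`, `lemmatize`, `lemmatizer`) that the snippet does not show. The following is a minimal sketch of how such a class might be wired together, not the minke implementation. The `Normalizer` constructor, the `tagger` argument, `treebank_to_wordnet`, `SuffixLemmatizer` and `toy_tagger` are all hypothetical names introduced for illustration. The tagger and lemmatizer are toy stand-ins so the example runs without NLTK corpora.

```python
import string


def treebank_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBD') to a WordNet POS letter."""
    # Adjectives start with J, verbs with V, adverbs with R; default to noun.
    return {'J': 'a', 'V': 'v', 'R': 'r'}.get(tag[:1], 'n')


class Normalizer(object):
    """Hypothetical host class; only the attribute names come from the snippet."""

    def __init__(self, tagger, lemmatizer, stopwords=(),
                 lower=True, strip=True, lemmatize=True):
        self.tagger = tagger          # callable: list of words -> list of (word, tag)
        self.lemmatizer = lemmatizer  # object exposing lemmatize(word, pos)
        self.stopwords = frozenset(stopwords)
        self.punct = frozenset(string.punctuation)
        self.lower = lower
        self.strip = strip
        self.lemmatize = lemmatize

    def normalize(self, words):
        """Yield lowercased, stripped, optionally lemmatized tokens."""
        for word, tag in self.tagger(words):
            if self.lower:
                word = word.lower()
            if self.strip:
                word = word.strip()
            # Drop stopwords and tokens that are punctuation only
            if word in self.stopwords or all(c in self.punct for c in word):
                continue
            if self.lemmatize:
                word = self.lemmatizer.lemmatize(word, treebank_to_wordnet(tag))
            yield word


class SuffixLemmatizer(object):
    """Toy stand-in for a real lemmatizer: strips a trailing 's' from nouns."""

    def lemmatize(self, word, pos='n'):
        if pos == 'n' and word.endswith('s') and len(word) > 3:
            return word[:-1]
        return word


def toy_tagger(words):
    """Toy stand-in for a POS tagger: tags every token as a plural noun."""
    return [(w, 'NNS') for w in words]


normalizer = Normalizer(toy_tagger, SuffixLemmatizer(), stopwords={'the'})
result = list(normalizer.normalize(['The', 'Cats', ' , ', 'chased', 'dogs', '!']))
print(result)  # ['cat', 'chased', 'dog']
```

With NLTK installed and its tagger and WordNet data downloaded, `nltk.pos_tag` and an instance of `nltk.stem.WordNetLemmatizer` could presumably be passed in place of the two toy stand-ins. Because `normalize` is a generator, wrap the call in `list(...)` when the full result is needed at once.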
