topics_extraction.py source code

python


Project: my_topics  Author: GaelVaroquaux

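# Presumably a method of a TfidfVectorizer subclass (the enclosing class is
# not shown in this excerpt); no_plural_stemmer is defined elsewhere.
def build_analyzer(self):
    # Build the inherited analyzer (preprocessing, tokenization, stop words).
    analyzer = super(TfidfVectorizer, self).build_analyzer()
    # Wrap it so that every token it yields is passed through the stemmer.
    return lambda doc: (no_plural_stemmer(w) for w in analyzer(doc))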
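
The override above only makes sense inside a vectorizer subclass, so here is a minimal, self-contained sketch of how it might be wired up. The class name `StemmedTfidfVectorizer` and the body of `no_plural_stemmer` are assumptions made for illustration; the excerpt shows neither, and the real stemmer may behave differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def no_plural_stemmer(word):
    """Hypothetical stand-in: drop a trailing 's' from longer words."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


class StemmedTfidfVectorizer(TfidfVectorizer):  # hypothetical class name
    def build_analyzer(self):
        # Reuse the inherited analyzer (Python 3 zero-argument super()).
        analyzer = super().build_analyzer()
        # Stem each token the inherited analyzer yields.
        return lambda doc: (no_plural_stemmer(w) for w in analyzer(doc))


vectorizer = StemmedTfidfVectorizer()
X = vectorizer.fit_transform(["cats chase dogs", "a dog chases a cat"])
vocab = sorted(vectorizer.vocabulary_)
print(vocab)
```

With this toy stemmer, singular and plural forms collapse onto one feature, so the two sample sentences end up sharing the same vocabulary.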

# We use a few heuristics to filter out useless terms early on: the posts
# are stripped of headers, footers and quoted replies, and common English
# words, words occurring in only one document or in at least 95% of the
# documents are removed.
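
The heuristics described above map naturally onto standard scikit-learn options. The sketch below is an assumption about how they could be expressed, not the file's actual code: `stop_words="english"` drops common English words, `min_df=2` drops words seen in only one document, and `max_df=0.95` drops words seen in more than 95% of documents. Header, footer and quote stripping is typically requested at load time, for example with `fetch_20newsgroups(remove=("headers", "footers", "quotes"))`, which is only mentioned here to keep the example offline. The four documents are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up toy corpus: "forum" appears everywhere, "graphics" and "space"
# appear in two documents each, every other content word appears once.
docs = [
    "the forum graphics card",
    "the forum graphics driver",
    "a forum space shuttle",
    "the forum space orbit",
]

filtering_vectorizer = TfidfVectorizer(
    stop_words="english",  # remove common English words
    min_df=2,              # remove words found in only one document
    max_df=0.95,           # remove words found in more than 95% of documents
)
tfidf = filtering_vectorizer.fit_transform(docs)
kept_terms = sorted(filtering_vectorizer.vocabulary_)
print(kept_terms)
```

On a corpus this small the thresholds are very aggressive; they are meant for collections of thousands of posts.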

# Use tf-idf features for NMF.
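
As a rough illustration of this step, the sketch below factorizes a tf-idf matrix with scikit-learn's `NMF`. The corpus, the number of topics and the `init`, `random_state` and `max_iter` values are all assumptions chosen for the example; the excerpt does not show the settings the file uses.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up corpus with two loose themes (graphics hardware, space flight).
docs = [
    "graphics card driver update",
    "graphics driver crash on boot",
    "space shuttle launch orbit",
    "orbit of the space station",
]

tfidf_vectorizer = TfidfVectorizer(stop_words="english")
tfidf = tfidf_vectorizer.fit_transform(docs)

# Factorize tfidf ~= W @ H with non-negative factors.
nmf = NMF(n_components=2, init="nndsvd", random_state=0, max_iter=500)
W = nmf.fit_transform(tfidf)  # document-topic weights, shape (n_docs, 2)
H = nmf.components_           # topic-term weights, shape (2, n_terms)

# Print the three highest-weighted terms of each topic.
terms = sorted(tfidf_vectorizer.vocabulary_, key=tfidf_vectorizer.vocabulary_.get)
for topic_idx, weights in enumerate(H):
    top = [terms[i] for i in weights.argsort()[::-1][:3]]
    print(f"Topic {topic_idx}: {' '.join(top)}")
```

Each row of `H` can be read as a topic, with larger weights marking the terms that characterize it.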
