experiments.py source code

python


Project: clickbait  Author: bhargaviparanjape

import nltk


def n_gram_analysis_simple(infile, gram, stop):
    """Print the number of distinct n-grams in `infile`, then the 10 most frequent."""
    # `stop` is accepted but unused: stop-word filtering is disabled here.
    ngram = dict()
    # The context manager closes the file even if an error occurs.
    with open(infile, "r") as f:
        for l in f:
            # Tokenize on whitespace and slide a window of `gram` tokens over the line.
            for w in nltk.ngrams(l.split(), gram):
                if w in ngram:
                    ngram[w] += 1
                else:
                    ngram[w] = 1
    # Sort (n-gram, count) pairs by descending count.
    p = list(ngram.items())
    p.sort(key=lambda x: -x[1])
    print(len(p))
    for x in p[:10]:
        sen = ' '.join(x[0])
        cnt = int(x[1])
        print(sen, cnt)
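
For comparison, the same counting can be sketched with only the standard library. This is a minimal sketch, not the project's code. The names count_ngrams and top_ngrams are hypothetical, the functions take an iterable of lines instead of a file path, and n-grams are built with zip rather than nltk.ngrams. The sample sentences are invented for illustration.

```python
from collections import Counter


def count_ngrams(lines, n):
    """Count whitespace-token n-grams across an iterable of lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        # Zipping n staggered views of the token list yields each n-gram as a tuple.
        # Lines with fewer than n tokens contribute nothing.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


def top_ngrams(lines, n, k=10):
    """Return the k most frequent n-grams as (text, count) pairs (hypothetical helper)."""
    return [(" ".join(gram), cnt) for gram, cnt in count_ngrams(lines, n).most_common(k)]


if __name__ == "__main__":
    # Invented sample input, not data from the project.
    sample = ["you will not believe this", "you will not guess what happened"]
    for text, cnt in top_ngrams(sample, 2, k=3):
        print(text, cnt)
```

Returning the counts instead of printing them keeps the counting step reusable, and Counter.most_common replaces the manual sort on negated counts.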
