counts2vocab.py source code

python

Project: histwords  Author: williamleif
from collections import Counter

from docopt import docopt

# save_count_vocabulary is a helper from the histwords project;
# its import is not shown in this excerpt.


def main():
    args = docopt("""
    Usage:
        counts2vocab.py <counts>
    """)

    counts_path = args['<counts>']

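    # Accumulate the marginal count of every word and every context.
    words = Counter()
    contexts = Counter()
    with open(counts_path) as f:
        for line in f:
            # Each line has the form: <count> <word> <context>
            count, word, context = line.strip().split()
            count = int(count)
            words[word] += count
            contexts[context] += count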

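    # Sort by descending count. Tuple-unpacking lambdas such as
    # `lambda (x, y): y` are valid in Python 2 only.
    words = sorted(words.items(), key=lambda item: item[1], reverse=True)
    contexts = sorted(contexts.items(), key=lambda item: item[1], reverse=True)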

    # Write both vocabularies next to the input file.
    save_count_vocabulary(counts_path + '.words.vocab', words)
    save_count_vocabulary(counts_path + '.contexts.vocab', contexts)


if __name__ == '__main__':
    main()
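
The aggregation that main() performs can be tried without a counts file or the histwords helpers. The sketch below is not part of the project. `count_marginals` and `write_vocab` are hypothetical names, the sample lines are made up, and the one "item count" pair per line output format is an assumption, since the excerpt does not show what save_count_vocabulary writes. Unlike main(), the sketch skips malformed lines instead of raising an error.

```python
import io
from collections import Counter
from operator import itemgetter


def count_marginals(lines):
    """Sum '<count> <word> <context>' lines into per-word and per-context totals."""
    words, contexts = Counter(), Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # assumption: silently skip blank or malformed lines
        count, word, context = parts
        words[word] += int(count)
        contexts[context] += int(count)
    by_count = itemgetter(1)
    # Return both vocabularies as (item, count) pairs, most frequent first.
    return (sorted(words.items(), key=by_count, reverse=True),
            sorted(contexts.items(), key=by_count, reverse=True))


def write_vocab(stream, vocab):
    """Hypothetical stand-in for save_count_vocabulary (assumed output format)."""
    for item, count in vocab:
        stream.write(f"{item} {count}\n")


# Made-up sample data in the same three-column layout that main() reads.
sample = ["3 cat sat", "2 cat mat", "4 dog sat"]
words, contexts = count_marginals(sample)

out = io.StringIO()
write_vocab(out, words)
print(out.getvalue(), end="")
```

With this sample, "cat" totals 5 and "dog" totals 4, while the context "sat" totals 7 and "mat" totals 2, so each list comes back ordered from most to least frequent.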