movie_dataset.py source code

python

Project: CopyNet  Author: MultiPath

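import string

# Assumption: word2freq is a module-level dict defined elsewhere in
# movie_dataset.py; it is initialized here so the snippet runs standalone.
word2freq = {}


def mark(line):
    """Tokenize a line, updating the global word2freq counts."""
    tmp_line = ''
    for c in line:
        if c in string.punctuation:
            if c != "'":  # compare by value; `is not` tests identity
                # Pad punctuation on both sides so it becomes its own token.
                tmp_line += ' ' + c + ' '
            else:
                # Apostrophe: split only before it, so "it's" -> "it", "'s".
                tmp_line += ' ' + c
        else:
            tmp_line += c
    tmp_line = tmp_line.lower()
    words = tmp_line.split()  # split() never yields empty strings
    for w in words:
        word2freq[w] = word2freq.get(w, 0) + 1
    return words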
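
A minimal usage sketch of the `mark` function above, showing the tokens it produces for one sentence. The sample sentence is made up for illustration, and `collections.Counter` stands in for the `word2freq` table, which is an assumption: the original snippet does not show how `word2freq` is defined.

```python
import string
from collections import Counter

# Assumption: a Counter stands in for the module-level frequency table.
word2freq = Counter()


def mark(line):
    """Pad punctuation with spaces, lowercase, split, and count tokens."""
    pieces = []
    for c in line:
        if c in string.punctuation:
            # An apostrophe gets a leading space only, so "'s" stays one token.
            pieces.append(' ' + c if c == "'" else ' ' + c + ' ')
        else:
            pieces.append(c)
    words = ''.join(pieces).lower().split()
    word2freq.update(words)
    return words


tokens = mark("Hello, world! It's Bob's movie.")
print(tokens)
# ['hello', ',', 'world', '!', 'it', "'s", 'bob', "'s", 'movie', '.']
print(word2freq.most_common(1))
# [("'s", 2)]
```

Punctuation marks become separate tokens, while an apostrophe is split off together with the letters that follow it.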
