movie_sentiment.py source code

python

Project: StoryArcs Author: dfmcaleer
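import string
from nltk.stem import WordNetLemmatizer

def text_clean(filename):
    '''
    Input: File path of script.
    Output: List of all words in script lowercased, lemmatized, without punctuation.
    '''
    wnl = WordNetLemmatizer()
    # Read as UTF-8 and skip undecodable bytes (replaces the Python 2-only str.decode call).
    with open(filename, 'r', encoding='utf8', errors='ignore') as f:
        word_list = [word for line in f for word in line.split()]
    # Strip surrounding punctuation so the output matches the docstring.
    stripped = [word.strip(string.punctuation) for word in word_list]
    # Lowercase and lemmatize, dropping tokens that were only punctuation.
    lemma_list = [wnl.lemmatize(word.lower()) for word in stripped if word]
    return lemma_list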
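
The cleaning steps named in the docstring (lowercase, remove punctuation, lemmatize) can also be sketched on in-memory lines, which makes them easy to try without a script file. This is only an illustration: `clean_words` and its `lemmatize` parameter are hypothetical names that do not appear in the original file, and the default lemmatizer is the identity function so the sketch runs without NLTK. Passing `WordNetLemmatizer().lemmatize` instead should approximate the original behavior, assuming NLTK and its WordNet data are installed.

```python
import string
from typing import Callable, Iterable, List


def clean_words(lines: Iterable[str],
                lemmatize: Callable[[str], str] = lambda w: w) -> List[str]:
    """Lowercase, strip surrounding punctuation, and lemmatize each word.

    `clean_words` and `lemmatize` are illustrative names, not part of the
    original file. The default lemmatizer leaves each word unchanged.
    """
    cleaned = []
    for line in lines:
        for word in line.split():
            # Only leading and trailing punctuation is removed;
            # an apostrophe inside a word such as "it's" is kept.
            token = word.strip(string.punctuation).lower()
            if token:  # skip tokens that were only punctuation
                cleaned.append(lemmatize(token))
    return cleaned


sample = ["Hello, World!", "-- It's   here..."]
print(clean_words(sample))  # ['hello', 'world', "it's", 'here']
```

Taking the lemmatizer as a parameter keeps the tokenizing logic testable on its own; whether that separation is worthwhile for this project is a judgment call.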