# preprocess_data.py
import nltk


def tokenize_documents(documents):
    # Replace each document's raw text with a flat list of word tokens.
    # Splitting into sentences first lets word_tokenize see sentence boundaries.
    for document in documents:
        text = document.text
        tokenized_doc = []
        for sent in nltk.sent_tokenize(text):
            tokenized_doc += nltk.word_tokenize(sent)
        document.text = tokenized_doc
Source: preprocess_data.py (Python)
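A minimal usage sketch, assuming a simple Document class with a mutable text attribute (the class is hypothetical; the real document type is not shown in the source file) and that the NLTK punkt tokenizer models are available:

import nltk

# One-time download of the tokenizer models used by sent_tokenize/word_tokenize.
# Newer NLTK releases may additionally require the "punkt_tab" resource.
nltk.download("punkt")


class Document:
    # Hypothetical stand-in for whatever document type preprocess_data.py uses.
    def __init__(self, text):
        self.text = text


docs = [
    Document("Hello world. NLTK splits sentences first."),
    Document("Then each sentence is word-tokenized."),
]
tokenize_documents(docs)
print(docs[0].text)
# ['Hello', 'world', '.', 'NLTK', 'splits', 'sentences', 'first', '.']

Note that the function mutates each document in place, overwriting the original string with a token list, so callers that still need the raw text should copy it beforehand.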