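# word_tokenize is assumed to come from NLTK; INS_PUNCTS, EOS_PUNCTS, NUM,
# is_number and untokenize are defined elsewhere in this file.
from nltk.tokenize import word_tokenize


def process_line(line):
    tokens = word_tokenize(line)
    output_tokens = []
    for token in tokens:
        if token in INS_PUNCTS:
            # Replace tokens listed in INS_PUNCTS with their mapped value.
            output_tokens.append(INS_PUNCTS[token])
        elif token in EOS_PUNCTS:
            # Replace end-of-sentence punctuation with its mapped value.
            output_tokens.append(EOS_PUNCTS[token])
        elif is_number(token):
            # Collapse every numeric token into the single NUM placeholder.
            output_tokens.append(NUM)
        else:
            # Lowercase all remaining tokens.
            output_tokens.append(token.lower())
    # Join with spaces, keep a trailing space, then pass through untokenize.
    return untokenize(" ".join(output_tokens) + " ")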
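The function above depends on names that this excerpt does not show. As a rough, self-contained sketch of the same token-mapping idea, the version below substitutes assumed stand-ins: the contents of `INS_PUNCTS`, `EOS_PUNCTS` and `NUM` are hypothetical, `is_number` is guessed to be a float-parse check, `simple_tokenize` is a regex replacement for `word_tokenize`, and the `untokenize` step is left out entirely. The real definitions in the source file may differ.

```python
import re

# Hypothetical placeholder tables; the real INS_PUNCTS, EOS_PUNCTS and NUM
# are defined elsewhere in the source file and may differ.
INS_PUNCTS = {",": ",COMMA", ";": ";SEMICOLON", ":": ":COLON"}
EOS_PUNCTS = {".": ".PERIOD", "?": "?QUESTIONMARK", "!": "!EXCLAMATIONMARK"}
NUM = "<NUM>"


def is_number(token):
    """Assumed helper: True if the token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def simple_tokenize(line):
    """Stand-in for word_tokenize: numbers, words, or single punctuation marks."""
    return re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", line)


def process_line_sketch(line):
    """Same branching as process_line, minus the untokenize step."""
    output_tokens = []
    for token in simple_tokenize(line):
        if token in INS_PUNCTS:
            output_tokens.append(INS_PUNCTS[token])
        elif token in EOS_PUNCTS:
            output_tokens.append(EOS_PUNCTS[token])
        elif is_number(token):
            output_tokens.append(NUM)
        else:
            output_tokens.append(token.lower())
    return " ".join(output_tokens) + " "


print(process_line_sketch("Hello, I paid 42 dollars. Really?"))
# prints: hello ,COMMA i paid <NUM> dollars .PERIOD really ?QUESTIONMARK
```

The returned string ends with a trailing space, as in the original function. Note that a float-parse check also accepts words such as "nan" or "inf", so the real `is_number` may well be stricter.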
Source file: dont_run_me_run_the_other_script_instead.py (Python)