asa.py source code

python

Project: ar-embeddings  Author: iamaziz
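# Assumed import: wordpunct_tokenize is provided by NLTK.
from nltk.tokenize import wordpunct_tokenize


def tokenize(text):
    """
    Split a paragraph into word and punctuation tokens.

    :param text: a paragraph string (UTF-8 encoded bytes on Python 2)
    :return: a tuple (words, length): the token list and its length,
             or (['NA'], 0) if the input cannot be tokenized
    """
    try:
        try:
            txt = unicode(text, 'utf-8')  # py2: decode bytes to unicode
        except NameError:
            txt = text  # py3: unicode() does not exist, str is already text
        words = wordpunct_tokenize(txt)
        length = len(words)
    except TypeError:
        # Non-string input (e.g. None or a number) falls back to a placeholder.
        words, length = ['NA'], 0

    return words, length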
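
To illustrate the shape of the return value, here is a minimal, dependency-free sketch. It assumes Python 3 and approximates NLTK's `wordpunct_tokenize` with the regular expression `\w+|[^\w\s]+`; the name `tokenize_sketch` is hypothetical and is not part of asa.py.

```python
import re

# Pattern assumed to mirror NLTK's WordPunctTokenizer: runs of word
# characters, or runs of characters that are neither word nor whitespace.
_WORDPUNCT = re.compile(r"\w+|[^\w\s]+")


def tokenize_sketch(text):
    """Return (tokens, count), or (['NA'], 0) for non-string input."""
    try:
        words = _WORDPUNCT.findall(text)
    except TypeError:
        # Raised for None, numbers, or bytes passed to a str pattern.
        return ['NA'], 0
    return words, len(words)


print(tokenize_sketch("Hello, world!"))  # (['Hello', ',', 'world', '!'], 4)
print(tokenize_sketch(None))             # (['NA'], 0)
```

Note that the fallback returns the placeholder token `'NA'` together with a length of 0, so callers that rely on the count should check it rather than `len(words)`.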