bigrams.py 文件源码

python
阅读 39 收藏 0 点赞 0 评论 0

项目:textkit 作者: learntextvis 项目源码 文件源码
def words2bigrams(sep, tokens):
    '''Tokenize words into bigrams. Bigrams are two word tokens.
    Punctuation is considered as a separate token.'''

    content = read_tokens(tokens)
    bigrams = []
    try:
        bigrams = list(nltk.bigrams(content))
    except LookupError as err:
        click.echo(message="Error with tokenization", nl=True)
        click.echo(message="Have you run \"textkit download\"?", nl=True)
        click.echo(message="\nOriginal Error:", nl=True)
        click.echo(err)
    [output(sep.join(bigram)) for bigram in bigrams]
评论列表
文章目录


问题


面经


文章

微信
公众号

扫码关注公众号