techs.py source code

python

Project: remotor  Author: jamiebull1
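from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.util import ngrams

# NOTE: `tags` is defined elsewhere in the original module (not shown here).
# It is assumed to be a collection of StackOverflow tag strings.


def get_tech(text):
    """Return the technologies mentioned in `text`.

    A technology is any of the top 1000 StackOverflow tags that appears
    as a single token, or as a hyphen-joined bigram or trigram.
    """
    techs = set()
    for s in sent_tokenize(text):
        tokens = word_tokenize(s)
        # Candidates: single tokens plus hyphen-joined 2- and 3-grams,
        # so multi-word phrases can match hyphenated tags.
        candidates = set(tokens)
        for n in (2, 3):
            candidates |= {'-'.join(ngram) for ngram in ngrams(tokens, n)}
        techs |= {tag for tag in tags if tag in candidates}
    return list(techs)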
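
The matching idea in get_tech can be tried without NLTK or the real tag list. The following is a minimal sketch under stated assumptions, not the project's implementation: EXAMPLE_TAGS is a hypothetical stand-in for the module-level tags, and simple_tokens, hyphen_ngrams and find_tags are illustrative names. The regex tokenizer and sentence splitter are crude substitutes for word_tokenize and sent_tokenize, and unlike the original, this sketch lowercases the text before matching.

```python
import re

# Hypothetical stand-in for the module-level `tags` collection.
EXAMPLE_TAGS = {"python", "ruby-on-rails", "sql-server", "java"}


def simple_tokens(sentence):
    """Lowercased word-like tokens; a crude substitute for word_tokenize."""
    return re.findall(r"[a-z0-9+#]+", sentence.lower())


def hyphen_ngrams(tokens, n):
    """All runs of n consecutive tokens, joined with hyphens."""
    return {"-".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def find_tags(text, tags=EXAMPLE_TAGS):
    """Return the tags found in text as 1-, 2- or 3-token phrases."""
    found = set()
    # Split on sentence-ending punctuation so n-grams never span sentences.
    for sentence in re.split(r"[.!?]+\s+", text):
        tokens = simple_tokens(sentence)
        for n in (1, 2, 3):
            found |= tags & hyphen_ngrams(tokens, n)
    return sorted(found)


print(find_tags("We use Python and Ruby on Rails. SQL Server is a plus."))
```

Joining n-grams with hyphens is what lets the phrase "Ruby on Rails" match the tag "ruby-on-rails", and intersecting sets avoids scanning the token list once per tag.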