preprocess.py source code

python

Project: marseille  Author: vene
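import numpy as np


def optimize_glove(glove_path, vocab):
    """Trim down GloVe embeddings to use only words in the data."""
    vocab_set = frozenset(vocab)  # O(1) membership tests
    seen_vocab = []
    X = []
    # GloVe files are UTF-8 text: one token per line, followed by its vector.
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            # Split on a literal space: bare split() fails on ". . ."
            line = line.strip().split(' ')
            word, embed = line[0], line[1:]
            if word in vocab_set:
                X.append(np.array(embed, dtype=np.float32))
                seen_vocab.append(word)
    # np.row_stack is a deprecated alias of np.vstack (NumPy 2.0).
    return seen_vocab, np.vstack(X)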
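
Note that optimize_glove returns words in the order they appear in the GloVe file, not in the order of vocab, and silently skips words that have no embedding. The sketch below shows one way to realign the result with the original vocabulary. The helper name align_to_vocab and the zero-fill policy for missing words are assumptions made for illustration; they are not part of the marseille source.

```python
import numpy as np


def align_to_vocab(vocab, seen_vocab, X):
    """Hypothetical helper: reorder trimmed embeddings to follow `vocab`.

    Words absent from `seen_vocab` get an all-zero row (an assumption;
    random initialization is another common choice).
    """
    row_of = {word: i for i, word in enumerate(seen_vocab)}
    aligned = np.zeros((len(vocab), X.shape[1]), dtype=X.dtype)
    for j, word in enumerate(vocab):
        i = row_of.get(word)
        if i is not None:
            aligned[j] = X[i]
    return aligned


# Toy data standing in for the output of optimize_glove.
vocab = ["cat", "dog", "unseen"]
seen_vocab = ["dog", "cat"]
X = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
aligned = align_to_vocab(vocab, seen_vocab, X)
```

With this layout, row j of aligned corresponds to vocab[j], so the matrix can be indexed directly by vocabulary id.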