data_helpers.py source code

python

Project: CNN-Text-Pairs-Classification  Author: RandolphVI
import logging
import os

import gensim
import numpy as np


def load_word2vec_matrix(vocab_size, embedding_size):
    """
    Return the word2vec model matrix.
    :param vocab_size: The vocab size of the word2vec model file
    :param embedding_size: The embedding size
    :return: The word2vec model matrix, or None if the model file does not exist
    """
    word2vec_file = 'word2vec_' + str(embedding_size) + '.model'

    if os.path.isfile(word2vec_file):
        model = gensim.models.Word2Vec.load(word2vec_file)
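        # Map each word to its row index. This is the gensim < 4.0 API;
        # gensim >= 4.0 exposes the same mapping as model.wv.key_to_index.
        vocab = {k: v.index for k, v in model.wv.vocab.items()}
        vector = np.zeros([vocab_size, embedding_size])
        for key, value in vocab.items():
            if len(key) > 0:
                # Look vectors up through model.wv; indexing the model directly is deprecated.
                vector[value] = model.wv[key]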
        return vector
    else:
        logging.info("The word2vec file doesn't exist. "
                     "Please use function <create_vocab_size(embedding_size)> to create it!")
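
The matrix-filling step above can be tried without a trained model. The following is a minimal sketch, not the project's own code: `build_embedding_matrix`, `toy_index`, and `toy_vectors` are hypothetical names, plain dictionaries stand in for the gensim vocabulary and vectors, and the `index < vocab_size` guard is an added assumption to avoid writing past the end of the matrix.

```python
import numpy as np


def build_embedding_matrix(word_to_index, word_to_vector, vocab_size, embedding_size):
    """Fill a (vocab_size, embedding_size) matrix; rows with no vector stay zero."""
    matrix = np.zeros([vocab_size, embedding_size])
    for word, index in word_to_index.items():
        # Skip empty tokens (as the original does) and out-of-range indices (added guard).
        if len(word) > 0 and index < vocab_size:
            matrix[index] = word_to_vector[word]
    return matrix


# Toy data standing in for a trained word2vec model (hypothetical values).
toy_index = {"cat": 0, "dog": 2}
toy_vectors = {"cat": np.array([1.0, 2.0]), "dog": np.array([3.0, 4.0])}

matrix = build_embedding_matrix(toy_index, toy_vectors, vocab_size=4, embedding_size=2)
print(matrix.shape)
```

Rows 1 and 3 have no entry in `toy_index`, so they remain all zeros. The same thing happens in `load_word2vec_matrix` when `vocab_size` is larger than the number of words in the model.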