words_sim.py source code

python

Project: WordEmbedding  Author: ziliwang
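import numpy as np
from scipy.spatial import KDTree


def computeDistMatrices(emb1, emb2, gold):
    """Print precision@10 and MRR of emb1 -> emb2 retrieval against gold."""
    correct = 0
    MRR = 0.0
    # Target vocabulary and its vectors, kept in the same order.
    vob2 = list(emb2.keys())
    matrix2 = np.array([emb2[w] for w in vob2])

    kdtree = KDTree(matrix2, leafsize=100)
    for gold_en, gold_trans in gold.items():
        # Indices of the 10 nearest target vectors, closest first.
        d, index = kdtree.query(emb1[gold_en], k=10)
        for rank, i in enumerate(index, start=1):
            if vob2[i] in gold_trans:
                correct += 1
                # Reciprocal of the hit's rank in the neighbour list,
                # not of its vocabulary index i.
                MRR += 1.0 / rank
                break

    print('\nfinished!\n')
    # float() on the denominator keeps true division under Python 2 as well.
    print('{}/{} % age {}'.format(correct, len(gold),
          correct / float(len(gold))))
    print('{}/{} MRR {}'.format(MRR, len(gold),
          MRR / float(len(gold))))
    print("dist matrix done!")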
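
The two figures the function prints, the share of gold entries with a correct translation among the nearest neighbours and the mean reciprocal rank (MRR), can be checked by hand on toy data. The following is a minimal sketch, not the project's code: it swaps the KD-tree for a brute-force sort so that it needs only the standard library, and `top_k_neighbours`, `precision_and_mrr`, and the toy vectors are hypothetical names and values introduced for illustration. It assumes the reciprocal rank is taken from the hit's position in the neighbour list.

```python
import math


def top_k_neighbours(query, vectors, k):
    """Indices of the k vectors closest to query (Euclidean), nearest first."""
    order = sorted(range(len(vectors)),
                   key=lambda j: math.dist(query, vectors[j]))
    return order[:k]


def precision_and_mrr(emb1, emb2, gold, k=10):
    """Return (precision@k, MRR) of emb1 -> emb2 retrieval against gold."""
    vocab = list(emb2)
    vectors = [emb2[w] for w in vocab]
    hits, rr_sum = 0, 0.0
    for src, translations in gold.items():
        neighbours = top_k_neighbours(emb1[src], vectors, k)
        for rank, j in enumerate(neighbours, start=1):
            if vocab[j] in translations:
                hits += 1
                rr_sum += 1.0 / rank  # first correct hit only
                break
    n = len(gold)
    return hits / n, rr_sum / n


# Toy embeddings and gold dictionary (made up for this sketch).
emb1 = {"cat": [0.0, 0.0], "dog": [5.0, 5.0]}
emb2 = {"gato": [0.1, 0.0], "perro": [3.0, 3.0], "casa": [4.9, 5.0]}
gold = {"cat": {"gato"}, "dog": {"perro"}}

precision, mrr = precision_and_mrr(emb1, emb2, gold, k=2)
print(precision, mrr)
```

Here "cat" finds "gato" at rank 1 and "dog" finds "perro" at rank 2 (behind "casa"), so both entries count as hits and the MRR is (1 + 1/2) / 2.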