phone_decoder.py source code

python

Project: make_dataset  Author: hyzhan
import Levenshtein as Lev  # assumed import: the python-Levenshtein package


def wer(self, s1, s2):
        """
        Computes the Word Error Rate, defined as the edit distance between the
        two provided sentences after tokenizing to words.
        Arguments:
            s1 (string): sentence whose words are delimited by '<space>'
            s2 (string): sentence whose words are delimited by '<space>'
        """

        # drop plain spaces, then split into words on the '<space>' token
        words1 = s1.replace(' ', '').split('<space>')
        words2 = s2.replace(' ', '').split('<space>')

        # build mapping of words to integers
        b = set(words1 + words2)
        word2char = dict(zip(b, range(len(b))))

        # map the words to a char array (the Levenshtein package only accepts
        # strings)
        w1 = [chr(word2char[w]) for w in words1]
        w2 = [chr(word2char[w]) for w in words2]

        return Lev.distance(''.join(w1), ''.join(w2))
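To check the word-level distance without installing the Levenshtein package, the same idea can be sketched in pure Python. This is only an illustration: `edit_distance` and `word_edit_distance` are hypothetical helper names, and the sample strings assume the input is a sequence of symbols with words delimited by `<space>`, as the method above suggests.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (pure Python, no dependencies)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_edit_distance(s1, s2, sep="<space>"):
    """Hypothetical helper: word-level edit distance for sep-delimited input."""
    # Drop plain spaces, then split into words on the separator token.
    w1 = s1.replace(" ", "").split(sep)
    w2 = s2.replace(" ", "").split(sep)
    # Compare lists of words directly; no word-to-character mapping is needed.
    return edit_distance(w1, w2)


ref = "h e l l o <space> w o r l d"
hyp = "h e l o <space> w o r l d"
print(word_edit_distance(ref, hyp))  # prints 1: "hello" vs "helo" is one word substitution
```

Like the method above, this returns a raw count of word edits. Dividing by the number of words in the reference would turn it into a rate.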