entropy.py source code

python

Project: cluster_paraphrases  Author: acocos
import numpy as np


def shannon_entropy(freq, unit='bit'):
    """Calculates the Shannon entropy (H) of a frequency distribution.

    Arguments:

        - freq (``numpy.ndarray``) A ``Freq`` instance or ``numpy.ndarray`` with
          frequency vectors along the last axis.
        - unit (``str``) The unit of the returned entropy, one of 'bit', 'digit'
          or 'nat'.

    Returns a ``numpy.ndarray`` of entropies with shape ``freq.shape[:-1]``.
    """
    # get_base is not shown in this excerpt; it is expected to return the
    # logarithm function that matches the requested unit
    log = get_base(unit)
    shape = freq.shape  # remember the input shape so the result can match it
    # flatten to a 2-D array with one frequency vector per row
    freq = np.atleast_2d(freq.reshape((-1, shape[-1])))
    Hs = np.empty(freq.shape[0])  # one entropy per frequency vector
    for i, row in enumerate(freq):
        nonzero = row[row != 0.]  # drop zeros, since 0 * log(0) is taken as 0
        Hs[i] = -np.sum(nonzero * log(nonzero))
    # reshape returns a new array, so its result has to be returned explicitly
    return Hs.reshape(shape[:-1])
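
The function above relies on get_base, which is not part of this excerpt. The following is a minimal sketch of how the pieces might fit together, not the project's actual code: the get_base shown here is a hypothetical stand-in that maps each unit to a NumPy logarithm, and entropy_rows is an illustrative, vectorized variant of the same computation (H = -sum(p * log p) along the last axis, with zero frequencies contributing nothing).

```python
import numpy as np

# Hypothetical stand-in for get_base, which the excerpt does not show.
_LOGS = {'bit': np.log2, 'digit': np.log10, 'nat': np.log}


def get_base(unit):
    """Return the logarithm function matching the entropy unit (assumed behavior)."""
    try:
        return _LOGS[unit]
    except KeyError:
        raise ValueError("unit must be one of 'bit', 'digit' or 'nat'")


def entropy_rows(freq, unit='bit'):
    """Shannon entropy along the last axis, treating 0 * log(0) as 0."""
    freq = np.asarray(freq, dtype=float)
    log = get_base(unit)
    logs = np.zeros_like(freq)
    positive = freq > 0
    logs[positive] = log(freq[positive])  # take logs only where they are defined
    return -np.sum(freq * logs, axis=-1)


# A fair coin carries 1 bit of entropy.
fair_coin = entropy_rows([0.5, 0.5])

# One frequency vector per row: a uniform 4-way choice and a certain outcome.
batch = entropy_rows([[0.25, 0.25, 0.25, 0.25],
                      [1.0, 0.0, 0.0, 0.0]])
```

Summing along the last axis keeps the leading dimensions of the input, so the result already has the shape freq.shape[:-1] without an explicit reshape.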