utils_pg.py 文件源码-python代码片段

utils_pg.py 文件源码

python

阅读 39 收藏 0 点赞 0 评论 0

项目：rl_algorithms 作者: DanielTakeshi 项目源码文件源码

def gauss_log_prob(mu, logstd, x):
    """ Used for computing the log probability, following the formula for the
    multivariate Gaussian density. 

    All the inputs should have shape (n,a). The `gp_na` contains component-wise
    probabilitiles, then the reduce_sum results in a tensor of size (n,) which
    contains the log probability for each of the n elements. (We later perform a
    mean on this.) Also, the 2*pi part needs 1/2, but doesn't need the sum over
    the number of components (# of actions) because of the reduce sum here.
    Finally, logstd doesn't need a 1/2 constant because log(\sigma_i^2) will
    bring the 2 over. 

    This formula generalizes for an arbitrary number of actions, BUT it assumes
    that the covariance matrix is diagonal.
    """
    var_na = tf.exp(2*logstd)
    gp_na = -tf.square(x - mu)/(2*var_na) - 0.5*tf.log(tf.constant(2*np.pi)) - logstd
    return tf.reduce_sum(gp_na, axis=[1])