utils_pg.py source code

python

Project: rl_algorithms  Author: DanielTakeshi
import numpy as np
import tensorflow as tf


def gauss_log_prob(mu, logstd, x):
    """ Computes the log probability of x, following the formula for the
    multivariate Gaussian density.

    All the inputs should have shape (n,a). The `gp_na` tensor contains the
    component-wise log probabilities, and the reduce_sum then produces a tensor
    of shape (n,) holding the log probability of each of the n elements. (We
    later take a mean over this.) The log(2*pi) term carries a factor of 1/2
    per component; it is not multiplied by the number of components (# of
    actions) explicitly because the reduce_sum adds it once per component.
    Finally, logstd does not need a 1/2 factor because
    (1/2)*log(sigma_i^2) = log(sigma_i).

    This formula generalizes to an arbitrary number of actions, BUT it assumes
    that the covariance matrix is diagonal.
    """
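    # Per-component variance: sigma^2 = exp(2 * log(sigma)).
    var_na = tf.exp(2*logstd)
    # Per-component log density. tf.log is the TensorFlow 1.x name; TensorFlow 2.x uses tf.math.log.
    gp_na = -tf.square(x - mu)/(2*var_na) - 0.5*tf.log(tf.constant(2*np.pi)) - logstd
    # Sum over the action dimension to get one log probability per sample.
    return tf.reduce_sum(gp_na, axis=[1])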
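
The formula can be sanity-checked without TensorFlow. The following is a minimal sketch, assuming a NumPy port is acceptable for checking; the name `gauss_log_prob_np` and the test inputs are hypothetical and not part of the original file. For a standard normal (mu = 0, logstd = 0) evaluated at x = 0 with a = 3 actions, each component contributes -0.5*log(2*pi), so each of the n entries should equal -1.5*log(2*pi).

```python
import numpy as np


def gauss_log_prob_np(mu, logstd, x):
    """Hypothetical NumPy port of gauss_log_prob; all inputs have shape (n, a)."""
    # Per-component variance: sigma^2 = exp(2 * log(sigma)).
    var_na = np.exp(2.0 * logstd)
    # Per-component log density of a diagonal Gaussian.
    gp_na = -np.square(x - mu) / (2.0 * var_na) - 0.5 * np.log(2.0 * np.pi) - logstd
    # Sum over the action dimension: one log probability per sample.
    return np.sum(gp_na, axis=1)


# Illustrative inputs: n = 2 samples, a = 3 actions, standard normal at its mean.
mu = np.zeros((2, 3))
logstd = np.zeros((2, 3))
x = np.zeros((2, 3))
log_prob_n = gauss_log_prob_np(mu, logstd, x)
expected = -1.5 * np.log(2.0 * np.pi)
```

Here `log_prob_n` has shape (2,), and both entries match `expected`, which is consistent with the docstring's remark that the sum over components supplies the factor of a for the log(2*pi) term.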