ul_policies.py source code (Python code snippet)

Project: bolero  Author: rock-learning

    def __call__(self, context=None, explore=True):
        """Evaluates policy.

        Samples a weight vector from the distribution if explore is true,
        otherwise returns the distribution's mean.

        Parameters
        ----------
        context : array-like, (n_context_dims,)
            context vector (ignored by this policy, defaults to None)

        explore : bool
            If true, the weight vector is sampled from the distribution;
            otherwise the distribution's mean is returned.

        Returns
        -------
        parameter_vector : array-like, (n_weights,)
            the selected parameters
        """
        # Note: Context is ignored
        if not explore:
            return self.mean
        else:
            # Sample from normal distribution
            return self.random_state.multivariate_normal(
                mean=self.mean, cov=self.Sigma, size=1)[0]
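
The method above is shown without its enclosing class, so the following is only a minimal sketch of how it might be exercised. The class name `GaussianPolicySketch` and its constructor are hypothetical and are not taken from bolero; the sketch assumes `self.random_state` is a `numpy.random.RandomState` (which the call to `multivariate_normal` suggests) and that `self.mean` and `self.Sigma` are a NumPy vector and covariance matrix.

```python
import numpy as np


class GaussianPolicySketch:
    """Hypothetical stand-in for the class that owns __call__ above."""

    def __init__(self, mean, Sigma, random_state=None):
        # Assumed attributes: the excerpt only shows that they are read.
        self.mean = np.asarray(mean, dtype=float)
        self.Sigma = np.asarray(Sigma, dtype=float)
        # Assumption: a seedable numpy RandomState drives the sampling.
        self.random_state = np.random.RandomState(random_state)

    def __call__(self, context=None, explore=True):
        # Context is ignored, exactly as in the excerpt.
        if not explore:
            return self.mean
        # size=1 yields an array of shape (1, n_weights); [0] unwraps it.
        return self.random_state.multivariate_normal(
            mean=self.mean, cov=self.Sigma, size=1)[0]


policy = GaussianPolicySketch(
    mean=[0.0, 1.0], Sigma=0.01 * np.eye(2), random_state=0)
greedy = policy(explore=False)  # the distribution's mean, no randomness
sample = policy(explore=True)   # one random draw with shape (2,)
```

With `explore=False` the excerpt returns `self.mean` itself rather than a copy, so a caller that modifies the returned array in place would also change the policy's mean. Returning `self.mean.copy()` would avoid that if it matters for the use case.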