ul_policies.py source code - Python code snippet

python

Project: bolero  Author: rock-learning

def __call__(self, context, explore=True):
        """Evaluates policy for given context.

        Samples a weight vector from the distribution if explore is true;
        otherwise returns the distribution's mean (which depends on the context).

        Parameters
        ----------
        context: array-like, (n_context_dims,)
            context vector

        explore: bool
            If true, the weight vector is sampled from the distribution;
            otherwise the distribution's mean is returned.
        """
        if explore:
            return self.random_state.multivariate_normal(
                self.W.dot(context), self.Sigma, size=[1])[0]
        else:
            return self.W.dot(context)
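
The method above relies on three attributes of its owning object: `self.W`, `self.Sigma`, and `self.random_state`. The surrounding class is not shown in this snippet, so the following is only a minimal sketch of how the method could be exercised. The class name `LinearGaussianPolicySketch`, its constructor, and the array shapes are assumptions made for illustration; they are not taken from bolero.

```python
import numpy as np


class LinearGaussianPolicySketch:
    """Hypothetical stand-in for the class that owns __call__ (not bolero's class)."""

    def __init__(self, W, Sigma, random_state=None):
        # Assumed shapes: W is (n_weights, n_context_dims), Sigma is (n_weights, n_weights).
        self.W = np.asarray(W, dtype=float)
        self.Sigma = np.asarray(Sigma, dtype=float)
        self.random_state = np.random.RandomState(random_state)

    def __call__(self, context, explore=True):
        # The mean of the weight distribution is linear in the context.
        mean = self.W.dot(context)
        if explore:
            # Draw one sample from N(mean, Sigma); size=[1] yields shape (1, n_weights).
            return self.random_state.multivariate_normal(
                mean, self.Sigma, size=[1])[0]
        return mean


# Illustrative values only.
W = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
Sigma = 0.01 * np.eye(3)
context = np.array([0.5, -1.0])

policy = LinearGaussianPolicySketch(W, Sigma, random_state=0)
greedy = policy(context, explore=False)  # deterministic: W.dot(context)
sample = policy(context, explore=True)   # stochastic: one draw around the mean
```

With `explore=False` the call is deterministic and returns `W.dot(context)`. With `explore=True` the result varies from call to call unless the random state is seeded, which is why the sketch passes a fixed seed.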