navigator.py source code

python

Project: mazerunner Author: lucasdavid
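def reward(self, a, s1):
    """Immediate reward for taking action `a` in this state and reaching `s1`.

    Index 0 of a state's data is the distance to the goal; the remaining
    entries are proximity sensor readings.
    """
    reward = 0
    s0, s1 = self.data, s1.data

    # Rewards related to states.
    # Apply the collision reward if any sensor in the current state
    # reads below the collision threshold.
    if any(proximity < ProximitySensor.COLLISION_THRESHOLD
           for proximity in s0[1:]):
        reward += self.IMMEDIATE_REWARD['collision']

    # Positive when the distance to the goal shrank, negative when it
    # grew, zero when it did not change.
    reward += (np.sign(s0[0] - s1[0]) *
               self.IMMEDIATE_REWARD['position-delta'])

    # Extra bonus when moving closer, scaled by how near the goal the
    # agent already was. 28 is presumably the largest expected distance.
    if s1[0] < s0[0]:
        reward_proximity = (self.IMMEDIATE_REWARD['close-to-goal'] *
                            (1 - s0[0] / 28))
        reward += reward_proximity
        logger.info('distance: %.2f, reward-proximity: %.2f',
                    s0[0], reward_proximity)

    # Rewards related to actions.
    reward += self.IMMEDIATE_REWARD[a]

    logger.info('reward: %.2f', reward)
    return reward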
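
The method above depends on the surrounding class, `ProximitySensor`, `np` and `logger`, so it cannot be run in isolation. The following is a minimal standalone sketch of the same calculation, not the project's implementation. The threshold, the reward values and the `'forward'` action name are hypothetical placeholders, since the real `IMMEDIATE_REWARD` table is defined elsewhere in the project. The small `sign` helper stands in for `np.sign` on scalars.

```python
# Hypothetical values for illustration only; the project defines its own.
COLLISION_THRESHOLD = 0.2
MAX_DISTANCE = 28  # the constant used in the method above
IMMEDIATE_REWARD = {
    'collision': -10.0,
    'position-delta': 1.0,
    'close-to-goal': 5.0,
    'forward': 0.5,  # hypothetical action name
}


def sign(x):
    """Return -1, 0 or 1, like np.sign does for a scalar."""
    return (x > 0) - (x < 0)


def reward(s0, action, s1):
    """Reward for taking `action` in state `s0` and reaching `s1`.

    Each state is a sequence: [distance_to_goal, proximity_1, ...].
    """
    total = 0.0

    # Collision reward if any sensor of the starting state is too close.
    if any(p < COLLISION_THRESHOLD for p in s0[1:]):
        total += IMMEDIATE_REWARD['collision']

    # +delta when the goal got closer, -delta when it got farther.
    total += sign(s0[0] - s1[0]) * IMMEDIATE_REWARD['position-delta']

    # Bonus for approaching, larger the nearer the agent started.
    if s1[0] < s0[0]:
        total += IMMEDIATE_REWARD['close-to-goal'] * (1 - s0[0] / MAX_DISTANCE)

    # Reward tied to the chosen action itself.
    total += IMMEDIATE_REWARD[action]
    return total


closer = reward([14.0, 1.0, 1.0], 'forward', [13.0, 1.0, 1.0])
crashed = reward([14.0, 0.1, 1.0], 'forward', [15.0, 0.1, 1.0])
```

With these placeholder values, `closer` combines the position-delta reward, half of the close-to-goal bonus and the action reward, while `crashed` combines the collision reward, the negative position-delta and the action reward.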