demo_a3c_doom.py source code

python


Project: async-rl  Author: muupan

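import chainer
import numpy as np


def eval_single_run(env, model, phi, deterministic=False):
    """Run one evaluation episode and return its total reward."""
    model.reset_state()  # reset the model's internal (e.g. recurrent) state
    test_r = 0
    obs = env.reset()
    done = False
    while not done:
        # Preprocess the observation with phi and add a batch dimension.
        s = chainer.Variable(np.expand_dims(phi(obs), 0))
        # pi_and_v returns the policy output first; the value is not used here.
        pout = model.pi_and_v(s)[0]
        # Cut the computational graph; evaluation needs no gradients.
        model.unchain_backward()
        if deterministic:
            a = pout.most_probable_actions[0]  # greedy action
        else:
            a = pout.action_indices[0]  # action sampled from the policy
        obs, r, done, info = env.step(a)
        test_r += r
    return test_r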
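
A single episode's return is noisy, especially when actions are sampled (deterministic=False), so evaluation scores are usually averaged over several runs. The following is a minimal sketch of such a wrapper, not part of async-rl: the name `evaluate_runs` and the stand-in episode returns are hypothetical, and the helper takes any zero-argument callable so it can be tried without Chainer or a Doom environment.

```python
import statistics


def evaluate_runs(run_episode, n_runs):
    """Call run_episode() n_runs times and summarize the returns.

    run_episode is any zero-argument callable that returns one episode's
    total reward, e.g. lambda: eval_single_run(env, model, phi).
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    returns = [run_episode() for _ in range(n_runs)]
    mean = statistics.mean(returns)
    # Population standard deviation; 0.0 when there is only one run.
    stdev = statistics.pstdev(returns)
    return mean, stdev


# Stand-in for real episodes: fixed returns instead of a Doom environment.
fake_returns = iter([10.0, 20.0, 30.0])
mean, stdev = evaluate_runs(lambda: next(fake_returns), n_runs=3)
print(mean, stdev)
```

With a real environment, the callable would wrap eval_single_run, for example `evaluate_runs(lambda: eval_single_run(env, model, phi), n_runs=10)`.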
