critic_network.py source code

python

Project: -NIPS-2017-Learning-to-Run  Author: kyleliang919
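def create_q_network(self, state_dim, action_dim, scope):
    # Builds the critic Q(s, a) graph with the TensorFlow 1.x graph API.
    # LAYER1_SIZE, LAYER2_SIZE and self.variable are not defined in this excerpt.
    with tf.variable_scope(scope):
        # the layer size could be changed
        layer1_size = LAYER1_SIZE
        layer2_size = LAYER2_SIZE

        # Batched inputs: one row per (state, action) pair.
        state_input = tf.placeholder("float", [None, state_dim])
        action_input = tf.placeholder("float", [None, action_dim])

        # Hidden-layer parameters, created through the self.variable helper.
        W1 = self.variable([state_dim, layer1_size], state_dim)
        b1 = self.variable([layer1_size], state_dim)
        W2 = self.variable([layer1_size, layer2_size], layer1_size + action_dim)
        W2_action = self.variable([action_dim, layer2_size], layer1_size + action_dim)
        b2 = self.variable([layer2_size], layer1_size + action_dim)
        # Output-layer parameters, drawn uniformly from [-3e-3, 3e-3].
        W3 = tf.Variable(tf.random_uniform([layer2_size, 1], -3e-3, 3e-3))
        b3 = tf.Variable(tf.random_uniform([1], -3e-3, 3e-3))

        # The state passes through layer 1; the action joins at layer 2
        # through its own weight matrix W2_action.
        layer1 = tf.nn.relu(tf.matmul(state_input, W1) + b1)
        layer2 = tf.nn.relu(tf.matmul(layer1, W2) + tf.matmul(action_input, W2_action) + b2)
        # Linear output: one Q-value per input row.
        q_value_output = tf.identity(tf.matmul(layer2, W3) + b3)

        return state_input, action_input, q_value_output, [W1, b1, W2, W2_action, b2, W3, b3]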
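
The forward pass built by create_q_network can be traced without TensorFlow. The following is a minimal, dependency-free sketch, not the project's code. It assumes that self.variable(shape, f), which is not shown in this excerpt, draws uniformly from [-1/sqrt(f), 1/sqrt(f)]. The helper names (uniform, matmul, init_params, critic_forward) and the small dimensions are hypothetical and chosen only for illustration.

```python
import math
import random


def uniform(shape, bound, rng):
    """Vector (1-D shape) or matrix (2-D shape) drawn from U(-bound, bound)."""
    if len(shape) == 1:
        return [rng.uniform(-bound, bound) for _ in range(shape[0])]
    return [[rng.uniform(-bound, bound) for _ in range(shape[1])]
            for _ in range(shape[0])]


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns]
            for row in a]


def init_params(state_dim, action_dim, layer1_size, layer2_size, rng):
    """Same shapes and parameter order as the list returned by create_q_network."""
    # ASSUMPTION: self.variable(shape, f) samples from U(-1/sqrt(f), 1/sqrt(f)).
    bound1 = 1.0 / math.sqrt(state_dim)
    bound2 = 1.0 / math.sqrt(layer1_size + action_dim)
    return [
        uniform([state_dim, layer1_size], bound1, rng),    # W1
        uniform([layer1_size], bound1, rng),               # b1
        uniform([layer1_size, layer2_size], bound2, rng),  # W2
        uniform([action_dim, layer2_size], bound2, rng),   # W2_action
        uniform([layer2_size], bound2, rng),               # b2
        uniform([layer2_size, 1], 3e-3, rng),              # W3
        uniform([1], 3e-3, rng),                           # b3
    ]


def critic_forward(states, actions, params):
    """Q(s, a) for a batch: one row of states and one row of actions per sample."""
    W1, b1, W2, W2_action, b2, W3, b3 = params
    # layer1 = relu(states @ W1 + b1)
    layer1 = [[max(0.0, v + b) for v, b in zip(row, b1)]
              for row in matmul(states, W1)]
    # layer2 = relu(layer1 @ W2 + actions @ W2_action + b2)
    layer2 = [[max(0.0, h + a + b) for h, a, b in zip(h_row, a_row, b2)]
              for h_row, a_row in zip(matmul(layer1, W2),
                                      matmul(actions, W2_action))]
    # Linear output layer: a single Q-value per sample.
    return [[v + b3[0] for v in row] for row in matmul(layer2, W3)]


rng = random.Random(0)
params = init_params(state_dim=4, action_dim=2,
                     layer1_size=8, layer2_size=6, rng=rng)
states = uniform([3, 4], 1.0, rng)   # batch of 3 made-up states
actions = uniform([3, 2], 1.0, rng)  # batch of 3 made-up actions
q_values = critic_forward(states, actions, params)
print(len(q_values), len(q_values[0]))  # prints: 3 1
```

Feeding the action in at the second layer, rather than concatenating it with the state at the input, means the first layer learns features of the state alone. The small [-3e-3, 3e-3] range on the output layer keeps the initial Q-values close to zero.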