ocr.py source code

python

Project: tf-cnn-lstm-ocr-captcha    Author: Luonic
def get_lstm_layers(features, timesteps, batch_size):
    with tf.variable_scope('RNN'):
      # Has size [batch_size, max_stepsize, num_features], but the
      # batch_size and max_stepsize can vary along each step
      #tf.placeholder(tf.float32, [None, None, ocr_input.IMAGE_HEIGHT])
      inputs = features
      shape = tf.shape(features)
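      # Dynamic shape: the static batch_size argument is overridden here so the
      # graph handles variable batch sizes and sequence lengths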
      batch_size, max_timesteps = shape[0], shape[1]      

      # Defining the cell
      # Can be:
      #   tf.nn.rnn_cell.RNNCell
      #   tf.nn.rnn_cell.GRUCell
      def lstm_cell():
        return tf.contrib.rnn.core_rnn_cell.LSTMCell(LSTM_HIDDEN_SIZE,
                                                     state_is_tuple=True)

      # Stacking rnn cells. Each layer gets its own cell instance;
      # reusing a single object via [cell] * NUM_LSTM_LAYERS makes the layers
      # share weights in newer TF releases.
      stack = tf.contrib.rnn.core_rnn_cell.MultiRNNCell(
          [lstm_cell() for _ in range(NUM_LSTM_LAYERS)], state_is_tuple=True)

      # The second output is the final state, which we do not use here
      outputs, _ = tf.nn.dynamic_rnn(stack, inputs, sequence_length=timesteps,
                                     dtype=tf.float32)

      # Reshaping to apply the same weights over the timesteps
      outputs = tf.reshape(outputs, [-1, LSTM_HIDDEN_SIZE])
      # outputs = tf.Print(outputs, [outputs], "Outputs")

      with tf.variable_scope('logits'):
        w = tf.Variable(tf.truncated_normal([LSTM_HIDDEN_SIZE, NUM_CLASSES],
                                            stddev=0.1), name="w")
        b = tf.Variable(tf.constant(0., shape=[NUM_CLASSES]), name="b")

        # Doing the affine projection
        logits = tf.matmul(outputs, w) + b

        # Reshaping back to the original shape
        logits = tf.reshape(logits, [batch_size, -1, NUM_CLASSES])

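        # Transpose to time-major [max_timesteps, batch_size, NUM_CLASSES],
        # the default layout expected by tf.nn.ctc_loss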
        logits = tf.transpose(logits, [1, 0, 2], name="out_logits")

      return logits
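
A minimal usage sketch, assuming TensorFlow 1.x (imported as tf) and that LSTM_HIDDEN_SIZE, NUM_LSTM_LAYERS and NUM_CLASSES are module-level constants defined elsewhere in ocr.py; the constant values and the feature depth of 64 below are illustrative, not taken from the project:

import tensorflow as tf

# Hypothetical values; the real constants live elsewhere in ocr.py
LSTM_HIDDEN_SIZE = 256
NUM_LSTM_LAYERS = 2
NUM_CLASSES = 37  # e.g. 26 letters + 10 digits + 1 CTC blank

# CNN feature maps flattened to [batch_size, timesteps, num_features];
# both batch size and sequence length are left dynamic
features = tf.placeholder(tf.float32, [None, None, 64], name="features")
timesteps = tf.placeholder(tf.int32, [None], name="timesteps")

logits = get_lstm_layers(features, timesteps, batch_size=None)
# logits is time-major, [max_timesteps, batch_size, NUM_CLASSES],
# ready to be passed to tf.nn.ctc_loss together with sparse labels.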