TensorFlow convolutional neural network with images of different sizes

Posted on 2021-01-29 17:31:24

I am trying to create a deep CNN that can classify every individual pixel in an image. I am replicating the architecture from the image below, which is taken from this paper. The paper mentions that deconvolutions are used so that input of any size is possible, as can be seen in the image below.

Currently, I have hard-coded my model to accept images of size 32x32x7, but I would like to accept input of any size. What changes do I need to make to my code to accept variable-sized input?

 x = tf.placeholder(tf.float32, shape=[None, 32*32*7])
 y_ = tf.placeholder(tf.float32, shape=[None, 32*32*7, 3])
 ...
 DeConnv1 = tf.nn.conv3d_transpose(layer1, filter = w, output_shape = [1,32,32,7,1], strides = [1,2,2,2,1], padding = 'SAME')
 ...
 final = tf.reshape(final, [1, 32*32*7])
 W_final = weight_variable([32*32*7,32*32*7,3])
 b_final = bias_variable([32*32*7,3])
 final_conv = tf.tensordot(final, W_final, axes=[[1], [1]]) + b_final
1 Answer
  • 面试哥 2021-01-29

    Dynamic placeholders

    TensorFlow allows multiple dynamic (a.k.a. None) dimensions in placeholders. The engine won't be able to ensure correctness while the graph is built, so the client is responsible for feeding correct input, but it provides a lot of flexibility.

    So I'm going from...

    x = tf.placeholder(tf.float32, shape=[None, N*M*P])
    y_ = tf.placeholder(tf.float32, shape=[None, N*M*P, 3])
    ...
    x_image = tf.reshape(x, [-1, N, M, P, 1])
    

    ... to:

    # Nearly all dimensions are dynamic
    x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
    label = tf.placeholder(tf.float32, shape=[None, None, 3])
    

    Since you intend to reshape the input to 5D anyway, why not make x_image 5D right from the start. At this point, the second dimension of label is arbitrary, but we promise TensorFlow that it will match x_image.
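    That promise can be checked on the client side before feeding. The helper below is a hypothetical sketch (not part of the answer's graph): it verifies that label's second dimension equals the product of the image's spatial dimensions.

```python
import numpy as np

def shapes_compatible(image_batch, label_batch, num_classes=3):
    """Check the feed contract: label's voxel dimension must equal
    N * M * P of the image batch, and batch sizes must agree."""
    b, n, m, p, _ = image_batch.shape
    lb, voxels, classes = label_batch.shape
    return b == lb and voxels == n * m * p and classes == num_classes

img = np.zeros([2, 16, 16, 3, 1])
lbl = np.zeros([2, 16 * 16 * 3, 3])
print(shapes_compatible(img, lbl))  # True: 16*16*3 voxels match
```

    A mismatched feed would raise an error only at sess.run time, so checking eagerly like this gives clearer diagnostics.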

    Dynamic shapes in deconvolution

    Next, the nice thing about tf.nn.conv3d_transpose is that its output shape can be dynamic. So instead of this:

    # Hard-coded output shape
    DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=[1,32,32,7,1], ...)
    

    ... you can do this:

    # Dynamic output shape
    DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=tf.shape(x_image), ...)
    

    This way, the transposed convolution can be applied to any image, and the result takes the shape of whatever x_image was actually passed at runtime.

    Note that the static shape of x_image is (?, ?, ?, ?, 1).
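    For intuition on why output_shape=tf.shape(x_image) is accepted: with SAME padding, TensorFlow allows any output size whose forward convolution would have produced the input size, i.e. ceil(out / stride) == in. A small sketch of that rule (plain Python; the helper name is made up):

```python
import math

def valid_deconv_output(out_dim, in_dim, stride):
    """SAME-padding rule for tf.nn.conv3d_transpose: an output size
    is valid iff a forward stride-`stride` conv on it yields in_dim."""
    return math.ceil(out_dim / stride) == in_dim

# layer1's depth dim is ceil(7/2) = 4 after the stride-2 conv, so
# both 7 and 8 are valid deconvolution targets, but 9 is not:
print(valid_deconv_output(7, 4, 2))  # True
print(valid_deconv_output(8, 4, 2))  # True
print(valid_deconv_output(9, 4, 2))  # False
```

    This is why handing the original x_image shape back as output_shape always works: it is by construction a valid pre-image of layer1's shape.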

    Fully convolutional network

    The final and most important piece of the puzzle is making the whole network convolutional, and that includes the final dense layer. A dense layer must define its dimensions statically, which forces the whole neural network to fix the input image dimensions.

    Luckily for us, Springenberg et al. describe a way to replace an FC layer with a CONV layer in the paper "Striving for Simplicity: The All Convolutional Net". I'm going to use a convolution with 3 filters of size 1x1x1 (see also this question):

    final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
    y = tf.reshape(final_conv, [-1, 3])
    

    If we make sure that final has the same dimensions as DeConnv1 (and the others), it'll make y exactly the shape we want: [-1, N * M * P, 3].
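    The FC-to-1x1x1-conv equivalence is easy to verify numerically. The NumPy sketch below (independent of the answer's graph) shows that a 1x1x1 convolution with weights of shape [1, 1, 1, C_in, C_out] applies the same C_in-to-C_out linear map at every voxel, i.e. it is a per-voxel dense layer:

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.normal(size=(2, 4, 4, 3, 5))   # batch of 5-channel volumes
w = rng.normal(size=(5, 3))               # dense weights C_in -> C_out

# A 1x1x1 convolution contracts only the channel axis:
conv_out = np.tensordot(feat, w, axes=([4], [0]))        # (2, 4, 4, 3, 3)

# A per-voxel dense layer: flatten voxels, matmul, reshape back
dense_out = (feat.reshape(-1, 5) @ w).reshape(2, 4, 4, 3, 3)

print(np.allclose(conv_out, dense_out))  # True
```

    Because the weight tensor no longer depends on N, M, or P, the classification head works for any input size.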

    Putting it all together

    Your network is pretty big, but all deconvolutions basically follow the same pattern, so I've simplified my proof-of-concept code to just one deconvolution. The goal is simply to show what kind of network is able to handle images of arbitrary size. Final remark: image dimensions can vary between batches, but within one batch they have to be the same.

    Full code:

    import numpy as np
    import tensorflow as tf

    sess = tf.InteractiveSession()
    
    def conv3d_dilation(tempX, tempFilter):
      return tf.layers.conv3d(tempX, filters=tempFilter, kernel_size=[3, 3, 1], strides=1, padding='SAME', dilation_rate=2)
    
    def conv3d(tempX, tempW):
      return tf.nn.conv3d(tempX, tempW, strides=[1, 2, 2, 2, 1], padding='SAME')
    
    def conv3d_s1(tempX, tempW):
      return tf.nn.conv3d(tempX, tempW, strides=[1, 1, 1, 1, 1], padding='SAME')
    
    def weight_variable(shape):
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)
    
    def bias_variable(shape):
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)
    
    def max_pool_3x3(x):
      return tf.nn.max_pool3d(x, ksize=[1, 3, 3, 3, 1], strides=[1, 2, 2, 2, 1], padding='SAME')
    
    x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
    label = tf.placeholder(tf.float32, shape=[None, None, 3])
    
    W_conv1 = weight_variable([3, 3, 1, 1, 32])
    h_conv1 = conv3d(x_image, W_conv1)
    # second convolution
    W_conv2 = weight_variable([3, 3, 4, 32, 64])
    h_conv2 = conv3d_s1(h_conv1, W_conv2)
    # third convolution path 1
    W_conv3_A = weight_variable([1, 1, 1, 64, 64])
    h_conv3_A = conv3d_s1(h_conv2, W_conv3_A)
    # third convolution path 2
    W_conv3_B = weight_variable([1, 1, 1, 64, 64])
    h_conv3_B = conv3d_s1(h_conv2, W_conv3_B)
    # fourth convolution path 1
    W_conv4_A = weight_variable([3, 3, 1, 64, 96])
    h_conv4_A = conv3d_s1(h_conv3_A, W_conv4_A)
    # fourth convolution path 2
    W_conv4_B = weight_variable([1, 7, 1, 64, 64])
    h_conv4_B = conv3d_s1(h_conv3_B, W_conv4_B)
    # fifth convolution path 2
    W_conv5_B = weight_variable([1, 7, 1, 64, 64])
    h_conv5_B = conv3d_s1(h_conv4_B, W_conv5_B)
    # sixth convolution path 2
    W_conv6_B = weight_variable([3, 3, 1, 64, 96])
    h_conv6_B = conv3d_s1(h_conv5_B, W_conv6_B)
    # concatenation
    layer1 = tf.concat([h_conv4_A, h_conv6_B], 4)
    w = tf.Variable(tf.constant(1., shape=[2, 2, 4, 1, 192]))
    DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w, output_shape=tf.shape(x_image), strides=[1, 2, 2, 2, 1], padding='SAME')
    
    final = DeConnv1
    final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
    y = tf.reshape(final_conv, [-1, 3])
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=y))
    
    print('x_image:', x_image)
    print('DeConnv1:', DeConnv1)
    print('final_conv:', final_conv)
    
    def try_image(N, M, P, B=1):
      batch_x = np.random.normal(size=[B, N, M, P, 1])
      batch_y = np.ones([B, N * M * P, 3]) / 3.0
    
      deconv_val, final_conv_val, loss = sess.run([DeConnv1, final_conv, cross_entropy],
                                                  feed_dict={x_image: batch_x, label: batch_y})
      print(deconv_val.shape)
      print(final_conv_val.shape)
      print(loss)
      print()
    
    tf.global_variables_initializer().run()
    try_image(32, 32, 7)
    try_image(16, 16, 3)
    try_image(16, 16, 3, 2)
    

