Implementing a Residual Network in TensorFlow (MNIST Dataset)

Introduction to Residual Networks

ResNet (Residual Network) is a deep convolutional neural network architecture proposed by Microsoft Research. Its distinguishing feature is "residual learning", which addresses the degradation problem that arises when training very deep networks.

In a plain convolutional network, the output of each convolutional layer is simply passed on to the next layer until it reaches the final fully connected layer. In a residual network, by contrast, the input of a block is also added directly to the block's output through a shortcut path; this kind of connection is called a "residual connection".
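Concretely, if a block of layers computes a function $F(x)$, the residual connection produces $y = F(x) + x$, so the layers only need to learn the residual $F(x) = y - x$ rather than the full mapping. As a minimal sketch (not the full block implemented below):

# Minimal sketch of a residual connection: the stacked layers learn
# the residual F(x), and the identity input x is added back.
def residual(x, F):
    return F(x) + x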

Implementing ResNet on the MNIST Dataset

In this article, we will use TensorFlow to implement a ResNet-based model for classifying MNIST handwritten digits. The code below uses the TensorFlow 1.x API (tf.layers, placeholders, and sessions).

Building the Model

First, we define a function that builds a residual block:

import tensorflow as tf

def residual_block(input_layer, output_channel):
    input_channel = input_layer.get_shape().as_list()[-1]

    # The block either keeps the shape unchanged or doubles the channel
    # count while halving the spatial resolution (stride 2).
    if input_channel * 2 == output_channel:
        increase_dim = True
        strides = [2, 2]
    elif input_channel == output_channel:
        increase_dim = False
        strides = [1, 1]
    else:
        raise ValueError('Output and input channels do not match in residual block!')

    conv1 = tf.layers.conv2d(input_layer, output_channel, [3, 3], strides=strides,
                             padding='same', activation=tf.nn.relu, name='conv1')
    conv2 = tf.layers.conv2d(conv1, output_channel, [3, 3], strides=[1, 1],
                             padding='same', activation=tf.nn.relu, name='conv2')

    if increase_dim:
        # Bring the shortcut to the same shape as conv2: average-pool to
        # halve the spatial size, then zero-pad the channel dimension.
        pooled_input = tf.layers.average_pooling2d(input_layer, [2, 2],
                                                   strides=[2, 2], padding='valid')
        padded_input = tf.pad(pooled_input, [[0, 0], [0, 0], [0, 0],
                                             [input_channel // 2, input_channel // 2]])
    else:
        padded_input = input_layer

    # The residual connection: add the shortcut to the convolution output.
    output = conv2 + padded_input
    return output
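To see how the two branches behave, here is a hedged shape check (assuming the function above is in scope; the variable scopes only keep the layer names from colliding):

# Sketch: the identity case keeps the shape, the doubling case halves
# the spatial size and doubles the channels.
inp = tf.placeholder(tf.float32, [None, 28, 28, 16])
with tf.variable_scope('same'):
    same = residual_block(inp, 16)   # -> (?, 28, 28, 16)
with tf.variable_scope('down'):
    down = residual_block(inp, 32)   # -> (?, 14, 14, 32)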

We then use this block to assemble the full model:

def res_net(x, num_blocks, num_classes):
    # Stem: a single 3x3 convolution lifts the input to 16 channels.
    conv0 = tf.layers.conv2d(x, 16, [3, 3], strides=[1, 1], padding='same',
                             activation=tf.nn.relu, name='conv0')

    res_input = conv0
    for i in range(num_blocks):
        with tf.variable_scope('block{}'.format(i)):
            res_output = residual_block(res_input, 16)
            res_input = res_output

    # Note: tf.layers.batch_normalization defaults to training=False here;
    # a production script would pass a training flag and run the update ops.
    bn = tf.layers.batch_normalization(res_output, axis=3)
    relu = tf.nn.relu(bn)

    # Average pooling followed by a fully connected classifier.
    out = tf.layers.average_pooling2d(relu, [7, 7], strides=[7, 7], padding='same')
    flatten = tf.layers.flatten(out)
    logits = tf.layers.dense(flatten, num_classes)
    return logits

In this model, we first apply a $3\times3$ convolution as the stem, pass the result through a number of residual blocks, and finally obtain the class scores via average pooling (the $7\times7$ window reduces each $28\times28$ feature map to $4\times4$) and a fully connected layer.
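As a quick sanity check (a sketch under the assumption that residual_block and res_net above are already defined), we can build the graph once and inspect the output shape:

# Hypothetical shape check, not part of the training script below.
tf.reset_default_graph()
images = tf.placeholder(tf.float32, [None, 28, 28, 1])
logits = res_net(images, num_blocks=1, num_classes=10)
print(logits.get_shape().as_list())  # [None, 10]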

Training and Evaluating the Model

Next, we train the model on the MNIST dataset:

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

learning_rate = 0.01
training_epochs = 10
batch_size = 128
display_step = 1

tf.reset_default_graph()

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
x_reshaped = tf.reshape(x, [-1, 28, 28, 1])

logits = res_net(x_reshaped, 1, 10)

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples / batch_size)
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            _, c = sess.run([optimizer, cross_entropy],
                            feed_dict={x: batch_xs, y: batch_ys})
            avg_cost += c / total_batch
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))
    print("Optimization Finished!")
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))

In the code above, we train the model for 10 epochs, using the Adam optimizer to minimize a softmax cross-entropy loss, and finally report accuracy on the test set as the evaluation metric.
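Evaluating all 10,000 test images in a single sess.run works for MNIST, but for larger models or inputs it can exhaust memory. One alternative (a sketch that reuses the mnist, x, y, and accuracy names defined above) is to average the accuracy over test batches:

# Sketch: batched test-set evaluation, assuming the tensors from the
# script above are in scope.
def evaluate_in_batches(sess, eval_batch_size=500):
    n_batches = mnist.test.num_examples // eval_batch_size
    total_acc = 0.
    for i in range(n_batches):
        xs = mnist.test.images[i * eval_batch_size:(i + 1) * eval_batch_size]
        ys = mnist.test.labels[i * eval_batch_size:(i + 1) * eval_batch_size]
        total_acc += sess.run(accuracy, feed_dict={x: xs, y: ys})
    return total_acc / n_batches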

Hyperparameter Tuning

To improve performance, we can tune various hyperparameters, such as the network depth, the number of channels, and the learning rate:

learning_rate = 0.05
training_epochs = 20
batch_size = 256

tf.reset_default_graph()

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
x_reshaped = tf.reshape(x, [-1, 28, 28, 1])

# A deeper network: three residual blocks instead of one.
logits = res_net(x_reshaped, 3, 10)

The loss, optimizer, accuracy computation, and training loop are identical to the previous section and are not repeated here.

With the learning rate raised to 0.05, the number of training epochs increased to 20, the batch size increased to 256, and the network deepened to three residual blocks, test accuracy improves by roughly 1% over the previous run.
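To explore these settings more systematically, one could wrap graph construction and training in a helper and sweep a small grid. The sketch below assumes a hypothetical train_and_eval(lr, num_blocks) function that performs the steps shown above and returns the test accuracy:

# Hypothetical grid search; train_and_eval is assumed to wrap the
# graph construction and training loop from the previous sections.
for lr in [0.01, 0.05]:
    for num_blocks in [1, 3]:
        acc = train_and_eval(lr, num_blocks)
        print('lr={}, blocks={}: test acc={:.4f}'.format(lr, num_blocks, acc))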
