I. Introduction to ShuffleNet
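ShuffleNet is a lightweight convolutional neural network architecture designed for low latency with high accuracy. Compared with conventional architectures, it introduces a new building-block idea, channel shuffle, which lets the network rely on cheap group convolutions (fewer parameters and less computation) while keeping accuracy up.
Channel shuffle permutes the channels so that channels from different groups are mixed together before the next layer sees them. The shuffle itself has no parameters; the savings in parameters and computation come from the group convolutions that it makes practical. ShuffleNet uses two building blocks: the shuffle unit and point-wise group convolution.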
1. Shuffle Unit
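The shuffle unit is the core building block of ShuffleNet. It combines group convolution with channel shuffle to reduce parameters and computation. A shuffle unit consists of three parts:
Group convolution
Channel shuffle
Pointwise (1x1) convolution
Group convolution and pointwise convolution work the same way as in conventional convolutional networks; the distinctive step is the channel shuffle, which is done in two steps. First, the channels are divided into several groups, each with the same number of channels. Second, the channels are interleaved across the groups and merged back into a single tensor, so that neighbouring output channels come from different input groups.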
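The two steps above can be followed on plain channel indices. The sketch below is a minimal illustration in pure Python, not part of the Keras implementation; the function name `channel_shuffle_indices` is hypothetical, and the channel count is assumed to be divisible by the number of groups.

```python
def channel_shuffle_indices(num_channels, groups):
    """Return, for each output position, the input channel index placed there."""
    if num_channels % groups != 0:
        raise ValueError("num_channels must be divisible by groups")
    channels_per_group = num_channels // groups
    # Step 1: split the channel indices into `groups` equally sized groups.
    grouped = [
        list(range(g * channels_per_group, (g + 1) * channels_per_group))
        for g in range(groups)
    ]
    # Step 2: interleave the groups, taking the first channel of every group,
    # then the second, and so on. This is the transpose of the
    # (groups, channels_per_group) grid, flattened back into one list.
    return [grouped[g][c] for c in range(channels_per_group) for g in range(groups)]


if __name__ == "__main__":
    print(channel_shuffle_indices(6, 3))  # [0, 2, 4, 1, 3, 5]
```

With 6 channels in 3 groups, the groups are [0, 1], [2, 3] and [4, 5]; after the shuffle, neighbouring positions hold channels from different groups.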
2. Point-wise Group Convolution
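Point-wise group convolution extracts features with 1x1 convolutions and splits each 1x1 convolution into several groups that are computed independently. With g groups, every output channel only connects to 1/g of the input channels, so the weight count and the computation both drop by a factor of g.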
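The saving can be checked with simple arithmetic. The following sketch counts the weights and multiplications of a bias-free 1x1 convolution; the helper names and the channel counts in the example are illustrative assumptions, not values taken from the network in this article.

```python
def pointwise_conv_params(in_channels, out_channels, groups=1):
    """Weight count of a bias-free 1x1 convolution split into `groups` groups."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps in_channels/groups inputs to out_channels/groups outputs.
    per_group = (in_channels // groups) * (out_channels // groups)
    return groups * per_group


def pointwise_conv_multiplies(height, width, in_channels, out_channels, groups=1):
    """Multiplications needed to apply that convolution to one H x W feature map."""
    return height * width * pointwise_conv_params(in_channels, out_channels, groups)


if __name__ == "__main__":
    print(pointwise_conv_params(240, 240, groups=1))  # 57600
    print(pointwise_conv_params(240, 240, groups=3))  # 19200
```

Going from 1 group to 3 groups divides both counts by 3, which is the reduction described above.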
II. Implementing ShuffleNet in Keras
Implementing the ShuffleNet architecture in Keras relies on the TensorFlow backend. A ShuffleNet implementation is given below.
from keras import backend as K
from keras.layers import (Input, Conv2D, DepthwiseConv2D, BatchNormalization,
                          MaxPooling2D, Add, ReLU, Lambda,
                          GlobalAveragePooling2D, Dense)
from keras.models import Model


def channel_shuffle(x, groups):
    """Interleave the channels of a channels_last tensor across `groups` groups."""
    # Needs a fully defined static shape; in_channels must be divisible by groups.
    height, width, in_channels = x.shape.as_list()[1:]
    channels_per_group = in_channels // groups
    # Reshape x to (?, height, width, groups, channels_per_group).
    x = K.reshape(x, [-1, height, width, groups, channels_per_group])
    # Transpose x to (?, height, width, channels_per_group, groups).
    x = K.permute_dimensions(x, (0, 1, 2, 4, 3))
    # Reshape x back to (?, height, width, in_channels).
    x = K.reshape(x, [-1, height, width, in_channels])
    return x


def shuffle_unit(inputs, out_channels, strides=2, bottleneck_ratio=1, groups=1, stage=1, block=1):
    prefix = f'stage{stage}_block{block}_'
    bottleneck_channels = int(out_channels * bottleneck_ratio)
    if K.image_data_format() == 'channels_last':  # tf
        bn_axis = 3
    else:
        bn_axis = 1

    # 1x1 convolution and channel shuffle: the channels are split into `groups`
    # groups and then interleaved across them.
    # Note: despite its layer name, this is a regular Conv2D, not a grouped one.
    x = Conv2D(bottleneck_channels, (1, 1), strides=1, padding='same',
               use_bias=False, name=prefix + 'gconv1')(inputs)
    x = BatchNormalization(axis=bn_axis, name=prefix + 'bn1')(x)
    x = ReLU(name=prefix + 'relu1')(x)
    # Wrap the backend ops in a Lambda layer so the result stays a Keras tensor.
    x = Lambda(channel_shuffle, arguments={'groups': groups},
               name=prefix + 'shuffle')(x)

    # Depthwise convolution (DepthwiseConv2D) followed by a pointwise
    # convolution (Conv2D with a 1x1 filter).
    x = DepthwiseConv2D((3, 3), strides=strides, padding='same', use_bias=False,
                        name=prefix + 'dwconv')(x)
    x = BatchNormalization(axis=bn_axis, name=prefix + 'bn2')(x)
    x = Conv2D(out_channels, (1, 1), strides=1, padding='same',
               use_bias=False, name=prefix + 'gconv2')(x)
    x = BatchNormalization(axis=bn_axis, name=prefix + 'bn3')(x)
    x = ReLU(name=prefix + 'relu2')(x)

    # Project the shortcut whenever its spatial size or channel count differs
    # from the main branch, so that Add receives tensors of identical shape.
    input_channels = inputs.shape.as_list()[bn_axis]
    if strides != 1 or input_channels != out_channels:
        inputs = Conv2D(out_channels, (1, 1), strides=strides, padding='same',
                        use_bias=False, name=prefix + 'skip_conv')(inputs)
        inputs = BatchNormalization(axis=bn_axis, name=prefix + 'skip_bn')(inputs)
    out = Add(name=prefix + 'add')([x, inputs])
    return out


def shuffle_netV2(input_shape=(224, 224, 3), scale_factor=1.0, num_classes=1000):
    """Build the classifier from the shuffle units defined above."""
    inputs = Input(shape=input_shape)
    if K.image_data_format() == 'channels_last':  # tf
        bn_axis = 3
    else:
        bn_axis = 1

    # Stem: 3x3 convolution and max pooling, each halving the spatial size.
    first_filters = int(24 * scale_factor)
    x = Conv2D(first_filters, (3, 3), strides=2, padding='same',
               use_bias=False, name='conv1')(inputs)
    x = BatchNormalization(axis=bn_axis, name='bn1')(x)
    x = ReLU(name='relu1')(x)
    x = MaxPooling2D(pool_size=3, strides=2, padding='same', name='maxpool1')(x)

    # Shuffle units; the scaled channel counts must stay divisible by groups.
    x = shuffle_unit(x, out_channels=int(48 * scale_factor), strides=2, groups=3, stage=2, block=1)
    x = shuffle_unit(x, out_channels=int(96 * scale_factor), strides=2, groups=3, stage=2, block=2)
    x = shuffle_unit(x, out_channels=int(192 * scale_factor), strides=2, groups=3, stage=2, block=3)
    x = shuffle_unit(x, out_channels=int(384 * scale_factor), strides=2, groups=3, stage=3, block=1)
    x = shuffle_unit(x, out_channels=int(576 * scale_factor), strides=2, groups=3, stage=3, block=2)
    x = shuffle_unit(x, out_channels=int(960 * scale_factor), strides=2, groups=3, stage=3, block=3)

    # Classification head.
    x = Conv2D(int(1024 * scale_factor), (1, 1), strides=1, padding='same',
               use_bias=False, name='conv_last')(x)
    x = BatchNormalization(axis=bn_axis, name='conv_last_bn')(x)
    x = ReLU(name='conv_last_relu')(x)
    x = GlobalAveragePooling2D(name='global_avg_pool')(x)
    x = Dense(num_classes, activation='softmax', name='fc')(x)
    model = Model(inputs=inputs, outputs=x, name='ShuffleNetV2')
    return model
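To see what the stride settings in shuffle_netV2 imply, the spatial size of the feature map can be traced by hand. The sketch below is pure Python and does not call Keras. It assumes that every layer uses padding='same', for which the output length is ceil(input / stride), and that conv1, maxpool1 and all six shuffle_unit calls use strides=2, as in the code above. The helper names are hypothetical.

```python
import math


def same_padding_output(size, stride):
    """Output length of a 'same'-padded convolution or pooling layer."""
    return math.ceil(size / stride)


def trace_spatial_size(input_size=224, num_units=6):
    """List the spatial size after conv1, maxpool1 and each stride-2 shuffle unit."""
    sizes = [("input", input_size)]
    size = same_padding_output(input_size, 2)  # conv1, strides=2
    sizes.append(("conv1", size))
    size = same_padding_output(size, 2)        # maxpool1, strides=2
    sizes.append(("maxpool1", size))
    for i in range(1, num_units + 1):
        size = same_padding_output(size, 2)    # each shuffle_unit call uses strides=2
        sizes.append((f"unit{i}", size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_spatial_size(224):
        print(f"{name}: {size} x {size}")
```

For a 224 x 224 input the sizes are 224, 112, 56, 28, 14, 7, 4, 2 and 1, so the feature map is already 1 x 1 when it reaches conv_last and the global average pooling layer. Setting strides=1 for some of the units would keep more spatial resolution; that is a design choice left to the reader.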
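III. Summary
As this article shows, ShuffleNet is a lightweight convolutional neural network architecture with low latency and high accuracy. Point-wise group convolution reduces parameters and computation, and channel shuffle keeps information flowing between the groups so that accuracy holds up. The Keras implementation relies on the TensorFlow backend and is built mainly from convolution layers and BatchNormalization layers.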