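[Neural Networks] (6) Convolutional Neural Network (VGG16), Case Study: 4-Class Bird Image Classification

Hello everyone. Today I'd like to share the VGG16 convolutional neural network model in TensorFlow 2.0. The case study: we have 200 images each of four bird species, and we build a convolutional neural network to predict which class an image belongs to.

1. Data loading

Store the bird images in one folder per class, then read them in batches with tf.keras.preprocessing.image_dataset_from_directory(). All images are resized to 224*224 on load. The label_mode parameter controls the label format: 'int' means the target y is an integer class index (0, 1, 2, 3, ...); 'categorical' means one-hot encoding, where the element at the class index is 1 (for example, with four classes an image of the second class is encoded as 0,1,0,0); 'binary' is for two-class problems.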

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, optimizers, layers, Model

# Load the training, validation, and test sets, each from its own folder
def get_data(height, width, batchsz):
    # Training set
    filepath1 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/train'
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath1,                     # one sub-folder per class
        label_mode = 'categorical',    # one-hot labels
        image_size = (height, width),  # resize every image on load
        batch_size = batchsz,
    )

    # Validation set
    filepath2 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/val'
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath2,
        label_mode = 'categorical',
        image_size = (height, width),
        batch_size = batchsz,
    )

    # Test set
    filepath3 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/test'
    test_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath3,
        label_mode = 'categorical',
        image_size = (height, width),
        batch_size = batchsz,
    )

    return train_ds, val_ds, test_ds

train_ds, val_ds, test_ds = get_data(height=224, width=224, batchsz=32)

# Class names are taken from the sub-folder names
class_names = train_ds.class_names
print('classes:', class_names)

# Take one batch to inspect the shapes and labels
sample = next(iter(train_ds))
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:',sample[1].shape)
print('y[:5]:', sample[1][:5])
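To make the difference between the label_mode options concrete, here is a minimal NumPy sketch of how the same class indices look as 'int' labels and as 'categorical' (one-hot) labels. The label values are made up for illustration, and NumPy only stands in for what the loader returns; this is not how the loader is implemented.

```python
import numpy as np

num_classes = 4  # four bird classes, as in this case study

# 'int' style labels: one class index per image (hypothetical example values)
int_labels = np.array([0, 1, 3, 2])

# 'categorical' style labels: one row per image, with a 1 at the class index
onehot_labels = np.eye(num_classes, dtype=np.float32)[int_labels]

# argmax along the class axis recovers the integer labels
recovered = onehot_labels.argmax(axis=1)

print(onehot_labels)
print(recovered)
```

With 'categorical' labels, y_batch.shape is (batch, 4) rather than (batch,), which matters later when choosing the loss function and when computing accuracy.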

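2. Data preprocessing

Use .map() to convert the dtype of every x and y in the dataset and to map each image's pixel values to [-1, 1]. Shuffle the training set with .shuffle(); because the dataset is already batched, this reorders whole batches, and each feature x stays paired with its label y. iter() creates an iterator, and next() on it returns one batch from the training set.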

import matplotlib.pyplot as plt

# Show the first 15 images of the batch loaded earlier (raw pixels are in [0, 255])
for i in range(15):
    plt.subplot(3,5,i+1)
    plt.imshow(sample[0][i]/255.0)  # scale to [0, 1] for display
    plt.xticks([])
    plt.yticks([])
plt.show()

# Convert dtypes and map pixel values from [0, 255] to [-1, 1]
def processing(x, y):
    x = 2 * tf.cast(x, dtype=tf.float32)/255.0 - 1
    y = tf.cast(y, dtype=tf.int32)
    return x,y

train_ds = train_ds.map(processing).shuffle(10000)
val_ds = val_ds.map(processing)
test_ds = test_ds.map(processing)

# Take one preprocessed batch to check the result
sample = next(iter(train_ds))
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:',sample[1].shape)
print('y[:5]:', sample[1][:5])

Figure: sample bird images from the training set.
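The pixel mapping used in this section can be checked on a few values. The sketch below is a NumPy stand-in for the same arithmetic; the helper names to_model_range and to_display_range are hypothetical and only illustrate the mapping and its inverse.

```python
import numpy as np

def to_model_range(x):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return 2.0 * x.astype(np.float32) / 255.0 - 1.0

def to_display_range(x):
    """Invert the mapping: [-1, 1] back to [0, 1] for plotting."""
    return (x + 1.0) / 2.0

pixels = np.array([0, 127.5, 255])     # darkest, middle, and brightest values
scaled = to_model_range(pixels)        # values in [-1, 1]
restored = to_display_range(scaled)    # values in [0, 1]

print(scaled)
print(restored)
```

The inverse matters if you want to plot images after preprocessing: plt.imshow expects float images in [0, 1], so values in [-1, 1] should be mapped back first.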

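3. Building the VGG16 network

The VGG16 architecture is shown in the figure below; for the underlying theory, see the article 深度学习-VGG16原理详解 (Deep Learning: VGG16 Explained in Detail).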

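1) The input image is 224x224x3. It passes through two convolution layers, each with 64 kernels of size 3x3 (the first layer's kernels have 3 input channels), stride 1, padding=same, and ReLU activation. The output size is 224x224x64.

2) Max pooling with a 2x2 window and stride 2 halves the spatial size, giving 112x112x64.

3) Two convolutions with 128 3x3 kernels and ReLU activation give 112x112x128.

4) Max pooling gives 56x56x128.

5) Three convolutions with 256 3x3 kernels and ReLU activation give 56x56x256.

6) Max pooling gives 28x28x256.

7) Three convolutions with 512 3x3 kernels and ReLU activation give 28x28x512.

8) Max pooling gives 14x14x512.

9) Three convolutions with 512 3x3 kernels and ReLU activation give 14x14x512.

10) Max pooling gives 7x7x512.

11) Flatten() then unrolls the data into a one-dimensional vector of length 7x7x512 = 25088.

12) Next come three fully connected layers: two with 4096 units, each followed by ReLU activation, and one with 1000 units.

13) Finally, softmax outputs the predictions for the 1000 classes.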
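The sizes in the steps above can be reproduced with a few lines of plain Python. This is only a bookkeeping sketch; the blocks list and the vgg16_shapes helper are illustrative names, not part of any library.

```python
# (number of conv layers, output channels) for the five VGG16 blocks
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def vgg16_shapes(height=224, width=224):
    """Return the feature-map shape after each block's pooling layer."""
    shapes = []
    for _, channels in blocks:
        # 3x3 convolutions with stride 1 and padding='same' keep height and width;
        # 2x2 max pooling with stride 2 halves them
        height, width = height // 2, width // 2
        shapes.append((height, width, channels))
    return shapes

shapes = vgg16_shapes()
h, w, c = shapes[-1]
flat_size = h * w * c  # length of the vector produced by Flatten()

print(shapes)
print(flat_size)
```

Five poolings reduce 224 to 224 / 2^5 = 7, which is where the 7x7x512 = 25088 input size of the first fully connected layer comes from.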
 

Now let's implement this in code. Since we need 4-class classification here, we simply change the final 1000 outputs to 4.

def VGG16(input_shape=(224,224,3), output_shape=4):
    input_tensor = keras.Input(shape=input_shape)

    # Block 1: two 64-channel convolutions, then pooling -> 112x112x64
    x = layers.Conv2D(64, (3,3), activation='relu', strides=1, padding='same')(input_tensor)
    x = layers.Conv2D(64, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.MaxPool2D(pool_size=(2,2), strides=(2,2))(x)

    # Block 2: two 128-channel convolutions, then pooling -> 56x56x128
    x = layers.Conv2D(128, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(128, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.MaxPool2D(pool_size=(2,2), strides=(2,2))(x)

    # Block 3: three 256-channel convolutions, then pooling -> 28x28x256
    x = layers.Conv2D(256, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(256, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(256, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.MaxPool2D(pool_size=(2,2), strides=(2,2))(x)

    # Block 4: three 512-channel convolutions, then pooling -> 14x14x512
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.MaxPool2D(pool_size=(2,2), strides=(2,2))(x)

    # Block 5: three 512-channel convolutions, then pooling -> 7x7x512
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.Conv2D(512, (3,3), activation='relu', strides=1, padding='same')(x)
    x = layers.MaxPool2D(pool_size=(2,2), strides=(2,2))(x)

    # Classifier: flatten, then two 4096-unit layers and the output layer
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dense(4096, activation='relu')(x)
    # No softmax here: the layer outputs logits
    output_tensor = layers.Dense(output_shape)(x)

    model = Model(inputs=input_tensor, outputs=output_tensor)
    return model

model = VGG16()
model.summary()

The network architecture is as follows:

Model: "model"
_________________________________________________________________
 Layer (type)                   Output Shape            Param #
=================================================================
 input_1 (InputLayer)           [(None, 224, 224, 3)]   0
 conv2d (Conv2D)                (None, 224, 224, 64)    1792
 conv2d_1 (Conv2D)              (None, 224, 224, 64)    36928
 max_pooling2d (MaxPooling2D)   (None, 112, 112, 64)    0
 conv2d_2 (Conv2D)              (None, 112, 112, 128)   73856
 conv2d_3 (Conv2D)              (None, 112, 112, 128)   147584
 max_pooling2d_1 (MaxPooling2D) (None, 56, 56, 128)     0
 conv2d_4 (Conv2D)              (None, 56, 56, 256)     295168
 conv2d_5 (Conv2D)              (None, 56, 56, 256)     590080
 conv2d_6 (Conv2D)              (None, 56, 56, 256)     590080
 max_pooling2d_2 (MaxPooling2D) (None, 28, 28, 256)     0
 conv2d_7 (Conv2D)              (None, 28, 28, 512)     1180160
 conv2d_8 (Conv2D)              (None, 28, 28, 512)     2359808
 conv2d_9 (Conv2D)              (None, 28, 28, 512)     2359808
 max_pooling2d_3 (MaxPooling2D) (None, 14, 14, 512)     0
 conv2d_10 (Conv2D)             (None, 14, 14, 512)     2359808
 conv2d_11 (Conv2D)             (None, 14, 14, 512)     2359808
 conv2d_12 (Conv2D)             (None, 14, 14, 512)     2359808
 max_pooling2d_4 (MaxPooling2D) (None, 7, 7, 512)       0
 flatten (Flatten)              (None, 25088)           0
 dense (Dense)                  (None, 4096)            102764544
 dense_1 (Dense)                (None, 4096)            16781312
 dense_2 (Dense)                (None, 4)               16388
=================================================================
Total params: 134,276,932
Trainable params: 134,276,932
Non-trainable params: 0
_________________________________________________________________
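The parameter counts in the summary can be verified by hand. The sketch below uses the standard formulas (kernel height x kernel width x input channels x output channels, plus one bias per output channel, for a convolution; inputs x outputs plus one bias per output for a dense layer). The helper names are illustrative, not library functions.

```python
def conv_params(k, c_in, c_out):
    """Weights of a k x k convolution plus one bias per output channel."""
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights of a fully connected layer plus one bias per output unit."""
    return n_in * n_out + n_out

# Output channels of the 13 convolution layers, in order
conv_channels = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

total = 0
c_in = 3  # RGB input
for c_out in conv_channels:
    total += conv_params(3, c_in, c_out)
    c_in = c_out
conv_total = total

# Three dense layers: 25088 -> 4096 -> 4096 -> 4 (four bird classes)
for n_in, n_out in [(7 * 7 * 512, 4096), (4096, 4096), (4096, 4)]:
    total += dense_params(n_in, n_out)

print(conv_total, total)
```

Most of the 134,276,932 parameters sit in the first dense layer (102,764,544); the 13 convolution layers together account for 14,714,688.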

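4. Compiling the network

When compiling the network with .compile(), the loss is set to cross-entropy with from_logits=True. The output layer does not apply softmax to turn its real-valued outputs into probabilities, so with this parameter set to True the loss function itself converts the logits to probabilities before computing the loss against the true labels. The true labels y here are one-hot encoded.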

# Adam optimizer with a small learning rate
opt = optimizers.Adam(learning_rate=1e-4)

# The model outputs logits, so the loss applies softmax internally
model.compile(optimizer=opt,
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Train for 30 epochs and validate after each epoch
history = model.fit(train_ds, validation_data=val_ds, epochs=30)

# Per-epoch metrics recorded during training
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
train_loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(len(train_acc))

plt.figure(figsize=(10,5))

# Left panel: accuracy curves
plt.subplot(1,2,1)
plt.plot(epochs_range, train_acc, label='train_acc')
plt.plot(epochs_range, val_acc, label='val_acc')
plt.legend()

# Right panel: loss curves
plt.subplot(1,2,2)
plt.plot(epochs_range, train_loss, label='train_loss')
plt.plot(epochs_range, val_loss, label='val_loss')
plt.legend()
plt.show()
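What from_logits=True amounts to can be checked numerically. The NumPy sketch below compares "softmax first, then cross-entropy" with a direct log-softmax computation on made-up logits. It illustrates the math only and is not Keras's actual implementation; the function names are illustrative.

```python
import numpy as np

def softmax(logits):
    """Convert each row of logits to probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_from_probs(y_onehot, probs):
    """Cross-entropy when the predictions are already probabilities."""
    return -(y_onehot * np.log(probs)).sum(axis=1)

def cross_entropy_from_logits(y_onehot, logits):
    """Cross-entropy computed directly from logits via log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(y_onehot * log_probs).sum(axis=1)

# Hypothetical logits for two images and four classes, with one-hot labels
logits = np.array([[2.0, 0.5, -1.0, 0.0],
                   [0.1, 0.2, 3.0, -2.0]])
y = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

two_step = cross_entropy_from_probs(y, softmax(logits))
one_step = cross_entropy_from_logits(y, logits)

print(two_step)
print(one_step)
```

Both routes give the same per-sample loss. Working from logits avoids taking the log of a probability that has already been rounded toward 0, which is one reason to leave softmax out of the output layer.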

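5. Results

As the figure shows, the network predicts well: after about 25 epochs its accuracy reaches roughly 99%. When training for many epochs, consider using early stopping during training (in Keras, a callback passed to .fit()) to keep the best weights; if the network stops improving, training can end early and save time.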
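As a minimal sketch of the early stopping mentioned above, the example below attaches keras.callbacks.EarlyStopping to a tiny model trained on random data. The toy data, the toy_model, and the patience value are assumptions chosen only to keep the example small and runnable; they are not the settings used in this article.

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy data: 200 samples, 8 features, 4 one-hot encoded classes
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 8)).astype("float32")
y = keras.utils.to_categorical(rng.integers(0, 4, size=200), num_classes=4)

# A tiny stand-in model that outputs logits, like the VGG16 model in this article
toy_model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(4),
])
toy_model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-2),
                  loss=keras.losses.CategoricalCrossentropy(from_logits=True),
                  metrics=["accuracy"])

# Stop when val_loss has not improved for 3 epochs and restore the best weights
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

history = toy_model.fit(x, y, validation_split=0.25, epochs=50,
                        callbacks=[early_stop], verbose=0)
epochs_run = len(history.history["val_loss"])
print("epochs run:", epochs_run)
```

For the bird classifier, the same callback would be passed as callbacks=[early_stop] in the model.fit(train_ds, validation_data=val_ds, epochs=30) call.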

The loss and accuracy during training are as follows:

Epoch 1/30

13/13 [==============================] - 7s 293ms/step - loss: 1.3627 - accuracy: 0.3116 - val_loss: 1.3483 - val_accuracy: 0.5075

Epoch 2/30

13/13 [==============================] - 3s 173ms/step - loss: 1.1267 - accuracy: 0.5251 - val_loss: 1.0235 - val_accuracy: 0.5226

------------------------------------------------------------------------------------------

N lines omitted

------------------------------------------------------------------------------------------

Epoch 26/30

13/13 [==============================] - 2s 174ms/step - loss: 0.1184 - accuracy: 0.9874 - val_loss: 0.1093 - val_accuracy: 0.9774

Epoch 27/30

13/13 [==============================] - 2s 174ms/step - loss: 0.3208 - accuracy: 0.9196 - val_loss: 0.2678 - val_accuracy: 0.9347

Epoch 28/30

13/13 [==============================] - 2s 172ms/step - loss: 0.2366 - accuracy: 0.9322 - val_loss: 0.1247 - val_accuracy: 0.9648

Epoch 29/30

13/13 [==============================] - 3s 173ms/step - loss: 0.1027 - accuracy: 0.9648 - val_loss: 0.0453 - val_accuracy: 0.9849

Epoch 30/30

13/13 [==============================] - 3s 171ms/step - loss: 0.0491 - accuracy: 0.9849 - val_loss: 0.0250 - val_accuracy: 0.9925

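6. Alternative approach

For more flexible control over how the loss and accuracy are computed, you can skip .compile() and .fit(). Once the model is built, write the forward pass and the gradient update yourself; this trains the model just as well. The code below can replace the .compile() and .fit() calls in Section 4.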

optimizer = optimizers.Adam(learning_rate=1e-5)

# Per-epoch metrics
train_acc = []
train_loss = []
val_acc = []
val_loss = []

for epochs in range(30):
    # Running totals for this epoch
    train_total_sum=0
    train_total_loss=0
    train_total_correct=0
    val_total_sum=0
    val_total_loss=0
    val_total_correct=0

    # Training loop
    for step, (x,y) in enumerate(train_ds):
        with tf.GradientTape() as tape:
            logits = model(x, training=True)
            # y is already one-hot (label_mode='categorical'), so tf.one_hot is not needed
            loss = tf.losses.categorical_crossentropy(tf.cast(y, tf.float32), logits, from_logits=True)
            loss_avg = tf.reduce_mean(loss)  # scalar mean loss of the batch

        # Backpropagate and update the weights
        grads = tape.gradient(loss_avg, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        # Accuracy: compare predicted class indices with the label indices
        prob = tf.nn.softmax(logits, axis=1)
        predict = tf.argmax(prob, axis=1, output_type=tf.int32)
        label = tf.argmax(y, axis=1, output_type=tf.int32)
        correct = int(tf.reduce_sum(tf.cast(tf.equal(predict, label), dtype=tf.int32)))
        total = x.shape[0]
        train_total_sum += total
        train_total_correct += correct
        train_total_loss += float(tf.reduce_sum(loss))
        acc = correct/total

        if step%20 == 0:
            print('train', 'step:', step, 'loss:', float(loss_avg), 'acc:', acc)

    # Record the epoch-level training metrics
    train_acc.append(train_total_correct/train_total_sum)
    train_loss.append(train_total_loss/train_total_sum)

    # Validation loop (forward pass only, no weight updates)
    for step, (x, y) in enumerate(val_ds):
        logits = model(x, training=False)
        loss = tf.losses.categorical_crossentropy(tf.cast(y, tf.float32), logits, from_logits=True)
        loss_avg = tf.reduce_mean(loss)

        prob = tf.nn.softmax(logits, axis=1)
        predict = tf.argmax(prob, axis=1, output_type=tf.int32)
        label = tf.argmax(y, axis=1, output_type=tf.int32)
        correct = int(tf.reduce_sum(tf.cast(tf.equal(predict, label), dtype=tf.int32)))
        total = x.shape[0]
        val_total_sum += total
        val_total_correct += correct
        val_total_loss += float(tf.reduce_sum(loss))
        acc = correct/total

        if step%10 == 0:
            print('val', 'step:', step, 'loss:', float(loss_avg), 'acc:', acc)

    # Record the epoch-level validation metrics
    val_acc.append(val_total_correct/val_total_sum)
    val_loss.append(val_total_loss/val_total_sum)
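Because the labels in this case study are one-hot encoded (label_mode='categorical'), counting correct predictions in a hand-written loop means comparing class indices on both sides. Here is a minimal NumPy sketch of that bookkeeping on made-up values; batch_accuracy is a hypothetical helper, not a TensorFlow function.

```python
import numpy as np

def batch_accuracy(logits, y_onehot):
    """Return (number correct, accuracy) for one batch of one-hot labels."""
    predicted = logits.argmax(axis=1)   # predicted class index per sample
    actual = y_onehot.argmax(axis=1)    # true class index per sample
    correct = int((predicted == actual).sum())
    return correct, correct / len(actual)

# Hypothetical logits for three images and four classes
logits = np.array([[2.0, 0.1, -1.0, 0.3],
                   [0.2, 0.1, 0.0, 1.5],
                   [0.0, 3.0, 0.5, 0.2]])
y = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 1, 0, 0]])

correct, acc = batch_accuracy(logits, y)
print(correct, acc)
```

Taking argmax of the logits gives the same class as taking argmax of the softmax probabilities, since softmax preserves the ordering of the values in each row.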

Source: https://m.mcbbbk.com/newsview689893.html
