[Neural Networks] (6) Convolutional Neural Network (VGG16), Case Study: 4-Class Bird Image Classification

萌宠菠菠乐园
2024-12-04 19:25

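Hello everyone. Today I'd like to share the VGG16 convolutional neural network model in TensorFlow 2.0. The case study: we have 200 images for each of four bird species, and we build a convolutional neural network to predict which class an image belongs to.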

1. Loading the data

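Store the bird images in one folder per class, then use tf.keras.preprocessing.image_dataset_from_directory() to read the images in batches. Every image is resized to 224*224 as it is loaded. The label_mode parameter sets the label format: 'int' makes the target y an integer, i.e. 0, 1, 2, 3; 'categorical' produces one-hot labels, where the value at the class index is 1, so with four classes an image in the second class is encoded as 0,1,0,0; 'binary' is for two-class problems.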
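To make the difference between 'int' and 'categorical' labels concrete, here is a minimal sketch using NumPy only. The label values are made up for illustration, and this mimics the encoding rather than calling the Keras loader itself.

```python
import numpy as np

# Hypothetical integer labels for a batch of five images (4 classes: 0..3).
int_labels = np.array([1, 0, 3, 2, 1])
num_classes = 4

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the integer labels yields the 'categorical' form.
one_hot_labels = np.eye(num_classes, dtype=np.float32)[int_labels]

print(one_hot_labels.shape)   # one row per image, one column per class
print(one_hot_labels[0])      # label 1, i.e. the second class

# argmax converts a one-hot row back to its integer label.
recovered = one_hot_labels.argmax(axis=1)
print(recovered)
```

With one-hot labels the usual loss is categorical cross-entropy; with integer labels the sparse variant is used instead.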

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, optimizers, layers, Model

def get_data(height, width, batchsz):
    # Training set
    filepath1 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/train'
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath1,                     # folder with one subfolder per class
        label_mode = 'categorical',    # one-hot labels
        image_size = (height, width),  # resize images on load
        batch_size = batchsz,          # images per batch
    )

    # Validation set
    filepath2 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/val'
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath2,
        label_mode = 'categorical',
        image_size = (height, width),
        batch_size = batchsz,
    )

    # Test set
    filepath3 = 'C:/Users/admin/.spyder-py3/test/数据集/4种鸟分类/new_data/test'
    test_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath3,
        label_mode = 'categorical',
        image_size = (height, width),
        batch_size = batchsz,
    )

    return train_ds, val_ds, test_ds

# Load the three datasets
train_ds, val_ds, test_ds = get_data(height=224, width=224, batchsz=32)

# Class names, taken from the subfolder names
class_names = train_ds.class_names
print('Classes:', class_names)

# Inspect one batch
sample = next(iter(train_ds))
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:',sample[1].shape)
print('y[:5]:', sample[1][:5])
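As a rough illustration of where class_names comes from: with inferred labels, image_dataset_from_directory() takes the subfolder names in alphanumeric order, and the position in that list becomes the label index. The sketch below imitates that with the standard library only. The folder names are hypothetical, since the post does not list the real ones.

```python
import os
import tempfile

# Hypothetical class folder names (assumption: not the real dataset's names).
hypothetical_classes = ['sparrow', 'eagle', 'owl', 'parrot']

with tempfile.TemporaryDirectory() as root:
    # One subfolder per class, as the loader expects.
    for name in hypothetical_classes:
        os.makedirs(os.path.join(root, name))

    # Subfolder names in sorted order; index in this list = class label.
    class_names = sorted(
        d for d in os.listdir(root)
        if os.path.isdir(os.path.join(root, d))
    )

label_index = {name: i for i, name in enumerate(class_names)}
print(class_names)
print(label_index)
```

This is why the train, val, and test folders should contain the same set of class subfolders: otherwise the same index could refer to different species in different splits.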

2. Data preprocessing

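Use .map() to convert the types of every x and y in the dataset and to map each image's pixel values into [-1, 1]. .shuffle() randomizes the order of the training data without changing the correspondence between the features x and the labels y. iter() creates an iterator, and calling next() on it pulls one batch from the training set.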
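The pixel mapping is the linear transform x' = 2x/255 - 1. A small NumPy sketch of it follows. The helper names scale_to_unit_range and restore_pixels are made up for this illustration.

```python
import numpy as np

def scale_to_unit_range(x):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return 2 * x.astype(np.float32) / 255.0 - 1

def restore_pixels(x):
    """Inverse transform: map [-1, 1] back to [0, 255], e.g. for plotting."""
    return (x + 1) / 2 * 255.0

# The darkest, middle, and brightest possible pixel values.
pixels = np.array([0, 127.5, 255])
scaled = scale_to_unit_range(pixels)
restored = restore_pixels(scaled)

print(scaled)
print(restored)
```

The inverse is handy when plotting images after preprocessing, because plt.imshow() expects floats in [0, 1] or integers in [0, 255].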

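import matplotlib.pyplot as plt

# Show the first 15 images of the batch loaded earlier
for i in range(15):
    plt.subplot(3,5,i+1)
    plt.imshow(sample[0][i]/255.0)  # scale to [0,1] for display
    plt.xticks([])  # hide axis ticks
    plt.yticks([])
plt.show()

# Preprocessing function
def processing(x, y):
    x = 2 * tf.cast(x, dtype=tf.float32)/255.0 - 1  # map pixels to [-1,1]
    y = tf.cast(y, dtype=tf.int32)  # convert label type
    return x,y

# Apply to every dataset; shuffle only the training set
train_ds = train_ds.map(processing).shuffle(10000)
val_ds = val_ds.map(processing)
test_ds = test_ds.map(processing)

# Take one batch from the training set and check it
sample = next(iter(train_ds))
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:',sample[1].shape)
print('y[:5]:', sample[1][:5])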
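The claim that shuffling keeps each x paired with its y can be pictured with plain Python. This is only an analogy for what tf.data does, not its implementation, and the sample names are invented.

```python
import random

# Hypothetical features and their labels, aligned by position.
features = ['img_a', 'img_b', 'img_c', 'img_d']
labels = [0, 1, 2, 3]
original = dict(zip(features, labels))

# Each dataset element is an (x, y) pair, so shuffling moves pairs as a unit.
pairs = list(zip(features, labels))
rng = random.Random(0)  # fixed seed so the run is repeatable
rng.shuffle(pairs)

# iter() builds an iterator; each next() call yields the following element.
iterator = iter(pairs)
first = next(iterator)
second = next(iterator)

print(pairs)
print(first, second)
```

Note that in the code above .shuffle() is applied to a dataset that is already batched, so it reorders whole batches rather than individual images.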

[Figure: sample bird images from the dataset]

3. Building the VGG16 network

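The VGG16 architecture is shown in the figure below. For the underlying theory, see the article 深度学习-VGG16原理详解 (Deep Learning: VGG16 Explained).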

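1) The input image is 224x224x3. It passes through two convolution layers, each with 64 kernels of size 3x3 (each kernel in the first layer spans the 3 input channels), stride 1, padding=same, and ReLU activation. The output size is 224x224x64.

2) Max pooling with a 2x2 window and stride 2 halves the spatial size, giving 112x112x64.

3) Two convolutions with 128 3x3 kernels, ReLU activation. The size becomes 112x112x128.

4) Max pooling. The size becomes 56x56x128.

5) Three convolutions with 256 3x3 kernels, ReLU activation. The size becomes 56x56x256.

6) Max pooling. The size becomes 28x28x256.

7) Three convolutions with 512 3x3 kernels, ReLU activation. The size becomes 28x28x512.

8) Max pooling. The size becomes 14x14x512.

9) Three convolutions with 512 3x3 kernels, ReLU activation. The size stays 14x14x512.

10) Max pooling. The size becomes 7x7x512.

11) Flatten() then unrolls the data into a one-dimensional vector of length 512x7x7 = 25088.

12) Next come three fully connected layers: two of size 1x1x4096, each with ReLU activation, and one of size 1x1x1000.

13) Finally, softmax turns the 1000 outputs into the predicted class probabilities.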
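The sizes in the steps above can be checked with a few lines of arithmetic. The sketch below is plain Python, not Keras. The helpers conv_same and max_pool are made up for this illustration and only track shapes.

```python
import math

def conv_same(shape, filters, stride=1):
    """Output shape of a conv layer with padding='same': ceil(size / stride)."""
    h, w, _ = shape
    return (math.ceil(h / stride), math.ceil(w / stride), filters)

def max_pool(shape, pool=2, stride=2):
    """Output shape of max pooling without padding."""
    h, w, c = shape
    return ((h - pool) // stride + 1, (w - pool) // stride + 1, c)

# (number of filters, number of conv layers) for the five VGG16 blocks.
blocks = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]

shape = (224, 224, 3)
trace = []
for filters, n_convs in blocks:
    for _ in range(n_convs):
        shape = conv_same(shape, filters)
    conv_shape = shape
    shape = max_pool(shape)
    trace.append((conv_shape, shape))
    print('after convs:', conv_shape, 'after pooling:', shape)

# Length of the vector produced by Flatten().
flat = shape[0] * shape[1] * shape[2]
print('flattened length:', flat)
```

Because every convolution uses stride 1 with same padding, only the pooling layers change the spatial size, halving it five times from 224 to 7.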

Now let's implement this in code. We need a 4-class classifier here, so we simply change the final 1000 outputs to 4.
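Before the implementation, a quick back-of-the-envelope check of what changing the output size does to the fully connected layers. A dense layer has inputs x units weights plus one bias per unit. The helper dense_params is invented for this sketch.

```python
def dense_params(inputs, units):
    """Trainable parameters of a fully connected layer: weights + biases."""
    return inputs * units + units

# First fully connected layer: flattened 7x7x512 features -> 4096 units.
fc1 = dense_params(7 * 7 * 512, 4096)

# Second fully connected layer: 4096 -> 4096.
fc2 = dense_params(4096, 4096)

# Output layer: original 1000 classes versus the 4 bird classes.
original_head = dense_params(4096, 1000)
four_class_head = dense_params(4096, 4)

print('fc1:', fc1)
print('fc2:', fc2)
print('1000-class output layer:', original_head)
print('4-class output layer:', four_class_head)
```

The output layer shrinks a lot, but most parameters sit in the first fully connected layer, which is unaffected by the number of classes.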

def VGG16(input_shape=(224,224,3), output_shape=4):
    # Input layer
    input_tensor = keras.Input(shape=input_shape)
    # Block 1: two 3x3 convolutions with 64 kernels each
    x = layers.Conv2D(64, (3,3), activation='relu', strides=1, padding='same')(input_tensor)
    x = layers.Conv2D(64, (3,3), activation='relu' , strides=1, padding='same')(x)