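What is a pretrained network?
A common and highly effective approach to deep learning on small image datasets is to use a pretrained network. A pretrained network is simply a saved network that was previously trained on a large dataset, typically for a large-scale image classification task. If the original dataset is large enough and general enough, the spatial hierarchy of features learned during pretraining can be reused effectively in our own model. These features are useful for many computer vision problems, even when the new problem involves completely different classes from the original task.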
Pretrained networks built into Keras
The Keras library includes classic model architectures such as:
- VGG16 and VGG19
- ResNet50
- Inception v3
- Xception
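ImageNet is a database of images hand-labeled with categories, built for computer vision research; it currently covers more than 20,000 categories (about 22,000). The ImageNet project is a large visual database designed for research on visual object recognition software. More than 14 million image URLs have been hand-annotated by ImageNet to indicate which objects are pictured, and bounding boxes are provided for at least one million of the images. A typical category, such as "balloon" or "strawberry", contains several hundred images. The database of annotations for third-party image URLs is freely available directly from ImageNet, but the images themselves are not owned by ImageNet.
Since 2010, the ImageNet project has run an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of 1,000 non-overlapping classes. The 2012 challenge brought a major breakthrough that is widely regarded as the start of the deep learning revolution of the 2010s.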
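VGG16 and VGG19
The VGG architecture was proposed by Simonyan and Zisserman in 2014, in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".
The VGG architecture is simple and effective. The early layers use only 3×3 convolution kernels, stacked to increase the depth of the network. Max pooling progressively shrinks the feature maps, and with them the number of activations per layer. The last three layers are two fully connected layers with 4,096 units each, followed by a softmax layer.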
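The shrinking effect of max pooling is easy to check by hand. The sketch below is only an illustration: the helper name `vgg16_feature_shapes` is hypothetical, and it assumes a 224×224 input, 3×3 convolutions that preserve spatial size, and 2×2 pooling with stride 2 after each of the five blocks.

```python
# Walk a 224x224 input through the five VGG16 blocks.
# Assumed block widths (output channels): 64, 128, 256, 512, 512.
def vgg16_feature_shapes(size=224, widths=(64, 128, 256, 512, 512)):
    shapes = []
    for channels in widths:
        size //= 2  # 2x2 max pooling with stride 2 halves height and width
        shapes.append((size, size, channels))
    return shapes

shapes = vgg16_feature_shapes()
for block, shape in enumerate(shapes, start=1):
    print("block{}_pool: {}".format(block, shape))

# Number of values fed into the first fully connected layer
h, w, c = shapes[-1]
flatten_units = h * w * c
print("flatten:", flatten_units)
```

Each pooling step halves the height and width, so five blocks take the input from 224×224 down to 7×7 before it is flattened.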
Let's look at the VGG16 architecture:
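import keras
# The string 'False' is truthy, so it loads the full network with its dense head; written as True to match the summary below.
covn_base = keras.applications.VGG16(weights='imagenet', include_top=True)
covn_base.summary()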
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
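The parameter counts in the summary can be reproduced with a little arithmetic. The sketch below assumes the standard formulas (a convolution has one k×k×c_in filter plus one bias per output channel; a dense layer has one weight per input plus one bias per unit). The helper names `conv2d_params` and `dense_params` are hypothetical, introduced only for this illustration.

```python
def conv2d_params(kernel, c_in, c_out):
    # one kernel x kernel x c_in filter plus one bias, per output channel
    return (kernel * kernel * c_in + 1) * c_out

def dense_params(n_in, n_out):
    # one weight per input plus one bias, per output unit
    return (n_in + 1) * n_out

# (number of conv layers, output channels) for each of the five blocks
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

conv_total = 0
c_in = 3  # RGB input
for n_layers, c_out in blocks:
    for _ in range(n_layers):
        conv_total += conv2d_params(3, c_in, c_out)
        c_in = c_out

# flatten (7*7*512) -> fc1 -> fc2 -> predictions
fc_total = (dense_params(7 * 7 * 512, 4096)
            + dense_params(4096, 4096)
            + dense_params(4096, 1000))
total = conv_total + fc_total

print("convolutional base:", conv_total)
print("dense head:", fc_total)
print("total:", total)
```

The split shows why the head matters for model size: the three dense layers hold far more parameters than all thirteen convolutional layers combined.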
Using VGG16 on the cats vs. dogs dataset
import os
import shutil

# Target layout: cat_dog/{train,test}/{dog,cat}
base_dir = './猫狗数据集/cat_dog'
train_dir = os.path.join(base_dir, 'train')
train_dir_dog = os.path.join(train_dir, 'dog')
train_dir_cat = os.path.join(train_dir, 'cat')
test_dir = os.path.join(base_dir, 'test')
test_dir_dog = os.path.join(test_dir, 'dog')
test_dir_cat = os.path.join(test_dir, 'cat')

# makedirs also creates the parent folders; exist_ok makes this safe to re-run
for path in (train_dir_dog, train_dir_cat, test_dir_dog, test_dir_cat):
    os.makedirs(path, exist_ok=True)

# Source folder holding the original files, named cat.N.jpg and dog.N.jpg
dc_dir = './猫狗数据集/dc/train/'

# Cats 0-999 go to the training set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    s = os.path.join(dc_dir, fname)
    d = os.path.join(train_dir_cat, fname)
    shutil.copyfile(s, d)

# Cats 1000-1499 go to the test set
fnames = ['cat.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    s = os.path.join(dc_dir, fname)
    d = os.path.join(test_dir_cat, fname)
    shutil.copyfile(s, d)

# Dogs 0-999 go to the training set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000)]
for fname in fnames:
    s = os.path.join(dc_dir, fname)
    d = os.path.join(train_dir_dog, fname)
    shutil.copyfile(s, d)

# Dogs 1000-1499 go to the test set
fnames = ['dog.{}.jpg'.format(i) for i in range(1000, 1500)]
for fname in fnames:
    s = os.path.join(dc_dir, fname)
    d = os.path.join(test_dir_dog, fname)
    shutil.copyfile(s, d)
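The four copy loops above follow one pattern, so they could be folded into a single helper. The sketch below is one possible way to do that, not the author's code: `copy_range` is a hypothetical name, and the demo runs on a temporary folder of empty placeholder files rather than on the real images.

```python
import os
import shutil
import tempfile

def copy_range(src_dir, dst_dir, prefix, start, stop):
    """Copy '<prefix>.<i>.jpg' for i in [start, stop) from src_dir to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    for i in range(start, stop):
        fname = '{}.{}.jpg'.format(prefix, i)
        shutil.copyfile(os.path.join(src_dir, fname),
                        os.path.join(dst_dir, fname))
        copied += 1
    return copied

# Demo on a throwaway directory with empty placeholder files (not real images)
demo_root = tempfile.mkdtemp()
src = os.path.join(demo_root, 'dc')
os.makedirs(src)
for animal in ('cat', 'dog'):
    for i in range(6):
        open(os.path.join(src, '{}.{}.jpg'.format(animal, i)), 'w').close()

# First four files per class for training, the remaining two for testing
for animal in ('cat', 'dog'):
    copy_range(src, os.path.join(demo_root, 'train', animal), animal, 0, 4)
    copy_range(src, os.path.join(demo_root, 'test', animal), animal, 4, 6)

train_cat_files = sorted(os.listdir(os.path.join(demo_root, 'train', 'cat')))
print(train_cat_files)
```

On the real dataset the same helper would be called with the ranges 0 to 1000 for training and 1000 to 1500 for testing.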
import keras
from keras import layers
from keras.preprocessing.image import ImageDataGenerator

# Scale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1/255)
test_datagen = ImageDataGenerator(rescale=1/255)

# Labels are inferred from the sub-directory names (cat, dog)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(224, 224),
                                                    batch_size=20,
                                                    class_mode='binary')
Found 2000 images belonging to 2 classes.
# Use test_datagen here so the test pipeline stays independent of the training one
test_generator = test_datagen.flow_from_directory(test_dir,
                                                  target_size=(224, 224),
                                                  batch_size=20,
                                                  class_mode='binary')
Found 1000 images belonging to 2 classes.
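The image counts reported by the generators determine how many batches make up one full pass over each set. A minimal sketch of that arithmetic, with `steps_for` as a hypothetical helper name:

```python
import math

def steps_for(num_images, batch_size):
    # Round up so a final partial batch is still counted
    return math.ceil(num_images / batch_size)

train_steps = steps_for(2000, 20)  # 2000 training images, batches of 20
test_steps = steps_for(1000, 20)   # 1000 test images, batches of 20
print(train_steps, test_steps)
```

These values are what `steps_per_epoch` and `validation_steps` should be set to when training, so that each epoch sees every image once.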
Using a classic network built into Keras
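# The string 'False' is truthy, so the dense head is loaded anyway; written as True to match the summary below.
# For feature extraction, pass the boolean include_top=False and pool or flatten before the Dense layers.
covn_base = keras.applications.VGG16(weights='imagenet', include_top=True)
covn_base.summary()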
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
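A note on the `include_top` argument: it expects a boolean. Any non-empty string, including `'False'`, is truthy in Python, so passing it as a string behaves like `True` and loads the `flatten`, `fc1`, `fc2` and `predictions` layers that appear at the end of the summary above. The sketch below only illustrates the truthiness rule; `describe` is a hypothetical helper, and it assumes the flag is evaluated with a plain `if include_top:` check.

```python
def describe(include_top):
    # Assumption: the flag is tested for truthiness, as in `if include_top:`
    if include_top:
        return 'full network with classifier head'
    return 'convolutional base only'

results = {repr(value): describe(value) for value in ('False', False, True)}
for value, outcome in results.items():
    print(value, '->', outcome)
```

Only the boolean `False` selects the convolutional base on its own.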
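# Stack a small binary classifier on top of the VGG16 output
# (here that output is the 1000-way prediction vector, as the summary shows)
model = keras.Sequential()
model.add(covn_base)
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))  # one unit: probability of the positive class
model.summary()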
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
vgg16 (Model)                (None, 1000)              138357544
_________________________________________________________________
dense_3 (Dense)              (None, 512)               512512
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 513
=================================================================
Total params: 138,870,569
Trainable params: 138,870,569
Non-trainable params: 0
_________________________________________________________________
# Freeze the pretrained weights so that only the new Dense layers are trained
covn_base.trainable = False
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
vgg16 (Model)                (None, 1000)              138357544
_________________________________________________________________
dense_3 (Dense)              (None, 512)               512512
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 513
=================================================================
Total params: 138,870,569
Trainable params: 513,025
Non-trainable params: 138,357,544
_________________________________________________________________
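The trainable and non-trainable counts in this summary follow directly from the layer sizes. A short check, assuming the usual dense-layer formula (inputs × units + units); `dense_params` is a hypothetical helper name:

```python
def dense_params(n_in, n_out):
    # weights plus one bias per output unit
    return n_in * n_out + n_out

base_params = 138357544          # VGG16 total, taken from the summary
dense_3 = dense_params(1000, 512)  # fed by the 1000-way VGG16 output
dense_4 = dense_params(512, 1)

trainable_after_freeze = dense_3 + dense_4
total_params = base_params + trainable_after_freeze
print(dense_3, dense_4, trainable_after_freeze, total_params)
```

After freezing, only the two new dense layers are updated during training, which is well under 1% of all parameters in the model.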
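# Compile after freezing so that the change to `trainable` takes effect
model.compile(optimizer=keras.optimizers.Adam(lr=0.0001),  # newer Keras versions name this argument learning_rate
              loss='binary_crossentropy',
              metrics=['acc'])
# 100 steps x 20 images = 2000 training images; 50 steps x 20 images = 1000 test images
history = model.fit_generator(train_generator,  # newer Keras versions use model.fit instead
                              epochs=5,
                              steps_per_epoch=100,
                              validation_data=test_generator,
                              validation_steps=50)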