caffe从零开始学习4——cifar-10例程训练及测试

前言

Cifar-10是由Hinton的两个大弟子Alex Krizhevsky和Ilya Sutskever收集的一个用于普通物体识别的数据集。DL的两大核心:数据+模型。

CIFAR-10(DataSet)这个数据集总共包含:60000张图片

1—图片尺寸:32pixel*32pixel

2—图片深度:三通道RGB的彩色图片

2—这60000张图片共分为10类,具体的分类如下图所示:
在这里插入图片描述
60000张图片里面有:
1–50000张训练样本
2–10000张测试样本(验证Set)

备注:
1—CIFAR-10:是一个[普通物体]识别的数据集
2—因此,这个数据集和网络模型的最大特点就是:可以很容易的将[物体识别]迁移到其他普通的物体
3—而且可以将10分类问题扩展至100类物体的分类,甚至1000类和更多类的物体分类。
注意的一点是:

示例中的数据集存在一个L

1—100003072的numpy的数组中------10000张图片每张图片的像素数组

2—单位是uint8s

3—3072存储了一个3232的彩色图片(33232==31024==3072)

4—numpy的前1024位是RGB中的R分量像素值,中间的1024位是G分量的像素值,最后的1024是B分量的像素值

5—最后注意的一点是:
CIFAR-10这个例子只能用于[小图片]的分类,正如前面讲的Mnist示例,主要用于[手写数字的识别一样]。

下载数据

首先,在data/cifar10目录下有个脚本文件:get_cifar10.sh,其源码如下

#!/usr/bin/env sh
# This scripts downloads the CIFAR10 (binary version) data and unzips it.

DIR="$( cd "$(dirname "$0")" ; pwd -P )"
cd "$DIR"

echo "Downloading..."

wget --no-check-certificate http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz

echo "Unzipping..."

tar -xf cifar-10-binary.tar.gz && rm -f cifar-10-binary.tar.gz
mv cifar-10-batches-bin/* . && rm -rf cifar-10-batches-bin

# Creation is split out because leveldb sometimes causes segfault
# and needs to be re-created.

echo "Done."
          

其中–no-check-certificate是使用“不检查证书” 选项。
下载完成如下:
在这里插入图片描述

数据处理

在Caffe根目录examples/cifar10/下有个create_cifar10.sh脚本,是进行数据转换的脚本:

#!/usr/bin/env sh
# This script converts the cifar data into leveldb format.
set -e

EXAMPLE=examples/cifar10
DATA=data/cifar10
DBTYPE=lmdb

echo "Creating $DBTYPE..."

rm -rf $EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/cifar10_test_$DBTYPE

./build/examples/cifar10/convert_cifar_data.bin $DATA $EXAMPLE $DBTYPE

echo "Computing image mean..."

./build/tools/compute_image_mean -backend=$DBTYPE \
  $EXAMPLE/cifar10_train_$DBTYPE $EXAMPLE/mean.binaryproto

echo "Done."

上述脚本中,利用convert_cifar_data.bin可执行程序转换得到LMDB格式数据,并利用compute_image_mean程序生成均值文件。

/examples/cifar10/create_cifar10.sh 

在这里插入图片描述

进行训练

在这里插入图片描述
cifar10文件下有quick类的文件,也有一堆full的文件,主要区别是quick只训练5000个,full是全部都训练,50000个。
由于我是虚拟机里进行训练,很慢,这里我进行quick训练。
如果使用Gpu的话,也可以进行full全部训练。

vim   train_quick.sh
#!/usr/bin/env sh
set -e

TOOLS=./build/tools

$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver.prototxt $@

# reduce learning rate by factor of 10 after 8 epochs
$TOOLS/caffe train \
  --solver=examples/cifar10/cifar10_quick_solver_lr1.prototxt \
  --snapshot=examples/cifar10/cifar10_quick_iter_4000.solverstate $@

继续查看脚本中cifar10_quick_solver.prototxt文件,

# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10

# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001  #4000样本以下学习率是0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU

继续查看脚本中cifar10_quick_solver_lr1.prototxt文件,

# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10

# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.0001  #4000-5000样本学习率是0.0001,降成原来的十分之一了
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 5000
# snapshot intermediate results
snapshot: 5000
snapshot_format: HDF5
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: CPU

继续查看cifar10_quick_train_test.prototxt文件

name: "CIFAR10_quick"
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 64
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3"
  top: "pool3"
  pooling_param {
    pool: AVE
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

上述网络文件的含义可参考之前的文章:传送门
为了进一步分析网络结构,绘制网络图,更方便分析。

python ./python/draw_net.py  ./examples/cifar10/cifar10_quick_train_test.prototxt ./examples/cifar10/cifar10_quick.png --rankdir=LR
eog /examples/cifar10/cifar10_quick.png

如下:
在这里插入图片描述
开始训练,用cpu训练也太慢了吧。。。。。。。

./train_quick.sh 

在这里插入图片描述
从训练结果看,训练5000次后accuracy=0.7595, loss=0.72916。

测试或推理

测试可以使用caffe自带的classification.bin程序进行分类,也可以使用Ppython/classify.py的python程序进行测试
准备测试图片,在CAFFE_ROOT/examples/images下有cat.jpg等图片,也可以自己添加其他类别。

(1)使用classification.bin程序进行分类测试

Usage: ./build/examples/cpp_classification/classification.bin deploy.prototxt network.caffemodel mean.binaryproto labels.txt img.jpg

./build/examples/cpp_classification/classification.bin \
examples/cifar10/cifar10_quick.prototxt \
examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5 \
examples/cifar10/mean.binaryproto \
data/cifar10/batches.meta.txt \
examples/images/cat.jpg

执行报错:输出标签是11,与网络的输出10不一致。
在这里插入图片描述
修改batches.meta.txt 文件,把batches.meta.txt的最后一行空格删除。重新执行ok.
在这里插入图片描述
测试结果竟然是deer概率最大,小猫的概率在0.0261,这准确率也太低了吧,汗!!!!
(2)使用python/classify.py的python程序进行测试
查看classify.py文件源码

#!/usr/bin/env python
"""
classify.py is an out-of-the-box image classifer callable from the command line.

By default it configures and runs the Caffe reference ImageNet model.
"""
import numpy as np
import os
import sys
import argparse
#argparse是python标准库里面用来处理命令行参数的库
#使用步骤:
#(1)import argparse    首先导入模块
#(2)parser = argparse.ArgumentParser()    创建一个解析对象
#(3)parser.add_argument()    向该对象中添加你要关注的命令行参数和选项
#(4)parser.parse_args()    进行解析
import glob
import time

import caffe


def main(argv):
    pycaffe_dir = os.path.dirname(__file__)

    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/deploy.prototxt"),
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel"),
        help="Trained model weights file."
    )
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='256,256',
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'caffe/imagenet/ilsvrc_2012_mean.npy'),
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    args = parser.parse_args()

    image_dims = [int(s) for s in args.images_dim.split(',')]

    mean, channel_swap = None, None
    if args.mean_file:
        mean = np.load(args.mean_file)
    if args.channel_swap:
        channel_swap = [int(s) for s in args.channel_swap.split(',')]

    if args.gpu:
        caffe.set_mode_gpu()
        print("GPU mode")
    else:
        caffe.set_mode_cpu()
        print("CPU mode")

    # Make classifier.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
            image_dims=image_dims, mean=mean,
            input_scale=args.input_scale, raw_scale=args.raw_scale,
            channel_swap=channel_swap)

    # Load numpy array (.npy), directory glob (*.jpg), or image file.
    args.input_file = os.path.expanduser(args.input_file)
    if args.input_file.endswith('npy'):
        print("Loading file: %s" % args.input_file)
        inputs = np.load(args.input_file)
    elif os.path.isdir(args.input_file):
        print("Loading folder: %s" % args.input_file)
        inputs =[caffe.io.load_image(im_f)
                 for im_f in glob.glob(args.input_file + '/*.' + args.ext)]
    else:
        print("Loading file: %s" % args.input_file)
        inputs = [caffe.io.load_image(args.input_file)]

    print("Classifying %d inputs." % len(inputs))

    # Classify.
    start = time.time()
    predictions = classifier.predict(inputs, not args.center_only)
    print("Done in %.2f s." % (time.time() - start))

    # Save
    print("Saving results into %s" % args.output_file)
    np.save(args.output_file, predictions)


if __name__ == '__main__':
    main(sys.argv)

需要修改均值计算,这个均值计算是错误的。
在classify.py文件找到

mean = np.load(args.mean_file)  

在下面加上一行:

mean=mean.mean(1).mean(1) 

在这里插入图片描述
增加print打印测试的结果,在python/classify.py添加:

print("Predictions:%s" % predictions)

在这里插入图片描述
运行测试

python python/classify.py \
--model_def examples/cifar10/cifar10_quick.prototxt \
--pretrained_model  examples/cifar10/cifar10_quick_iter_5000.caffemodel.h5 \
examples/images/cat.jpg  save.file

在这里插入图片描述
预测的第4个类别的可能性最大,为0.243992。查看data/cifar10/batches.meta.txt文件第4个类别是cat.
在这里插入图片描述
上面结果来看,cifar10数据集预测结果准确率很低,这里仅是实验测试用。

猜你喜欢

转载自blog.csdn.net/u014470361/article/details/100056011
今日推荐