This is a complete record of my first contact with SSD, written at beginner level. If anything is wrong, corrections are welcome.
Preparation
System and environment
An old Win7 machine, CPU only
anaconda3 with a tensorflow environment
jupyter
OpenCV library
pycharm
etc.
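Before going further it can help to confirm that the packages listed above are visible from the active conda environment. The following is only a minimal sketch: the import names in REQUIRED are assumptions (OpenCV is normally imported as cv2, Jupyter as notebook), and the helper missing_modules is a hypothetical name introduced here.

```python
import importlib.util

# Import names are assumptions: OpenCV is imported as cv2, Jupyter as notebook.
REQUIRED = ["numpy", "tensorflow", "cv2", "matplotlib", "notebook"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules(REQUIRED)
print("missing:", missing if missing else "none")
```

Run it with the same interpreter that will run the SSD scripts; anything reported as missing can then be installed into that environment.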
Code
SSD-Tensorflow (source)
Extract the archive to get the SSD-Tensorflow-master folder.
Extract the two files in its checkpoints subfolder in place, so that the checkpoint files sit directly in that folder.
Test code
1. Method one
Open a terminal in the SSD-Tensorflow-master folder and run:
jupyter notebook notebooks/ssd_notebook.ipynb
2. Method two
Create your own script named SSD_detect.py, save it in SSD-Tensorflow-master, and use the following code:
import os
import math
import random
import numpy as np
import tensorflow as tf
import cv2
slim = tf.contrib.slim
#%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import sys
#sys.path.append('../')
from nets import ssd_vgg_300, ssd_common, np_methods
from preprocessing import ssd_vgg_preprocessing
from notebooks import visualization
# TensorFlow session: grow memory when needed. TF, DO NOT USE ALL MY GPU MEMORY!!!
gpu_options = tf.GPUOptions(allow_growth=True)
config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
isess = tf.InteractiveSession(config=config)
# Input placeholder.
net_shape = (300, 300)
data_format = 'NHWC'
img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
# Evaluation pre-processing: resize to SSD net shape.
image_pre, labels_pre, bboxes_pre, bbox_img = ssd_vgg_preprocessing.preprocess_for_eval(
    img_input, None, None, net_shape, data_format, resize=ssd_vgg_preprocessing.Resize.WARP_RESIZE)
image_4d = tf.expand_dims(image_pre, 0)
# Define the SSD model.
reuse = True if 'ssd_net' in locals() else None
ssd_net = ssd_vgg_300.SSDNet()
with slim.arg_scope(ssd_net.arg_scope(data_format=data_format)):
    predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False, reuse=reuse)
# Restore SSD model.
ckpt_filename = 'checkpoints/ssd_300_vgg.ckpt'
# ckpt_filename = '../checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt'
isess.run(tf.global_variables_initializer())
saver = tf.train.Saver()
saver.restore(isess, ckpt_filename)
# SSD default anchor boxes.
ssd_anchors = ssd_net.anchors(net_shape)
# Main image processing routine.
def process_image(img, select_threshold=0.5, nms_threshold=.45, net_shape=(300, 300)):
    # Run SSD network.
    rimg, rpredictions, rlocalisations, rbbox_img = isess.run([image_4d, predictions, localisations, bbox_img],
                                                              feed_dict={img_input: img})
    # Get classes and bboxes from the net outputs.
    rclasses, rscores, rbboxes = np_methods.ssd_bboxes_select(
        rpredictions, rlocalisations, ssd_anchors,
        select_threshold=select_threshold, img_shape=net_shape, num_classes=21, decode=True)
    rbboxes = np_methods.bboxes_clip(rbbox_img, rbboxes)
    rclasses, rscores, rbboxes = np_methods.bboxes_sort(rclasses, rscores, rbboxes, top_k=400)
    rclasses, rscores, rbboxes = np_methods.bboxes_nms(rclasses, rscores, rbboxes, nms_threshold=nms_threshold)
    # Resize bboxes to original image shape. Note: useless for Resize.WARP!
    rbboxes = np_methods.bboxes_resize(rbbox_img, rbboxes)
    return rclasses, rscores, rbboxes
# Test on some demo image and visualize output.
path = 'demo/'
image_names = sorted(os.listdir(path))
img = mpimg.imread(path + image_names[-5])
rclasses, rscores, rbboxes = process_image(img)
# visualization.bboxes_draw_on_img(img, rclasses, rscores, rbboxes, visualization.colors_plasma)
visualization.plt_bboxes(img, rclasses, rscores, rbboxes)
Then run it directly from a terminal in the same folder:
python SSD_detect.py
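The process_image function above filters overlapping detections with non-maximum suppression (the nms_threshold argument). To make that step less of a black box, here is a small pure-Python sketch of greedy NMS. It is an illustration only, not the code in np_methods: the functions iou and nms are hypothetical names, it ignores class labels, and it assumes boxes are given as (ymin, xmin, ymax, xmax) in relative coordinates.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ih = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iw = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ih * iw
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(scores, boxes, nms_threshold=0.45):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than nms_threshold."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes and one separate box.
boxes = [(0.10, 0.10, 0.50, 0.50), (0.12, 0.12, 0.52, 0.52), (0.60, 0.60, 0.90, 0.90)]
scores = [0.9, 0.8, 0.7]
kept = nms(scores, boxes)
print(kept)  # indices of the boxes that survive
```

Lowering nms_threshold removes more overlapping boxes; raising it keeps more of them.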
Preparation of data sets
Create a VOC2007 folder; for convenience it is also stored in SSD-Tensorflow-master.
Under VOC2007, create three subfolders named Annotations, ImageSets and JPEGImages.
1.JPEGImages
Holds the image files (.jpg or .jpeg).
Note: the pictures must be named in the form 000001.jpg, 000002.jpg, ...
There are many ways to rename them; for a relatively simple one, see the article on making your own VOC2007 dataset.
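One possible way to do the renaming in Python is sketched below. This is not the method from the referenced article; rename_sequentially is a hypothetical helper, and it assumes all images sit directly in one folder. The demo at the bottom runs on a temporary folder so that nothing real is touched.

```python
import os
import tempfile


def rename_sequentially(folder, exts=(".jpg", ".jpeg")):
    """Rename the images in folder to 000001.jpg, 000002.jpg, ... in sorted order."""
    names = sorted(n for n in os.listdir(folder) if n.lower().endswith(exts))
    # Two passes through temporary names, so a new name never overwrites
    # a file that has not been renamed yet.
    temp_paths = []
    for idx, name in enumerate(names, start=1):
        tmp = os.path.join(folder, "tmp_rename_%06d" % idx)
        os.rename(os.path.join(folder, name), tmp)
        temp_paths.append(tmp)
    new_names = []
    for idx, tmp in enumerate(temp_paths, start=1):
        new = "%06d.jpg" % idx
        os.rename(tmp, os.path.join(folder, new))
        new_names.append(new)
    return new_names


# Demo on a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.jpg", "a.jpeg", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    renamed = rename_sequentially(folder)
    remaining = sorted(os.listdir(folder))
print(renamed)
```

Rename the pictures before labelling them, so that the annotation files are created with the matching names.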
2.Annotations
Holds the label files used for training (.xml).
How to produce them: annotate the images with labelimg.
3.ImageSets
Create a subfolder named Main in it to store train.txt, trainval.txt, test.txt and val.txt.
Code to generate them:
import os
import random

xmlfilepath = r'path to your own Annotations folder'
saveBasePath = r'path to your own ImageSets folder'
trainval_percent = 0.7  # fraction of all samples used for train + val
train_percent = 0.7     # fraction of trainval used for train
total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)    # not named list, to avoid shadowing the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)
print("train and val size", tv)
print("train size", tr)
ftrainval = open(os.path.join(saveBasePath, 'Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'Main/val.txt'), 'w')
for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)
ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
After generating these files, create another folder named VOCtest outside VOC2007 and store test.txt in it.
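To see what the two percentages in the split script mean in practice, the arithmetic can be reproduced on its own. The sketch below uses the same int() truncation as the script; split_sizes is a hypothetical helper name and 100 annotation files is just an example size.

```python
def split_sizes(num, trainval_percent=0.7, train_percent=0.7):
    """Reproduce the size arithmetic of the split script for num annotation files."""
    tv = int(num * trainval_percent)   # train + val
    tr = int(tv * train_percent)       # train only
    return {"trainval": tv, "train": tr, "val": tv - tr, "test": num - tv}


sizes = split_sizes(100)
print(sizes)  # {'trainval': 70, 'train': 49, 'val': 21, 'test': 30}
```

So with both values at 0.7, roughly half of the data ends up in train.txt and 30% in test.txt.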
Network Training
Code changes
All of the following steps are done in pycharm, with SSD-Tensorflow-master opened as the project folder.
1. In SSD-Tensorflow-master/datasets/pascalvoc_common.py, lines 24-46: leave the first class, none, untouched and change the other classes to the classes of your own dataset.
VOC has 20 categories by default, plus one background class, for a total of 21. Set the classes according to your own training set.
After my modification:
VOC_LABELS = {
    'none': (0, 'Background'),
    'car': (1, 'Car'),
    'aeroplane': (2, 'Vehicle'),
    'bicycle': (3, 'Vehicle'),
    # 'aeroplane': (1, 'Vehicle'),
    # 'bicycle': (2, 'Vehicle'),
    # 'bird': (3, 'Animal'),
    # 'boat': (4, 'Vehicle'),
    # 'bottle': (5, 'Indoor'),
    # 'bus': (6, 'Vehicle'),
    # 'car': (7, 'Vehicle'),
    # 'cat': (8, 'Animal'),
    # 'chair': (9, 'Indoor'),
    # 'cow': (10, 'Animal'),
    # 'diningtable': (11, 'Indoor'),
    # 'dog': (12, 'Animal'),
    # 'horse': (13, 'Animal'),
    # 'motorbike': (14, 'Vehicle'),
    # 'person': (15, 'Person'),
    # 'pottedplant': (16, 'Indoor'),
    # 'sheep': (17, 'Animal'),
    # 'sofa': (18, 'Indoor'),
    # 'train': (19, 'Vehicle'),
    # 'tvmonitor': (20, 'Indoor'),
}
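Several of the later steps ask for either the number of classes plus one or the plain number of classes. Both values follow directly from this dictionary, as the short sketch below shows for the three-class example above. The variable names are introduced here for illustration only; the consistency check at the end is an extra suggestion, not something the repository performs.

```python
# The modified label map from the example above (active entries only).
VOC_LABELS = {
    'none': (0, 'Background'),
    'car': (1, 'Car'),
    'aeroplane': (2, 'Vehicle'),
    'bicycle': (3, 'Vehicle'),
}

# Value for the settings that count the background class (number of classes + 1).
num_classes_with_background = len(VOC_LABELS)
# Value for the settings that count only real object classes.
num_object_classes = len(VOC_LABELS) - 1

# Sanity check: the ids should be 0, 1, 2, ... with no gaps or duplicates.
ids = sorted(class_id for class_id, _ in VOC_LABELS.values())
ids_are_contiguous = ids == list(range(len(VOC_LABELS)))

print(num_classes_with_background, num_object_classes, ids_are_contiguous)
```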
2. In SSD-Tensorflow-master/datasets/pascalvoc_to_tfrecords.py, line 82: make the extension match your image format, .jpg or .jpeg (no change is needed if your images are already .jpg). On line 83, change r to rb.
On line 67, modify the SAMPLES_PER_FILES parameter, which sets how many pictures go into one tfrecord file.
I set it to 1.
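The choice of SAMPLES_PER_FILES determines how many tfrecord files the conversion produces. A rough estimate is sketched below, assuming the converter fills each file with up to SAMPLES_PER_FILES pictures before starting the next one; tfrecord_file_count is a hypothetical helper and the picture counts are made-up examples.

```python
import math


def tfrecord_file_count(num_images, samples_per_file):
    """Estimated number of tfrecord files for a dataset of num_images pictures."""
    return math.ceil(num_images / samples_per_file)


print(tfrecord_file_count(1000, 1))    # one file per picture
print(tfrecord_file_count(1000, 200))
print(tfrecord_file_count(1001, 200))  # the last file holds the single leftover picture
```

A value of 1 is the simplest to reason about but creates as many files as there are pictures; a larger value gives fewer, bigger files.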
3. In SSD-Tensorflow-master/nets/ssd_vgg_300.py, lines 96-97: change num_classes and no_annotation_label to (number of classes + 1).
4. In SSD-Tensorflow-master/eval_ssd_network.py, line 66: change num_classes to (number of classes + 1).
5. In SSD-Tensorflow-master/datasets/pascalvoc_2007.py, in the blocks starting at lines 31 and 55: leave the none class untouched and change the other classes to the classes of your own dataset. In each pair of parentheses, the first number is the number of pictures and the second is the number of objects (i.e. the number of bounding boxes). The total entries on lines 52 and 76 are the sums over all classes. On lines 79 and 80, enter the total sizes of the training set and the test set of your own dataset. On line 86, change NUM_CLASSES to your number of classes (without adding one).
The numbers can be computed with a script named collect_class.py:
import re
import os
import xml.etree.ElementTree as ET

class1 = 'your class 1'
class2 = 'your class 2'
class3 = 'your class 3'
annotation_folder = 'path to your annotation folder'
list = os.listdir(annotation_folder)


def file_name(file_dir):
    # Collect the paths of all .xml files below file_dir.
    L = []
    for root, dirs, files in os.walk(file_dir):
        for file in files:
            if os.path.splitext(file)[1] == '.xml':
                L.append(os.path.join(root, file))
    return L


# Bounding-box counts per class.
total_number1 = 0
total_number2 = 0
total_number3 = 0
# Picture counts per class.
pic_num1 = 0
pic_num2 = 0
pic_num3 = 0
# Flags: set when the current picture contains the class.
flag1 = 0
flag2 = 0
flag3 = 0
xml_dirs = file_name(annotation_folder)
total_pic = 0
total = 0
for i in range(0, len(xml_dirs)):
    print(xml_dirs[i])
    #path = os.path.join(annotation_folder,list[i])
    #print(path)
    annotation_file = open(xml_dirs[i], encoding='UTF-8').read()
    root = ET.fromstring(annotation_file)
    #tree = ET.parse(annotation_file)
    #root = tree.getroot()
    total_pic = total_pic + 1
    for obj in root.findall('object'):
        label = obj.find('name').text
        if label == class1:
            total_number1 = total_number1 + 1
            flag1 = 1
            total = total + 1
            #print("bounding box number:", total_number1)
        if label == class2:
            total_number2 = total_number2 + 1
            flag2 = 1
            total = total + 1
        if label == class3:
            total_number3 = total_number3 + 1
            flag3 = 1
            total = total + 1
    # Count each picture at most once per class, then reset the flags.
    if flag1 == 1:
        pic_num1 = pic_num1 + 1
        #print("pic number:", pic_num1)
        flag1 = 0
    if flag2 == 1:
        pic_num2 = pic_num2 + 1
        flag2 = 0
    if flag3 == 1:
        pic_num3 = pic_num3 + 1
        flag3 = 0
print(class1, pic_num1, total_number1)
print(class2, pic_num2, total_number2)
print(class3, pic_num3, total_number3)
print("total", total_pic, total)
Fill in pascalvoc_2007.py with the values it prints.
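The script above is written for exactly three classes. If the dataset has a different number of classes, the same counts can be gathered without naming the classes in advance. The following is a sketch under that assumption, not a replacement taken from the repository: collect_statistics is a hypothetical helper, it returns (pictures, bounding boxes) pairs per class plus a total entry, and unlike the script above its total counts every labelled object. The demo builds two tiny annotation files in a temporary folder.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter


def collect_statistics(annotation_folder):
    """Return {class: (pictures containing it, bounding boxes)} plus a 'total' entry."""
    images, boxes = Counter(), Counter()
    total_images = 0
    for root_dir, _, files in os.walk(annotation_folder):
        for name in files:
            if os.path.splitext(name)[1] != '.xml':
                continue
            total_images += 1
            root = ET.parse(os.path.join(root_dir, name)).getroot()
            labels = [obj.find('name').text for obj in root.findall('object')]
            boxes.update(labels)        # every object counts as one box
            images.update(set(labels))  # each picture counts once per class
    stats = {label: (images[label], boxes[label]) for label in sorted(boxes)}
    stats['total'] = (total_images, sum(boxes.values()))
    return stats


# Demo with two minimal annotation files.
with tempfile.TemporaryDirectory() as folder:
    samples = {'000001.xml': ['car', 'car', 'bicycle'], '000002.xml': ['car']}
    for fname, labels in samples.items():
        objs = ''.join('<object><name>%s</name></object>' % l for l in labels)
        with open(os.path.join(folder, fname), 'w', encoding='UTF-8') as f:
            f.write('<annotation>%s</annotation>' % objs)
    stats = collect_statistics(folder)
print(stats)
```

Run it once on the annotations of the training set and once on those of the test set to get the numbers for both blocks.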
6. In SSD-Tensorflow-master/train_ssd_network.py, change num_classes on line 135 to (number of classes + 1).
On line 154, change None to the maximum number of training steps (for example 50000). If it is left as None, training runs indefinitely and has to be stopped manually. The batch size, learning rate and so on can also be changed in this file.
You can also leave them unchanged here and set them at the end, when running the training.
Generate tfrecord file
In SSD-Tensorflow-master/tf_convert_data.py, set 'dataset_dir', 'output_name' and 'output_dir' to the path of your own VOC2007 folder, the name of the output dataset, and the output path respectively, as follows:
tf.app.flags.DEFINE_string(
    'dataset_name', 'pascalvoc',
    'The name of the dataset to convert.')
tf.app.flags.DEFINE_string(
    'dataset_dir', r'F:/cjc/SSD-Tensorflow-master/VOC2007/',
    'Directory where the original dataset is stored.')
tf.app.flags.DEFINE_string(
    'output_name', 'voc_2007_train',
    'Basename used for TFRecords output files.')
tf.app.flags.DEFINE_string(
    'output_dir', r'F:/cjc/SSD-Tensorflow-master/tfrecords/',
    'Output directory where to store TFRecords files.')
Save and run.
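After the conversion has run, a quick look at the output directory shows whether files were actually written. The sketch below only lists files and does not read their contents. It assumes the converter names its files output_name followed by an underscore, a running number and the .tfrecord extension; list_tfrecords is a hypothetical helper, and the demo uses a temporary folder with empty stand-in files.

```python
import glob
import os
import tempfile


def list_tfrecords(output_dir, output_name):
    """List files in output_dir that look like <output_name>_<number>.tfrecord."""
    pattern = os.path.join(output_dir, output_name + '_*.tfrecord')
    return sorted(glob.glob(pattern))


# Demo with empty stand-in files.
with tempfile.TemporaryDirectory() as out_dir:
    for idx in range(2):
        open(os.path.join(out_dir, 'voc_2007_train_%03d.tfrecord' % idx), 'w').close()
    found = [os.path.basename(p) for p in list_tfrecords(out_dir, 'voc_2007_train')]
print(len(found), "tfrecord files:", found)
```

For a real check, call list_tfrecords with the output_dir and output_name set in tf_convert_data.py.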
Training
Before running SSD-Tensorflow-master/train_ssd_network.py, open Run > Edit Configurations > Parameters in pycharm and enter:
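# (option not shown) save the model every 600 s
--weight_decay=0.0005 \
# weight-decay coefficient used for regularization
--optimizer=adam \
# the optimizer to use
--learning_rate=0.00001 \
# learning rate
--learning_rate_decay_factor=0.94 \
# decay factor of the learning rate
--batch_size=4 \
--gpu_memory_fraction=0.4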
Save and run.
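To get a feeling for what the learning_rate and learning_rate_decay_factor values above imply, the effect of repeated decay can be computed by hand. The sketch below assumes a plain exponential schedule in which the rate is multiplied by the decay factor at each decay; how often a decay happens depends on other training settings that are not shown here. decayed_learning_rate is a hypothetical helper.

```python
def decayed_learning_rate(initial, decay_factor, num_decays):
    """Learning rate after num_decays multiplications by decay_factor."""
    return initial * decay_factor ** num_decays


# Values taken from the parameters above.
for k in (0, 10, 50):
    print(k, decayed_learning_rate(0.00001, 0.94, k))
```

With a factor of 0.94 the rate shrinks by 6% per decay, so it falls to roughly half after about eleven decays.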
Notes
1. SSD supports both CPU and GPU computation. Be careful, though: on a GPU the batch_size must not be set too large, otherwise the computer will freeze. To be on the safe side, set it to 4.
2. If a BiasOp error occurs on the CPU, change DATA_FORMAT on line 27 to NHWC.
3. If global_step does not increase, find this statement at line 367:
config = tf.ConfigProto(log_device_placement=False,
                        gpu_options=gpu_options)
Then add the following line after it:
config.gpu_options.per_process_gpu_memory_fraction = ***
(set *** to a value of your choice)
4. If ${DATASET_DIR} produces garbled errors such as εͳ \ udcd5Ҳ \ udcbb \ udcb5, change the absolute path to a relative path: './DATASET_DIR'
5. If reading a file raises UnicodeDecodeError: 'gbk' codec can't decode, change open('.../report.html', mode='r') to open('.../report.html', mode='r', encoding='UTF-8'). The encoding argument only works in text mode; binary mode ('rb') does not accept it.
6. Mind the differences between Python 2 and Python 3. Write file paths with forward slashes (/), not backslashes (\).
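The last two notes are easy to try out in isolation. The sketch below first shows why a backslash in an ordinary string literal is risky for a Windows path, then reads a UTF-8 file with an explicit encoding. The path F:/temp and the file contents are made-up examples, and the file is created in a temporary folder.

```python
import os
import tempfile

# In an ordinary string literal, \t is a tab, so the path is silently corrupted.
bad = 'F:\temp'    # F, :, TAB, e, m, p
good = 'F:/temp'   # forward slashes need no escaping
raw = r'F:\temp'   # a raw string keeps the backslash as written
print(len(bad), len(good), len(raw))

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'report.html')
    with open(path, mode='w', encoding='UTF-8') as f:
        f.write('<p>类别数</p>')
    # Text mode with an explicit encoding does not depend on the system default (e.g. gbk).
    with open(path, mode='r', encoding='UTF-8') as f:
        text = f.read()
    # Binary mode rejects the encoding argument.
    try:
        open(path, mode='rb', encoding='UTF-8')
        binary_accepts_encoding = True
    except ValueError:
        binary_accepts_encoding = False
print(text, binary_accepts_encoding)
```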