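In earlier posts I tested the TensorFlow Object Detection API in detail and used its models for simple detection tasks. This post tests another module, the DeepLab API, to perform semantic segmentation.

It took several days of painful trial and error to get this module working, and I ran into more bugs than I could count.

1 File structure

First create a new folder under research/deeplab/datasets. I named mine loulan, because it is used for semantic segmentation of leaky cables (漏缆). Then create the subfolders dataset, init_models, tfrecord and train inside it. The export folder and export_model.sh that appear in the tree are added later, in section 3.4.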

loulan
.
├── dataset
│   ├── changepixel.py
│   ├── ImageSets
│   ├── JPEGImages
│   ├── SegmentationClass
│   ├── SegmentationClassRaw
│   ├── SegmentationClass(RGBA)
│   └── trans2raw.py
├── export
│   ├── frozen_inference_graph-2675.pb
│   └── frozen_inference_graph-2675.tar.gz
├── export_model.sh
├── init_models
│   ├── deeplabv3_pascal_train_aug
│   └── deeplabv3_pascal_train_aug_2018_01_04.tar.gz
├── tfrecord
│   ├── train-00000-of-00004.tfrecord
│   ├── train-00001-of-00004.tfrecord
│   ├── train-00002-of-00004.tfrecord
│   ├── train-00003-of-00004.tfrecord
│   ├── trainval-00000-of-00004.tfrecord
│   ├── trainval-00001-of-00004.tfrecord
│   ├── trainval-00002-of-00004.tfrecord
│   ├── trainval-00003-of-00004.tfrecord
│   ├── val-00000-of-00004.tfrecord
│   ├── val-00001-of-00004.tfrecord
│   ├── val-00002-of-00004.tfrecord
│   └── val-00003-of-00004.tfrecord
└── train
    ├── checkpoint
    ├── graph.pbtxt
    ├── model.ckpt-2675.data-00000-of-00001
    ├── model.ckpt-2675.index
    └── model.ckpt-2675.meta

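2 Preparing the data

All of the steps in this section are done inside the loulan/dataset folder.

2.1 Raw data

Put the original JPEG images in the JPEGImages folder and the annotation images in the SegmentationClass folder. Note that both the original images and the annotation images must be in RGB format.

Pitfall 1: the annotations I received were RGBA images in which the class was encoded in the alpha channel, while the RGB components followed no pattern. I therefore had to rewrite the pixels first so that the classes are distinguished by RGB color. The conversion code is in my other blog post, 《Python之修改图片像素值》 ("Modifying image pixel values in Python").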
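The kind of conversion described in Pitfall 1 can be sketched as follows. This is not the code from the linked post. It is a minimal illustration that assumes any pixel with a non-zero alpha value belongs to the target class and everything else is background; the function name and the threshold parameter are made up for this example.

```python
import numpy as np


def alpha_to_rgb_mask(rgba, threshold=0):
    """Return an H x W x 3 uint8 array: white where alpha > threshold, else black.

    Assumption: the class is encoded only in the alpha channel, and the
    RGB components of the input carry no information.
    """
    rgba = np.asarray(rgba)
    if rgba.ndim != 3 or rgba.shape[2] != 4:
        raise ValueError("expected an H x W x 4 RGBA array")
    foreground = rgba[:, :, 3] > threshold            # H x W boolean mask
    rgb = np.zeros(rgba.shape[:2] + (3,), dtype=np.uint8)
    rgb[foreground] = (255, 255, 255)                 # target class -> white
    return rgb
```

With Pillow, the input array could come from np.array(Image.open(path).convert("RGBA")), and the result could be written back with Image.fromarray(mask).save(path). The white and black colors match the palette used in section 2.2.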

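2.2 Converting to grayscale

The previous step left the annotation files in RGB format. Next, convert them to single-channel grayscale images in which each pixel value is a class ID. The code is below. (I have only one target class, so two pixel values are enough.)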

import os

import numpy as np
from PIL import Image
from tqdm import tqdm

# The palette (color map) maps each (R, G, B) color to a class label.
palette = {
    (0, 0, 0): 0,        # 0 = background
    (255, 255, 255): 1,  # 1 = target class
}


def convert_from_color_segmentation(arr_3d):
    """Convert an H x W x 3 RGB label array to an H x W array of class IDs."""
    # Pixels whose color is not in the palette keep the initial value 0.
    arr_2d = np.zeros((arr_3d.shape[0], arr_3d.shape[1]), dtype=np.uint8)

    for color, label in palette.items():
        # Mask of the pixels whose three channels all match this color.
        mask = np.all(arr_3d == np.array(color).reshape(1, 1, 3), axis=2)
        arr_2d[mask] = label
    return arr_2d


label_dir = './SegmentationClass/'
new_label_dir = './SegmentationClassRaw/'

if not os.path.isdir(new_label_dir):
    print("creating folder: ", new_label_dir)
    os.mkdir(new_label_dir)
else:
    print("Folder already exists. Delete the folder and re-run the code!!!")

label_files = os.listdir(label_dir)

for l_f in tqdm(label_files):
    arr = np.array(Image.open(label_dir + l_f))
    arr = arr[:, :, 0:3]  # keep R, G, B; drop the alpha channel if present
    arr_2d = convert_from_color_segmentation(arr)
    Image.fromarray(arr_2d).save(new_label_dir + l_f)
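Before moving on it can be worth checking that the converted labels contain only the expected class IDs. The helpers below are a small sketch for that check; the function names are invented for this example, and the allowed values (0, 1) are taken from the palette above.

```python
import numpy as np


def unexpected_label_values(label_2d, allowed=(0, 1)):
    """Return the sorted pixel values in label_2d that are not in `allowed`."""
    allowed_set = set(allowed)
    # np.unique returns the distinct values in ascending order.
    return [int(v) for v in np.unique(np.asarray(label_2d))
            if int(v) not in allowed_set]


def label_histogram(label_2d):
    """Return {pixel value: number of pixels} for a 2-D label array."""
    values, counts = np.unique(np.asarray(label_2d), return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}
```

An empty list from unexpected_label_values means the file only uses the known IDs. A histogram that contains nothing but 0 suggests that the annotation colors did not match any palette entry.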

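2.3 Splitting the data

Create three text files in the loulan/dataset/ImageSets folder: train.txt, trainval.txt and val.txt. Write the file names of the annotated data to train.txt and val.txt at a training : validation ratio of 4:1, and write all of the names to trainval.txt. The following is one way to do it.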

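# -*- coding:utf8 -*-
import os

path = 'JPEGImages/'
filelist = sorted(os.listdir(path))  # sorted so the split is reproducible
i = 0

# Write the training set: every image except each 5th one.
# Mode "w" avoids duplicated entries when the script is run again.
with open("ImageSets/train.txt", "w") as f:
    for item in filelist:
        i += 1
        if i % 5 != 0:
            # Strip only the extension; the file names themselves contain dots.
            name = os.path.splitext(item)[0]
            print('name is ', name)
            f.write(name + '\n')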

The other two files are generated in the same way, with small changes.
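For reference, all three lists can also be written in one pass. The sketch below follows the same rule as above (every 5th image goes to the validation set); the function names and the val_every parameter are made up for this example.

```python
import os


def split_names(filenames, val_every=5):
    """Split image file names into (train, val, trainval) lists of base names."""
    names = sorted(os.path.splitext(f)[0] for f in filenames)
    train, val = [], []
    for i, name in enumerate(names, start=1):
        # Every `val_every`-th name goes to val, the rest to train.
        (val if i % val_every == 0 else train).append(name)
    return train, val, names


def write_split_files(image_dir, list_dir):
    """Write train.txt, val.txt and trainval.txt into list_dir."""
    train, val, trainval = split_names(os.listdir(image_dir))
    for split, names in (("train", train), ("val", val), ("trainval", trainval)):
        with open(os.path.join(list_dir, split + ".txt"), "w") as f:
            f.writelines(name + "\n" for name in names)
    return len(train), len(val), len(trainval)
```

Calling write_split_files("JPEGImages", "ImageSets") from loulan/dataset returns the three list sizes, which are needed again in section 3.2.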

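2.4 Generating the tfrecord files

The deeplab/datasets folder contains scripts for generating tfrecord files. Because we use the VOC data layout, run build_voc2012_data.py with the following arguments: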

python ./build_voc2012_data.py --image_folder="./loulan/dataset/JPEGImages" --semantic_segmentation_folder="./loulan/dataset/SegmentationClassRaw" --list_folder="./loulan/dataset/ImageSets" --image_format='jpg' --output_dir="./loulan/tfrecord"
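The conversion needs an image and a label file for every name in the list files, so a quick check beforehand can save a failed run. The sketch below assumes the images end in .jpg and the labels in .png (I believe .png is the script's default label format, but treat that as an assumption); the function name is made up for this example.

```python
import os


def missing_pairs(list_file, image_dir, label_dir,
                  image_ext=".jpg", label_ext=".png"):
    """Return the names in list_file that lack an image or a label file."""
    with open(list_file) as f:
        names = [line.strip() for line in f if line.strip()]
    missing = []
    for name in names:
        has_image = os.path.isfile(os.path.join(image_dir, name + image_ext))
        has_label = os.path.isfile(os.path.join(label_dir, name + label_ext))
        if not (has_image and has_label):
            missing.append(name)
    return missing
```

For example, missing_pairs("ImageSets/trainval.txt", "JPEGImages", "SegmentationClassRaw") should return an empty list before the tfrecord files are built.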

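3 Training the network

3.1 Downloading a pre-trained model

The official repository provides a number of pre-trained models. This post uses deeplabv3_pascal_train_aug_2018_01_04 as the example.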

Download and extract the model in the loulan/init_models folder, which produces the deeplabv3_pascal_train_aug folder.

wget http://download.tensorflow.org/models/deeplabv3_pascal_train_aug_2018_01_04.tar.gz
tar zxf deeplabv3_pascal_train_aug_2018_01_04.tar.gz

3.2 Describing the dataset

The deeplab/datasets folder contains segmentation_dataset.py, which describes the available datasets. Open it and add the following code.

First, add an entry to _DATASETS_INFORMATION:

_DATASETS_INFORMATION = {
    'cityscapes': _CITYSCAPES_INFORMATION,
    'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
    'ade20k': _ADE20K_INFORMATION,
    'loulan': _LOULAN_INFORMATION
}

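Then define the descriptor at the same level as _ADE20K_INFORMATION, above the _DATASETS_INFORMATION dictionary that refers to it:

_LOULAN_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 824,      # number of training images
        'trainval': 1030,  # number of all images
        'val': 206,        # number of validation images
    },
    num_classes=4,     # number of classes, including background
    ignore_label=255,  # label value to ignore
)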
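The three sizes can be read from the list files written in section 2.3 rather than counted by hand. A minimal sketch, assuming one name per line in each file; the function name is made up for this example.

```python
import os


def count_split_sizes(list_dir, splits=("train", "trainval", "val")):
    """Return {split: number of non-empty lines in <list_dir>/<split>.txt}."""
    sizes = {}
    for split in splits:
        with open(os.path.join(list_dir, split + ".txt")) as f:
            sizes[split] = sum(1 for line in f if line.strip())
    return sizes
```

If the split was made as described above, the train and val counts should add up to the trainval count, as 824 + 206 = 1030 does here.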

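Change the image counts to match your own data. For num_classes I wrote 4, following 《tensorflow下deeplab模型训练数据集过程（续）》, whose author says:

“The number of classes is set to 4 because there are two additional default classes, ignored_label (255) and -1. As the name suggests, ignored_label marks a class that is ignored, that is, left out of consideration; it is mainly needed when building the dataset. The -1 is there to cover unknown classes that may show up during the process. Both exist to keep training from raising errors.

You might then ask why I did not simply assign the non-road class to 255 or -1 during training. I did try that: at first I treated non-road as 255, so that there was only one actual target class, but the trained model performed poorly. After adding one more class, the results improved a lot.”

I have not tested this myself yet, so I am going with this setting for now.

Another article mentions that deeplab/train.py needs to be changed: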

# Set to False if one does not want to re-use the trained classifier weights.
flags.DEFINE_boolean('initialize_last_layer', True,
                     'Initialize the last layer.')

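Change True to False so that the last layer is trained from scratch.

3.3 Starting training

The deeplab folder contains train.py; we only need to set its arguments, as follows: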

python train.py --logtostderr --train_split="train" --model_variant="xception_65" --atrous_rates=6 --atrous_rates=12 --atrous_rates=18 --output_stride=16 --decoder_output_stride=4 --train_crop_size=512 --train_crop_size=512 --train_batch_size=4 --training_number_of_steps=30000 --fine_tune_batch_norm=false --tf_initial_checkpoint="./datasets/loulan/init_models/deeplabv3_pascal_train_aug/model.ckpt" --train_logdir="./datasets/loulan/train/" --dataset_dir="./datasets/loulan/tfrecord" --num_clones=4

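A few arguments need attention during training. --num_clones=4 means training on 4 GPUs; set it to the number of GPUs you have.

Take care to set train_batch_size correctly. I started with 16 and got a GPU out-of-memory error at run time, so I changed it to 4, which also requires setting fine_tune_batch_norm to false. If your GPUs are powerful enough that the batch size can be set above 12, set fine_tune_batch_norm to True. Training runs more slowly when the batch size is large.

During training you can run tensorboard --logdir='train' to watch the loss. Once the loss levels off, training can be stopped.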
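The rule of thumb above can be written down as a small check before launching a run. This is only a sketch: the threshold of 12 comes from the text above, and the divisibility requirement is, as far as I know, also enforced by train.py, but treat it as an assumption. The function name is made up for this example.

```python
def check_batch_settings(train_batch_size, num_clones):
    """Return (per-GPU batch size, suggested fine_tune_batch_norm value)."""
    # Assumption: the batch is split evenly across the clones (GPUs).
    if train_batch_size % num_clones != 0:
        raise ValueError("train_batch_size must be divisible by num_clones")
    per_clone = train_batch_size // num_clones
    # Rule of thumb from the text: fine-tune batch norm only above 12.
    fine_tune_batch_norm = train_batch_size > 12
    return per_clone, fine_tune_batch_norm
```

With the settings used in this post, check_batch_settings(4, 4) gives one image per GPU and fine_tune_batch_norm set to False.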

3.4 Exporting the model

After training, result files such as the following are generated in the loulan/train folder:

graph.pbtxt
model.ckpt-1000.data-00000-of-00001
model.ckpt-1000.index
model.ckpt-1000.meta

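The .meta file stores the graph and metadata, and the ckpt files store the network weights. Prediction only needs the model weights, so we can use the official script to produce a model file. The script is export_model.py in the deeplab folder. In deeplab/datasets/loulan, create a script named export_model.sh with the following content: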

python ../../export_model.py --logtostderr --checkpoint_path="train/model.ckpt-$1" --export_path="export/frozen_inference_graph-$1.pb" --model_variant="xception_65" --atrous_rates=6 --atrous_rates=12  --atrous_rates=18 --output_stride=16 --decoder_output_stride=4  --num_classes=21  --crop_size=513  --crop_size=513 --inference_scales=1.0

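Then run sh export_model.sh 2675, where 2675 is the step number of the latest checkpoint in the train folder, that is, the one with these three files:

model.ckpt-2675.data-00000-of-00001
model.ckpt-2675.index
model.ckpt-2675.meta

Change the number to match your own files. After the script runs, frozen_inference_graph-2675.pb is generated in the export folder. This model is used for inference in the next section.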
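The step number can also be looked up instead of read off by eye. The sketch below scans a training folder for files named model.ckpt-N.index and returns the largest N; the function name is made up for this example.

```python
import os
import re


def latest_checkpoint_step(train_dir):
    """Return the largest N for which model.ckpt-N.index exists, or None."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None
```

The returned number is what gets passed to export_model.sh as its first argument.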

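4 Testing the model

4.1 The ipynb method

The deeplab directory contains deeplab_demo.ipynb; we only need to change the paths in it.

4.1.1 Changing the classes

Find the following code and change it to your own classes: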

LABEL_NAMES = np.asarray([
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tv'
])

For example, I only have two classes, so I changed it to:

LABEL_NAMES = np.asarray([
    'background', 'lane'
])

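4.1.2 Loading the model

First, the pb file generated above has to be compressed. Loading it directly raised an error, and reading the source showed that the loader extracts an archive. So pack the file into a tar.gz archive first and load that instead. The command is: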

tar zcvf frozen_inference_graph-2675.tar.gz frozen_inference_graph-2675.pb
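The same archive can be produced from Python with the standard tarfile module, which is handy when the export step is already scripted. A minimal sketch; the function name is made up for this example. As far as I can tell, the demo's loader picks the archive member whose name contains frozen_inference_graph, so the file name inside the archive is left unchanged.

```python
import os
import tarfile


def pack_frozen_graph(pb_path, tar_path=None):
    """Pack a .pb file into a gzip-compressed tar archive and return its path."""
    if tar_path is None:
        tar_path = os.path.splitext(pb_path)[0] + ".tar.gz"
    with tarfile.open(tar_path, "w:gz") as tar:
        # Store only the base name so the archive has no directory prefix.
        tar.add(pb_path, arcname=os.path.basename(pb_path))
    return tar_path
```

Calling pack_frozen_graph("export/frozen_inference_graph-2675.pb") writes export/frozen_inference_graph-2675.tar.gz next to the original file.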

Then comment out the part that downloads the model and load the local one instead:

#@title Select and download models {display-mode: "form"}

'''
MODEL_NAME = 'mobilenetv2_coco_voctrainaug'  # @param 

(middle part omitted)

MODEL = DeepLabModel(download_path)
'''

model_path = './datasets/loulan/export/frozen_inference_graph-2675.tar.gz'
MODEL = DeepLabModel(model_path)

print('model loaded successfully!')

4.1.3 Adding a test image

Comment out the final block that downloads an image from a URL, point it at your own image and run the model, as follows:

image_path = './datasets/loulan/dataset/JPEGImages/19678.15(20181011135748557_0).jpg'
original_im = Image.open(image_path)
resized_im, seg_map = MODEL.run(original_im)
vis_segmentation(resized_im, seg_map)
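Besides looking at the picture, it can help to see how much of the image each class covers. The sketch below assumes that seg_map is a 2-D array of non-negative integer class IDs, which is how the demo uses it; the function name is made up for this example.

```python
import numpy as np


def class_pixel_fractions(seg_map, label_names):
    """Return {label name: fraction of pixels predicted as that class}."""
    seg_map = np.asarray(seg_map)
    # counts[i] is the number of pixels whose predicted class ID is i.
    counts = np.bincount(seg_map.ravel(), minlength=len(label_names))
    return {
        name: float(counts[i]) / seg_map.size
        for i, name in enumerate(label_names)
    }
```

Calling class_pixel_fractions(seg_map, LABEL_NAMES) after MODEL.run gives one fraction per class. Class IDs beyond the end of LABEL_NAMES are left out of the result, so fractions that do not add up to 1 point to a mismatch between the label list and the exported model.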

Then click Cell > Run All to run the notebook and display the segmentation result.

Reference: https://lijiancheng0614.github.io/2018/03/13/2018_03_13_TensorFlow-DeepLab/

Reference: https://medium.freecodecamp.org/how-to-use-deeplab-in-tensorflow-for-object-segmentation-using-deep-learning-a5777290ab6b

Debugging the DeepLab semantic segmentation API in TensorFlow