Pedestrian Detection 0-05: LFFD - Building the Pedestrian Training Data and Training

Copyright notice: this is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/weixin_43013761/article/details/102603521

The link below collects all my notes on LFFD (pedestrian detection). If you find any mistakes, please point them out and I will correct them right away. If you are interested, feel free to add me on WeChat (a944284742) to discuss. And if this post helped you, do remember to like it; that is the biggest encouragement for me. May you be young and accomplished!
Pedestrian Detection 0-00: LFFD - a complete, no-blind-spot walkthrough: https://blog.csdn.net/weixin_43013761/article/details/102592374

Data Preparation

This post explains how the pedestrian dataset is built. How? We first walk through the author's pipeline; after that, building your own dataset follows the same steps. From the paper and the README.md we know the model is trained on the Caltech Pedestrian Dataset.
(screenshot: README excerpt on the training dataset)
Download the training and test data from the Caltech Pedestrian Detection Benchmark (I only downloaded a small part, enough for the experiments below):
Caltech Pedestrian Detection Benchmark: http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/index.html
Note the phrase "with new annotation" in the screenshot above: the author did not use the annotations bundled with the dataset. So where do we find the new annotations? After asking the author (who is quite approachable, which I appreciate), I got the following link, which I am sharing here:
https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/people-detection-pose-estimation-and-tracking/how-far-are-we-from-solving-pedestrian-detection/
Scrolling down that page, you can see the following:
(screenshot: annotation download links on the MPI page)
The red box marks the new annotations. Now that everything is in place, let us start building the data.

Converting seq to jpg

After downloading you should have many sets, but I only downloaded set00, so I will demonstrate the process on set00 alone. Unpacking the folder gives:
(screenshot: contents of set00)
Everything is in seq (video) format, but as you know, what we feed the network must be images, not seq files. So we first need to convert each seq file into individual images. To stay compatible with the author's source code and touch it as little as possible, I wrote the following script (saved as pedestrian_detection\data_provider_farm\seq2decode.py):

#!/usr/bin/env python
import os
import glob

import cv2


def save_image(dname, fn, i, frame):
    # Build an output path like <out_dir>/set00/V000/images/I00001.jpg
    file_name = '{}/{}/{}/images/I{:0>5d}.jpg'.format(
        out_dir, os.path.basename(dname),
        os.path.basename(fn).split('.')[0], i)
    print(file_name[:-10])
    if not os.path.exists(file_name[:-10]):
        os.makedirs(file_name[:-10])
    cv2.imwrite(file_name, frame)


out_dir = 'E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/'
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

for dname in sorted(glob.glob('E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/2.NotDeal-Seq/set00*')):
    for fn in sorted(glob.glob('{}/*.seq'.format(dname))):
        # OpenCV can decode Caltech .seq files directly as a video stream
        cap = cv2.VideoCapture(fn)
        i = 0
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            save_image(dname, fn, i, frame)
            i += 1
        cap.release()
        print(fn)

You may notice that the script generates a rather deep directory tree; this is to stay compatible with the author's source code. I am writing this on 2019/10/18, and I do not know when you will be reading it, so I cannot guarantee it will still match the code you download. In the script above, only the input and output directories need to be changed. After running it, you can see the following under 1.train\set00\V000\images in the output path:

(screenshot: extracted jpg files)
With that, we have successfully converted the seq files into images.
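As a quick sanity check after the conversion, you can count the extracted frames per video directory. The helper below is my own stdlib-only addition (not part of the author's code); it assumes the directory layout produced by the script above:

```python
import glob
import os


def count_extracted(out_dir):
    """Return {'setXX/VYYY': number of extracted jpgs} under out_dir."""
    counts = {}
    for img in glob.glob(os.path.join(out_dir, '*', '*', 'images', '*.jpg')):
        parts = img.replace('\\', '/').split('/')
        # parts[-4] is the set directory, parts[-3] the video directory
        key = '{}/{}'.format(parts[-4], parts[-3])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Comparing these counts against the frame counts of the original seq files tells you whether any frames were dropped.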

Generating data_list_caltech_train.txt

According to the author's code, before generating data_list_caltech_train.pkl we need a data_list_caltech_train.txt file, whose content looks like this:

E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00029.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00059.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00089.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00119.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00149.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00179.jpg,0,0
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00209.jpg,1,1,482,182,15,34
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00239.jpg,1,3,476,166,12,29,447,164,14,34,220,100,10,20
E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/set00\V000\images\I00269.jpg,1,1,532,160,19,44

As you can see, the format is:

[image absolute path],[pos/neg flag],[num of bboxes],[x1],[y1],[width1],[height1],[x2],[y2],[width2],[height2]......
[absolute image path]  [positive/negative flag]  [number of pedestrians]  [top-left corner plus width and height of each pedestrian]

A positive sample is flagged 1, meaning the image contains at least one pedestrian; an image with no pedestrians is a negative sample. Given this, as long as we generate a txt file in this format for our own data, it will be compatible with the author's scripts. So how do we generate this file for the Caltech Pedestrian Detection Benchmark dataset?
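To make the line format concrete, here is a minimal parser for one entry of the list. The function and variable names are mine, not taken from the author's code:

```python
def parse_list_line(line):
    """Parse one line of the data list:
    path,flag,num_bboxes[,x,y,w,h]*  ->  (path, is_positive, bboxes)
    """
    fields = line.strip().split(',')
    path = fields[0]
    is_positive = fields[1] == '1'
    num_bboxes = int(fields[2])
    bboxes = []
    for i in range(num_bboxes):
        # Each bbox is four consecutive integers: x, y, width, height
        x, y, w, h = (int(v) for v in fields[3 + 4 * i: 7 + 4 * i])
        bboxes.append((x, y, w, h))
    return path, is_positive, bboxes
```

For example, the I00209.jpg line above parses to one bbox (482, 182, 15, 34), while a negative line such as "...,0,0" yields an empty bbox list.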

Open the author's pedestrian_detection\data_provider_farm\reformat_caltech.py and modify the paths in it:

# Point this at the freshly downloaded anno_train_1xnew directory;
# note that the old annotations are NOT compatible with this script
annotation_root = 'E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/anno_train_1xnew'

# Root directory of the images, i.e. the directory containing set00
image_root = 'E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/2.Dataset/1.OfficialData/1.Caltech/3.DoDeal-Jpg/1.train/'

# Output path of the generated .txt file
list_file_path = './data_folder/data_list_caltech_train.txt'

Besides these, a few other things may need changing; I trust you can handle that. Once done, at the bottom of the script you can see:

if __name__ == '__main__':
    generate_data_list()
    #show_image()
    dataset_statistics()

Do not comment out generate_data_list(); then simply run the script. My output is shown below, indicating the list was generated successfully:

53
54
55
56
57
58
59
60
61
shorter side based statistics:
[0-10): 1
[10-20): 27
[20-30): 5
longer side based statistics:
[10-20): 1
[20-30): 9
[30-40): 14
[40-50): 7
[50-60): 2
num pos: 21, num neg: 40
total pedestrian: 33
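The shorter-side and longer-side histograms above can also be computed directly from the txt list. The sketch below is my own approximation of the binning, inferred from the printed output rather than taken from the author's dataset_statistics():

```python
import collections


def side_statistics(lines, bin_width=10):
    """Histogram the shorter and longer bbox sides in fixed-width bins.

    Returns (shorter_counts, longer_counts, num_pos, num_neg, num_boxes),
    where the Counter keys are bin lower bounds, e.g. 10 means [10-20).
    """
    shorter = collections.Counter()
    longer = collections.Counter()
    n_pos = n_neg = n_boxes = 0
    for line in lines:
        fields = line.strip().split(',')
        if fields[1] == '0':          # negative sample: no bboxes
            n_neg += 1
            continue
        n_pos += 1
        num = int(fields[2])
        for i in range(num):
            w = int(fields[5 + 4 * i])   # width field of bbox i
            h = int(fields[6 + 4 * i])   # height field of bbox i
            lo, hi = min(w, h), max(w, h)
            shorter[lo // bin_width * bin_width] += 1
            longer[hi // bin_width * bin_width] += 1
            n_boxes += 1
    return shorter, longer, n_pos, n_neg, n_boxes
```

These side statistics matter because LFFD assigns pedestrians to branches by scale, so it is worth checking that your boxes actually fall in the ranges the network covers.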

Generating data_list_caltech_train.pkl

Our final goal is data_list_caltech_train.pkl. Once data_list_caltech_train.txt has been generated, this step is very simple: adjust the paths and run

pedestrian_detection\data_provider_farm\pickle_provider.py

You should then find data_list_caltech_train.pkl under pedestrian_detection\data_provider_farm\data_folder, and the training data is done (the test data follows the same process, so I will not repeat it).
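Conceptually, the pickle step serializes the parsed annotation list so training can reload it in one shot. The sketch below is only an illustration under that assumption; the real pickle_provider.py stores a richer format (it also packs the image data), so this is not a drop-in replacement:

```python
import pickle


def pack_data_list(txt_path, pkl_path):
    """Serialize a txt annotation list into a pickle keyed by sample index.

    Illustrative only: the actual PickleProvider format differs.
    """
    samples = {}
    with open(txt_path) as f:
        for idx, line in enumerate(f):
            fields = line.strip().split(',')
            samples[idx] = {
                'path': fields[0],
                'positive': fields[1] == '1',
                # Flat [x, y, w, h, x, y, w, h, ...] list; empty for negatives
                'bboxes': [int(v) for v in fields[3:]],
            }
    with open(pkl_path, 'wb') as f:
        pickle.dump(samples, f)
    return len(samples)
```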

Training the Network

Note: the errors below occurred while I was running under PyCharm; if you are on Linux you may not hit them. Training the pedestrian detection network uses pedestrian_detection\config_farm\configuration_30_320_20L_4scales_v1.py. The first run fails as follows:

Error 1:

D:\3.Anaconda3\python.exe E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py
2019-10-18 11:53:36,283[INFO]: Preparing before training.
2019-10-18 11:53:36,292[INFO]: Get net symbol successfully.
Traceback (most recent call last):
  File "E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py", line 313, in <module>
    run()
  File "E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py", line 205, in run
    from data_provider_farm.pickle_provider import PickleProvider
  File "..\data_provider_farm\pickle_provider.py", line 12, in <module>
    from text_list_adapter import TextListAdapter
ModuleNotFoundError: No module named 'text_list_adapter'

Add the following near the top of the file (note the extra import sys):

import sys
sys.path.append('../')
sys.path.append('../data_provider_farm')

Error 2:

D:\3.Anaconda3\python.exe E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py
2019-10-18 14:34:47,345[INFO]: Preparing before training.
2019-10-18 14:34:47,351[INFO]: Get net symbol successfully.
Traceback (most recent call last):
  File "E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py", line 313, in <module>
    run()
  File "E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py", line 208, in run
    train_data_provider = PickleProvider(param_trainset_pickle_file_path)
  File "..\data_provider_farm\pickle_provider.py", line 37, in __init__
    self.data = pickle.load(open(pickle_file_path, 'rb'))
FileNotFoundError: [Errno 2] No such file or directory: 'E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm\\../data_provider_farm/data_folder/data_list_caltech_train_source.pkl'

Rename the generated pedestrian_detection\data_provider_farm\data_folder\data_list_caltech_train.pkl to data_list_caltech_train_source.pkl, which is the filename the config expects.

Running again prints the following:

D:\3.Anaconda3\python.exe E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py
2019-10-18 11:56:28,065[INFO]: Preparing before training.
2019-10-18 11:56:28,074[INFO]: Get net symbol successfully.
2019-10-18 11:56:28,081[INFO]: Prepare the data provider for all dataiter threads ---- 
2019-10-18 11:56:28,081[INFO]: Dataset statistics:
	21 positive images;	40 negative images;	61 images in total.
2019-10-18 11:56:28,082[INFO]: MXNet Version: 1.5.0
2019-10-18 11:56:28,083[INFO]: Training settings:-----------------------------------------------------------------
2019-10-18 11:56:28,083[INFO]: param_log_mode:w
2019-10-18 11:56:28,083[INFO]: param_log_file_path:../log/configuration_30_320_20L_4scales_v1_2019-10-18-11-56-28.log
2019-10-18 11:56:28,083[INFO]: param_trainset_pickle_file_path:E:/1.PaidOn/6.Detection/1.pedestrian/1.LFFD/A-Light-and-Fast-Face-Detector-for-Edge-Devices-master/pedestrian_detection/config_farm\../data_provider_farm/data_folder/data_list_caltech_train_source.pkl
2019-10-18 11:56:28,083[INFO]: param_valset_pickle_file_path:
2019-10-18 11:56:28,083[INFO]: param_train_batch_size:32
2019-10-18 11:56:28,083[INFO]: param_neg_image_ratio:0.1
2019-10-18 11:56:28,083[INFO]: param_GPU_idx_list:[0]
2019-10-18 11:56:28,083[INFO]: param_net_input_height:480
2019-10-18 11:56:28,083[INFO]: param_net_input_width:480
......
......
[11:56:31] c:\jenkins\workspace\mxnet-tag\mxnet\src\operator\nn\cudnn\./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (set the environment variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
2019-10-18 11:56:36,479[INFO]: Iter[10] -- Time elapsed: 4.8 s. Speed: 66.7 images/s.
2019-10-18 11:56:36,480[INFO]: CE_loss_score_0: --> 6578.8734
2019-10-18 11:56:36,480[INFO]: SE_loss_bbox_0: --> 4578.4687
2019-10-18 11:56:36,480[INFO]: CE_loss_score_1: --> 6501.4080
2019-10-18 11:56:36,480[INFO]: SE_loss_bbox_1: --> 5648.2725
2019-10-18 11:56:36,480[INFO]: CE_loss_score_2: --> 6709.0279
2019-10-18 11:56:36,480[INFO]: SE_loss_bbox_2: --> 2658.6539
2019-10-18 11:56:36,480[INFO]: CE_loss_score_3: --> 6363.4922
2019-10-18 11:56:36,480[INFO]: SE_loss_bbox_3: --> 0.0000
2019-10-18 11:56:41,776[INFO]: Iter[20] -- Time elapsed: 5.3 s. Speed: 60.4 images/s.
2019-10-18 11:56:41,777[INFO]: CE_loss_score_0: --> 5319.5090
2019-10-18 11:56:41,777[INFO]: SE_loss_bbox_0: --> 4811.2159
2019-10-18 11:56:41,777[INFO]: CE_loss_score_1: --> 5234.2500
2019-10-18 11:56:41,777[INFO]: SE_loss_bbox_1: --> 4698.7990
2019-10-18 11:56:41,777[INFO]: CE_loss_score_2: --> 5313.3269
2019-10-18 11:56:41,777[INFO]: SE_loss_bbox_2: --> 3125.8385
2019-10-18 11:56:41,777[INFO]: CE_loss_score_3: --> 4234.6200

This means training is running successfully. Starting from the next post, I will walk through the source code. Stay tuned!
