2017-CVPR-Spindle Net: Person Re-identification with Human Body Region Guided Feature - 代码天地

2017-CVPR-Spindle Net: Person Re-identification with Human Body Region Guided Feature

其他 2018-11-22 20:56:40 阅读次数: 0

转载自：https://blog.csdn.net/weixin_41427758/article/details/82910295

论文地址：http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhao_Spindle_Net_Person_CVPR_2017_paper.pdf

Motivation

由检测算法以及姿势变化引起的行人身体不对准问题会为不同图像间的特征匹配造成严重的影响 --> 怎么解决这个问题？

Contribution

首次在ReID中考虑人体结构信息：
- 帮助对齐不同图像中人体区域特征
- 增强局部细节信息的表示能力
SpindleNet
- a multi-stage ROI pooling framework --> 不同语义层次的特征在不同阶段进行提取
- a tree-structured fusion network + competitive strategy --> 合并不同语义层次的特征
真实监控场景的ReID数据集–SenseReID来评价算法的性能；本文的方法在大多数据集上达到了SOTA的方法

1. Introduction

ReID定义以及用途
- 跨摄像头或时间片段检索行人
- 主要在安防场景

ReID常见的挑战：
- 由于检测算法以及姿势变化，不同图像之间的行人身体存在不对准问题，如上图(a)
- 如何捕获易于区分的细节信息，如图(b)，头部区域对于两个图片有更强的判别力
- 遮挡问题：如何在比较过程中，降低遮挡区域的特征重要性
a tree-structured feature fusion strategy + a competitive strategy

2. Related Work

特征学习
度量学习
Video Based

3. Body Region Proposal Network

通过the Region Proposal Net- work (RPN)来产生7个身体区域
- 关键点定位
- 身体区域产生

1.step:定位输入图片的14个关键点

借鉴了CPM，利用sequential framework以由粗到细的方式来生成响应图，全卷积网络 --> 14个response map
- 在每个阶段，CNN提取特征并结合上一个阶段的响应图来refine关键点估计的位置
- 对CPM进行修改来降低其复杂度：
  - 共享前几层的卷积参数
  - 用s=2的卷积代替池化层
  - 减小了输入大小、阶段数、卷积层的通道数
14个关节点可以通过最大化特征图上的值得到：
Pi=[xi,yi]=argmaxx∈[1,X],y∈[1,Y]Fi(x,y)

Pi=[xi,yi]=argx∈[1,X],y∈[1,Y]maxFi(x,y)

2.step:产生7个身体区域

根据14个关节点生成3个宏观区域(头-肩，上体、下体)、4个微观区域(双腿、双臂)，具体可参考上图
RPN的训练：
- the MPII human pose dataset
- a Gaussian kernel
- Loss function：L2 distance

4. Body Region Guided Spindle Net

两个主要部分：
- the Feature Extraction Network (FEN)：输入为行人图片以及候选区域 ==> 计算全局特征与子区域特征
- the Feature Fusion Network (FFN)：合并不同区域的特征向量

4.1. Feature Extraction Network (FEN)

FEN由three convolution stages (FEN-C1, FEN-C2, FEN-C3)、two ROI pooling stages (FEN-P1, FEN-P2)：
- 1个全图 + 7个身体子部分每个产生256维向量
- sub-region的特征从全图的特征上在不同阶段crop得到
- 在FCN-C3后通过one global pooling layer and one inner product layer将输出转换为256维向量
下图表明了子区域特征的有效性

(b)、(e)为经过FEN-C1的全局特征，由该特征计算非对准的相同人的距离将远，相似人两个人距离较近
(c )、(f)为FEN-P1后的特征，利用该特征计算相似性对于非对准的相同人距离减小，相似人的距离增大

4.2. Feature Fusion Network (FFN)

FFN：将8个特征向量合并成为一个可以很好描述行人图片的256向量
fusion unit：进行特征融合过程，输入为大小相同的两个或多个特征向量，输出为合并后的特征向量
- The feature competition and selection process：element-wise maximization operation
- The feature transformation process:a inner product layer ==> 对应caffe里的全连接层
A tree-structured fusion strategy
- 根据子区域的不同语义层次与关系在不同的阶段将特征向量进行合并
- 双腿、双臂 --> 双腿结果+下体、双臂结果 + 上体 --> 上阶段结果 + 头-肩 --> 与全图的特征进行拼接并转换成256维向量
对头-肩、上体、下体的融合

4.3. Training Details

progressive strategy：先训练FEN、再训练FFN；权重全部随机初始化
FEN训练步骤：
- 先训练输入为全图
- 固定FEN-C1参数，训练三个宏观分支
- 固定FEN-C1、FEN-C2，训练四个微观分支
FFN由FEN产生的特征向量进行训练，Softmax

5. Experiments

5.1. Datasets

实验数据集以及划分策略如下表：

5.2. Comparison Results

在大多数据集上取得了SOTA方法

6. Investigations on Spindle Net

6.1. Investigations on FEN

ROI pooling得到宏观区域以及微观区域的最佳位置

由上图可以看到：
- Marco最佳为FEN-C1：macro包含更复杂的身份信息，应该更早的pool out来得到更多独立的学习参数
- Micro最佳为FEN-C2
全图特征与在不同阶段提取宏观及微观特征的组合实验

6.2. Investigations on FFN

测试每种特征的效果:全图 > 宏观 > 微观

树型融合策略与其他融合策略的对比：

7. Conclusion

本文提出的Spindle Net:
- a multi-stage ROI pooling network:分开提取不同身体区域特征
- tree-structured fusion network：合并不同身体区域特征
不同层次的身体特征有助于对齐不同行人图片的身体区域
通过实验验证了feature com- petition and fusion network的有效性
本文的方法在多个数据集上取得了SOTA的方法

猜你喜欢

转载自blog.csdn.net/jialibang/article/details/84315709

2017-CVPR-Spindle Net: Person Re-identification with Human Body Region Guided Feature

论文阅读笔记（三十九）【CVPR2017】：Spindle Net Person Re-identiﬁcation with Human Body Region Guided Feature Decomposition and Fusion

2018 CVPR-Human Semantic Parsing for Person Re-identification

Human Semantic Parsing for Person Re-identification

CVPR2018论文翻译 Human Semantic Parsing for Person Re-identification

CVPR19_Patch-based Discriminative Feature Learning for Unsupervised Person Re-identification

AlignedReID: Surpassing Human-Level Performance in Person Re-Identification

【论文阅读】Batch Feature Erasing for Person Re-identification and Beyond

2016 CVPR-Learning Deep Feature Representations with Domain Guided Dropout for Person Re-ID

Re-ID：AlignedReID: Surpassing Human-Level Performance in Person Re-Identification 论文解析

Unsupervised Salience Learning for Person Re-identification(CVPR2013)

2018-CVPR-Harmonious Attention Network for Person Re-Identification

Mask-guided Contrastive Attention Model for Person Re-Identification学习笔记

Mask-guided Contrastive Attention Model for Person Re-Identification 详解

Instance-Guided Context Rendering for Cross-Domain Person Re-Identification

Hard-sample Guided Hybrid Contrast Learning for Unsupervised Person Re-Identification

2022 TIP: Cluster-guided Asymmetric Contrastive Learning for Unsupervised Person Re-Identification

《AlignedReID: Surpassing Human-Level Performance in Person Re-Identification》论文解读

论文笔记---AlignedReID: Surpassing Human-Level Performance in Person Re-Identification

《Human Semantic Parsing for Person Re-identification》论文阅读之SPReID

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification笔记

Deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-free Approach笔记

Deep Spatial Feature Reconstruction for Partial Person Re-identification: Freestyle Approach（CV18）

deep Spatial Feature Reconstruction for Partial Person Re-identification: Alignment-Free Approach

PartialReID-----Deep Spatial Feature Reconstruction for Partial Person Re-identification: Freestyle Approach（CV18）

Patch-based Discriminative Feature Learning for Unsupervised Person Re-identification

Towards Rich Feature Discovery with Class Activation Maps Augmentation for Person Re-Identification

论文阅读之Cross-modality Person re-identification with Shared-Specific Feature Transfer(上)

Person Re-identification：SPReID

2018 CVPR-Person Transfer GAN to Bridge Domain Gap for Person Re-Identification

今日推荐

开源日报 | Chrome内置Gemini的意义不在于Gemini；中国AI追随之路的五大误区；ECharts创始人“下海”养鱼；谷歌I/O开发者大会什么都有，只是没有惊喜

微软回应中国区AI团队“打包赴美”传闻

基于大语言模型的开源知识库问答系统 MaxKB GitHub Star 数量突破 5,000 个！

美国拟限制 AI 大模型出口中国和俄罗斯

苹果将与 OpenAI 达成协议，将 ChatGPT 应用于 iPhone

openKylin 社区生态委员会第六次会议圆满召开

阿里云正式发布通义千问 2.5

Python 3.13 发布首个 Beta：实验性自由线程模式和 JIT、改进交互式解释器

Stack Overflow 拿我的代码去训练 AI 大模型，还封了我的账号

Pop!_OS 的 COSMIC 桌面完成 App Store 上架工作

《2024 年一季度互联网投融资运行情况》研究报告

报告：Django 仍然是 74% 开发者的首选

周排行

返回指定时间格式

fopen函数中的mode参数

Java 单例模式探讨

Flex remoteobject工作原理探讨

寻找mplayer的便捷安装方法

30天了解30种技术系列---(26)MySQL自动化运维工具Inception

关于Jboss/Tomcat/Jetty的JNDI定义123

程序减肥，strip，eu-strip 及其符号表

AsyncTask、View.post(Runnable)、ViewTreeObserver三种方式总结frame animation自动启动

Json和Bean的互相转换

每日归档

更多

2024-05-15(24)

2024-05-14(0)

2024-05-13(18)

2024-05-12(0)

2024-05-11(38)

2024-05-10(38)

2024-05-09(35)

2024-05-08(42)

2024-05-07(14)

2024-05-06(40)