CVPR-2019 workshop

Code: https://github.com/SoftwareGift/FeatherNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019

Contents

1 Background and Motivation
2 Related Work
3 Advantages / Contributions
4 Method
- 4.1 FeatherNet Architecture Design
- 4.2 Multi-Modal Fusion Method
5 Experiments
6 Conclusion（own）

1 Background and Motivation

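Face liveness detection is plainly essential to face recognition systems. As the technology matures, face recognition is increasingly deployed in mobile or embedded environments, which demands not only high accuracy from the liveness detection algorithm but also high speed.

To address the issues of computational and storage costs, the authors therefore design FeatherNets, a family of lightweight networks that aims to detect face presentation attacks both quickly and accurately.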

2 Related Work

Traditional
Hand-crafted feature detection (LBP / SIFT / SURF / HOG / DoG) + SVM / Random Forest binary classification (live or not)
CNN based
- RGB frame
- RGB frame + depth or rPPG（remote photoplethysmography）
- RGB + depth + IR（infrared）

The authors of this paper use only depth + IR information.

3 Advantages / Contributions

third place in CVPR 2019 ChaLearn Face Anti-spoofing attack detection
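Proposes a lightweight face anti-spoofing architecture, FeatherNet, with a Streaming Module (which replaces Global Average Pooling, GAP)
Uses an ensemble + cascade approach to model fusion
Collects and releases a new dataset, the Multi-Modal Face Dataset (MMFD), used for data augmentation (it gives the depth maps a clearer sense of depth layers)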

4 Method

1）Streaming module

2）ensemble + cascade（fusion procedure）

4.1 FeatherNet Architecture Design

1）The Weakness of GAP for Face Task

GAP average-pools H×W×C feature maps down to 1×1×C.

《Understanding the effective receptive field in deep convolutional neural networks》（NIPS-2016）

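In face applications the input image contains only the face, so points at the center should matter more than points at the edge. RF1 has a larger effective receptive field than RF2; in other words, different spatial positions are not equally important. GAP's equal importance (like using per-capita GDP to measure each individual's wealth) is therefore a poor fit.

A fully connected layer avoids GAP's problem, but its computational cost is too high, which runs against the authors' light-weight goal, and its large parameter count also makes it prone to overfitting.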

2）Streaming Module

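GAP's equal importance does not suit face tasks. FC does give different units different importance, but it is too heavy.

As a compromise, the authors use a DWConv (depth-wise convolution) with stride greater than 1 to reduce the resolution (7×7×C -> 4×4×64, with each position weighted by the convolution weights), then flatten the result into a feature vector.

This keeps the importance of positions unequal without making the computation too expensive.

The Streaming Module borrows from 【GDConv】《MobileFaceNets：Efficient CNNs for Accurate RealTime Face Verification on Mobile Devices》.

The difference is that the kernel size of GDConv (Global Depthwise Convolution) equals the size of the input feature map.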
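
The three ways of collapsing the final feature map can be compared in a few lines. This is a minimal NumPy sketch, not the authors' implementation: the 3×3 kernel and padding of 1 are assumptions chosen so that a stride-2 depthwise convolution maps 7×7 to 4×4 (the notes above only say the stride is greater than 1), and random weights stand in for learned ones.

```python
import numpy as np

def depthwise_conv2d(x, weights, stride=1, padding=0):
    """Depthwise convolution: channel c of x is filtered only by weights[c].

    x: (C, H, W) feature map; weights: (C, k, k), one kernel per channel.
    """
    c, h, w = x.shape
    k = weights.shape[-1]
    x = np.pad(x, ((0, 0), (padding, padding), (padding, padding)))
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1
    out = np.zeros((c, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i * stride:i * stride + k, j * stride:j * stride + k]
            out[:, i, j] = (patch * weights).sum(axis=(1, 2))
    return out

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(64, 7, 7))  # final-stage map, C = 64

# GAP: every spatial position gets the same weight 1/49.
gap_vector = feature_map.mean(axis=(1, 2))

# Streaming-style head (assumed 3x3 kernel, stride 2, padding 1):
# positions are weighted by the kernel, then the 4x4x64 map is flattened.
stream_weights = rng.normal(size=(64, 3, 3))
streamed = depthwise_conv2d(feature_map, stream_weights, stride=2, padding=1)
stream_vector = streamed.reshape(-1)

# GDConv-style head: the kernel covers the whole 7x7 map, giving 1x1x64.
gd_weights = rng.normal(size=(64, 7, 7))
gd_vector = depthwise_conv2d(feature_map, gd_weights).reshape(-1)

print(gap_vector.shape, stream_vector.shape, gd_vector.shape)
```

Under these assumptions the streaming-style head yields a 1024-dimensional vector, while GAP and GDConv both yield 64 values. GAP is the special case of GDConv in which every kernel entry is fixed to 1/49.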

3）Network Architecture Detail

Table 1 and Figure 3

The only difference between FeatherNet A and FeatherNet B is the block type: replacing Block B in FeatherNet B with Block C gives FeatherNet A.

Note that an SE attention module (【SENet】《Squeeze-and-Excitation Networks》) is also inserted after each stage.

The loss is focal loss (【Focal Loss】《Focal Loss for Dense Object Detection》).
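
For reference, the binary form of focal loss can be sketched as follows. The defaults gamma = 2 and alpha = 0.25 are the values highlighted in the Focal Loss paper; whether FeatherNets trains with these exact values is an assumption here, and `binary_focal_loss` is an illustrative name rather than a function from the authors' repository.

```python
import numpy as np

def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-sample focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs: predicted probability of the positive (real) class.
    labels: 1 for real, 0 for fake.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    labels = np.asarray(labels)
    # p_t is the probability the model assigns to the true class.
    p_t = np.where(labels == 1, probs, 1 - probs)
    alpha_t = np.where(labels == 1, alpha, 1 - alpha)
    # (1 - p_t)**gamma shrinks the loss of samples that are already easy.
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

easy = binary_focal_loss([0.9], [1])[0]  # confident and correct
hard = binary_focal_loss([0.1], [1])[0]  # confident and wrong
print(easy, hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.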

4.2 Multi-Modal Fusion Method

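Stage 1: an ensemble of several models (including FeatherNet) classifies the depth map. A score close to 1 means real, a score close to 0 means fake, and ambiguous samples are passed on to stage 2.

Stage 2: FeatherNetB classifies the IR image to settle the cases that stage 1 found ambiguous.

This cascade was used only for the competition; the experiments described later use only the depth image.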
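
The two-stage decision can be sketched as below. This is a hypothetical illustration: the thresholds `low` and `high`, the 0.5 cut-off for the IR score, and the use of a plain average to combine the ensemble are all assumptions, since the notes above do not give these details.

```python
def cascade_decision(depth_scores, ir_score_fn, low=0.3, high=0.7):
    """Return "real" or "fake" using a depth ensemble and an IR fallback.

    depth_scores: liveness scores in [0, 1] from the stage-1 depth models.
    ir_score_fn: zero-argument callable returning the stage-2 IR score;
                 it is only called when stage 1 is ambiguous.
    low, high: hypothetical thresholds bounding the ambiguous region.
    """
    # Stage 1: combine the ensemble (averaging is an assumption).
    stage1_score = sum(depth_scores) / len(depth_scores)
    if stage1_score >= high:
        return "real"
    if stage1_score <= low:
        return "fake"
    # Stage 2: the IR model decides the ambiguous samples.
    return "real" if ir_score_fn() >= 0.5 else "fake"

print(cascade_decision([0.95, 0.90], lambda: 0.0))  # confident in stage 1
print(cascade_decision([0.55, 0.45], lambda: 0.8))  # falls through to IR
```

Passing the IR model as a callable means the second network only runs on the ambiguous samples, which matches the point of a cascade.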

5 Experiments

5.1 Datasets

1）CASIA-SURF（【CASIA-SURF】《A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing》）

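Evaluation metrics, the same as in 【CASIA-SURF】:

APCER: attack presentation classification error rate (the error rate on fake samples)

BPCER: bona fide presentation classification error rate (the error rate on real samples)

ACER: average classification error rate (the mean of APCER and BPCER)

HTER: half total error rate (half the sum of the error rates on real and on fake faces, the same formula as ACER)

ROC: receiver operating characteristic (TPR on the vertical axis, FPR on the horizontal axis)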
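
The first three metrics follow directly from their definitions. The sketch below is a simplification: it pools all attack types together and uses a fixed threshold of 0.5, both of which are assumptions, and `spoof_metrics` is an illustrative name.

```python
import numpy as np

def spoof_metrics(scores, labels, threshold=0.5):
    """Compute (APCER, BPCER, ACER) at a fixed decision threshold.

    scores: liveness scores, higher means more likely real.
    labels: 1 for bona fide (real), 0 for attack (fake).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    predicted_real = scores >= threshold
    # APCER: fraction of attack samples wrongly accepted as real.
    apcer = np.mean(predicted_real[labels == 0])
    # BPCER: fraction of real samples wrongly rejected as fake.
    bpcer = np.mean(~predicted_real[labels == 1])
    acer = (apcer + bpcer) / 2
    return apcer, bpcer, acer

scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
apcer, bpcer, acer = spoof_metrics(scores, labels)
print(apcer, bpcer, acer)
```

In this toy example one of three real samples is rejected and one of three attacks is accepted, so all three metrics come out to one third.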

2）MMFD dataset

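Collected by the authors themselves: 15 subjects with 15415 real samples and 28438 fake samples.

It also has three modalities. In addition to the 6 attack types in CASIA-SURF, the authors collected two new attack types:

Attack A: flat face photo + eyes and mouth cut
Attack B: curved face photo + eyes and mouth cut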

5.2 Implementation Detail

Depth images captured by different devices do differ, as shown in Figure 6.

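The first row shows depth maps of real faces in CASIA-SURF, where the facial structure is hard to make out by eye.

The second row shows depth maps of real faces in MMFD, which are very clear.

To reduce this discrepancy, the authors apply the MMFD scale operation to all real faces; the procedure is given in Algorithm 1.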

The result after scaling is shown in the third row of Figure 6.

5.3 Result Analysis

1）How useful is MMFD dataset

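Training on the depth data of MMFD and testing on CASIA-SURF also works reasonably well, and even beats the baseline that takes all three modalities as input, which is interesting.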

2）Compare with other network performance

Only depth is used as input.

3）Ablation experiments

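Why AP-down in BlockB (Fig. 3 (b))
Model1 vs Model2: using it gives somewhat better results
Why not use FC layer
Model1 vs Model3: once the authors' streaming module is in place, adding an FC layer makes the results worse
Why not use GAP layer
Model3 vs Model4: the streaming module is better than GAP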

5.4 Competition details

All of the experiments above use only the depth maps in the dataset; they do not yet involve the depth + IR ensemble + cascade strategy of Figure 4.

6 Conclusion（own）

The paper uses too many small tricks, which makes it feel cluttered and read more like a technical report. Did the DWConv formulas really need to be written in such a roundabout, hard-to-follow way?
[31] suggested that increasing average pooling layer works well and impacts the computational cost little. (This is the design rationale for Block B.) [31] is 《Bag of tricks for image classification with convolutional neural networks》.
a fast downsampling strategy[33] is used at the beginning of our network which makes the feature map size decrease rapidly and without much parameters. [33] is 【FD-MobileNet】《FD-MobileNet：Improved MobileNet with a Fast Downsampling Strategy》.

【FeatherNets】《FeatherNets：Convolutional Neural Networks as Light as Feather for Face Anti-spoofing》
