Knowledge Distillation and Student-Teacher Learning for Visual Intelligence

Internet of Things 2023-08-22 17:58:03

This is the fourth article in a series on knowledge distillation surveys. It is a translation of Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks.

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

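Abstract
1 Introduction
2 What Is KD and Why Care About It?
3 Theoretical Analysis of KD
4 KD Based on the Number of Teachers
- 4.1 Distillation from One Teacher
  - 4.1.1 Knowledge from Logits
  - 4.1.2 Knowledge from Intermediate Layers
- 4.2 Distillation from Multiple Teachers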
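5 Distillation Based on Data Format
- 5.1 Data-Free Distillation
- 5.2 Distillation with a Few Data Samples
- 5.3 Cross-Modal Distillation
6 Online and Teacher-Free Distillation
- 6.1 Online Distillation
- 6.2 Teacher-Free Distillation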
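7 Label-Required and Label-Free Distillation
- 7.1 Label-Required Distillation
  - 7.1.1 KD with Original Labels
  - 7.1.2 KD with Pseudo Labels
- 7.2 Label-Free Distillation
  - 7.2.1 KD with Dark Knowledge
  - 7.2.2 Creating Meta Knowledge
- 7.3 Potentials and Challenges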
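8 KD with Novel Learning Metrics
- 8.1 Distillation via Adversarial Learning
- 8.2 Distillation with Graph Representations
  - 8.2.1 Notation and Definitions
  - 8.2.2 Graph-Based Distillation
- 8.3 Distillation for Semi-Supervised and Self-Supervised Learning
- 8.4 Few-Shot Learning
  - 8.4.1 What Is Challenging?
- 8.5 Incremental Learning
- 8.6 Reinforcement Learning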
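9 Applications for Visual Intelligence
- 9.1 Semantic and Motion Segmentation
- 9.2 KD for Visual Detection and Tracking
- 9.3 Domain Adaptation
  - 9.3.1 Semi-Supervised DA
  - 9.3.2 Unsupervised DA
- 9.4 Depth and Scene Flow Estimation
- 9.5 Image Translation
- 9.6 KD for Video Understanding
  - 9.6.1 Video Classification and Recognition
  - 9.6.2 Video Captioning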
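10 Discussion
- 10.1 Are Bigger Models Better Teachers?
- 10.2 Is a Pretrained Teacher Important?
- 10.3 Is Born-Again Self-Distillation Better?
- 10.4 Single Teacher vs. Multiple Teachers
- 10.5 Is Data-Free Distillation Effective Enough?
- 10.6 Logits vs. Features
- 10.7 Interpretability of KD
- 10.8 Network Architecture vs. Effectiveness of KD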
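11 New Outlooks and Perspectives
- 11.1 Potential of NAS
- 11.2 Potential of GNN
- 11.3 Non-Euclidean Distillation Measures
- 11.4 Better Feature Representations
- 11.5 A More Constructive Theoretical Analysis
- 11.6 Potentials for Special Vision Problems
- 11.7 Integration of Vision, Speech, and NLP
12 Conclusion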

Abstract

1 Introduction

2 What Is KD and Why Care About It?

3 Theoretical Analysis of KD

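4 KD Based on the Number of Teachers

4.1 Distillation from One Teacher

4.1.1 Knowledge from Logits

4.1.2 Knowledge from Intermediate Layers

4.2 Distillation from Multiple Teachers

4.2.1 Distillation from an Ensemble of Logits

4.2.2 Distillation from an Ensemble of Features

4.2.3 Distillation by Unifying Data Sources

4.2.4 From a Single Teacher to Multiple Sub-Teachers

4.2.5 Customizing a Student from Heterogeneous Teachers

4.2.6 Mutual Learning with Peers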

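5 Distillation Based on Data Format

5.1 Data-Free Distillation

5.1.1 Distillation Based on Metadata

5.1.2 Distillation Based on Class Similarities

5.1.3 Distillation Using a Generator

5.1.4 Open Challenges for Data-Free Distillation

5.2 Distillation with a Few Data Samples

5.2.1 Distillation via Pseudo Examples

5.2.2 Distillation via Layer-Wise Estimation

5.2.3 Challenges and Potentials

5.3 Cross-Modal Distillation

5.3.1 Supervised Cross-Modal Distillation

5.3.2 Unsupervised Cross-Modal Distillation

5.3.3 Learning from One Teacher

5.3.4 Learning from Multiple Teachers

5.3.5 Potentials and Open Challenges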

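6 Online and Teacher-Free Distillation

6.1 Online Distillation

6.1.1 Individual Student Peers

6.1.2 Sharing Blocks Among Student Peers

6.1.3 Ensemble of Student Peers

6.1.4 Summary and Open Challenges

6.2 Teacher-Free Distillation

6.2.1 Born-Again Distillation

6.2.2 Distillation via Deep Supervision

6.2.3 Distillation Based on Data Augmentation

6.2.4 Distillation with Architecture Transformation

6.2.5 Summary and Open Challenges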

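7 Label-Required and Label-Free Distillation

7.1 Label-Required Distillation

7.1.1 KD with Original Labels

7.1.2 KD with Pseudo Labels

7.2 Label-Free Distillation

7.2.1 KD with Dark Knowledge

7.2.2 Creating Meta Knowledge

7.3 Potentials and Challenges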

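8 KD with Novel Learning Metrics

8.1 Distillation via Adversarial Learning

8.1.1 Basic Formulation of GAN in KD

8.1.2 How Does GAN Help KD?

8.1.3 Summary and Open Challenges

8.2 Distillation with Graph Representations

8.2.1 Notation and Definitions

8.2.2 Graph-Based Distillation

8.3 Distillation for Semi-Supervised and Self-Supervised Learning

8.3.1 Semi-Supervised Learning

8.3.2 Self-Supervised Learning

8.3.3 Potentials and Open Challenges

8.4 Few-Shot Learning

8.4.1 What Is Challenging?

8.5 Incremental Learning

8.5.1 Distillation from a Single Teacher

8.5.2 Distillation from Multiple Teachers

8.5.3 Open Challenges

8.6 Reinforcement Learning

8.6.1 Collaborative Distillation

8.6.2 Model Compression with RL-Based Distillation

8.6.3 Random Network Distillation

8.6.4 Potentials of RL-Based Distillation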

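9 Applications for Visual Intelligence

9.1 Semantic and Motion Segmentation

9.2 KD for Visual Detection and Tracking

9.2.1 Generic Object Detection

9.2.2 Pedestrian Detection

9.2.3 Face Detection

9.2.4 Vehicle Detection and Driving Learning

9.2.5 Pose Detection

9.3 Domain Adaptation

9.3.1 Semi-Supervised DA

9.3.2 Unsupervised DA

9.4 Depth and Scene Flow Estimation

9.5 Image Translation

9.6 KD for Video Understanding

9.6.1 Video Classification and Recognition

9.6.2 Video Captioning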

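10 Discussion

10.1 Are Bigger Models Better Teachers?

10.2 Is a Pretrained Teacher Important?

10.3 Is Born-Again Self-Distillation Better?

10.4 Single Teacher vs. Multiple Teachers

10.5 Is Data-Free Distillation Effective Enough?

10.6 Logits vs. Features

10.7 Interpretability of KD

10.8 Network Architecture vs. Effectiveness of KD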

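11 New Outlooks and Perspectives

11.1 Potential of NAS

11.2 Potential of GNN

11.3 Non-Euclidean Distillation Measures

11.4 Better Feature Representations

11.5 A More Constructive Theoretical Analysis

11.6 Potentials for Special Vision Problems

11.7 Integration of Vision, Speech, and NLP

12 Conclusion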

Reposted from blog.csdn.net/c_cpp_csharp/article/details/130994202