When it comes to vision-only autonomous driving, the first name that comes to mind is Tesla. As early as 2021, Tesla had already deployed a vision-only BEV (bird's-eye view) detection solution, and the results were very good.
![d4dccd0fa71339d21cfc723bc4c50ba0.png](https://img-blog.csdnimg.cn/img_convert/d4dccd0fa71339d21cfc723bc4c50ba0.png)
Attentive readers may have noticed that the core component of this BEV solution, the one that converts images from camera space to BEV space, is the Transformer.
The Transformer comes from natural language processing, where it was first applied to machine translation. It was later found to be very effective in computer vision as well, surpassing CNNs on major leaderboards.
![68c05c5aaec0d998659967b5b15ab75e.png](https://img-blog.csdnimg.cn/img_convert/68c05c5aaec0d998659967b5b15ab75e.png)
In object detection, the visual Transformer can handle not only 2D and 3D detection but also multi-modal detection, and it performs very well on detection from the BEV perspective too.
![ccd2e9cc5841f2bfff09c3630867ad73.png](https://img-blog.csdnimg.cn/img_convert/ccd2e9cc5841f2bfff09c3630867ad73.png)
As a result, a solid grasp of Transformer theory and engineering practice has become a requirement at companies hiring algorithm engineers, and it is a big plus on a resume.
However, there are three difficulties in mastering Transformer-based object detection algorithms:
1. Understanding the theory behind the Transformer, such as the self-attention mechanism, positional embedding, and object queries. The material available online is scattered and unsystematic, which makes it hard to build a deep, integrated understanding through self-study.
2. Grasping the ideas and innovations of Transformer-based object detection algorithms. Some Transformer papers introduce many new concepts and are not easy to read, so even after finishing a paper you may still not understand the details of the algorithm.
![246868e916e8e70e1f182da1e402491c.png](https://img-blog.csdnimg.cn/img_convert/246868e916e8e70e1f182da1e402491c.png)
3. Reading the code. Transformer code is hard to follow because the mechanism works quite differently from a CNN, so it takes a lot of effort to fully understand the code and apply it in practice.
![c2c0e45955c3f466c4d4b87697b109fc.png](https://img-blog.csdnimg.cn/img_convert/c2c0e45955c3f466c4d4b87697b109fc.png)
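As a small taste of the first concept above, scaled dot-product self-attention can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from the course: the function names `softmax` and `self_attention` are hypothetical, and it assumes identity projections for queries, keys, and values, whereas a real Transformer learns separate weight matrices for each.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the input tokens themselves
    (identity projections); a trained model would use learned matrices.

    tokens: list of n vectors, each of dimension d.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this query with every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of the value vectors.
    outputs = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of all input tokens. That global mixing is the main difference from a CNN, which only combines values within a local window.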
So how should you go about learning Transformer-based object detection?
"Yu Yan", a co-lecturer at the 3D Vision Workshop, has carefully prepared the course "Visual Transformer in Target Detection" to help students overcome the difficulties above.
It explains the fundamentals of the visual Transformer and a range of classic Transformer-based object detection algorithms in detail, and it also includes code walkthroughs and hands-on sessions, so that students can understand the theory and apply it in practice.
Practical part
![dd65b2817ec6f1e9136bbe1b84bff29a.png](https://img-blog.csdnimg.cn/img_convert/dd65b2817ec6f1e9136bbe1b84bff29a.png)
![0550314c2e3f14d54aadb3b684c3e018.png](https://img-blog.csdnimg.cn/img_convert/0550314c2e3f14d54aadb3b684c3e018.png)
![50280d48c392fcbd47a7e39b1013543c.jpeg](https://img-blog.csdnimg.cn/img_convert/50280d48c392fcbd47a7e39b1013543c.jpeg)
![14574cd4e03284505f8e0e56002d2606.jpeg](https://img-blog.csdnimg.cn/img_convert/14574cd4e03284505f8e0e56002d2606.jpeg)
![109c1da397c4190ffb1f3b19e8f6b95a.jpeg](https://img-blog.csdnimg.cn/img_convert/109c1da397c4190ffb1f3b19e8f6b95a.jpeg)
![80aa11113d22bb0332fbd88bef1fef77.jpeg](https://img-blog.csdnimg.cn/img_convert/80aa11113d22bb0332fbd88bef1fef77.jpeg)
![7963577138bd4f97c2c429917f2b8bf7.png](https://img-blog.csdnimg.cn/img_convert/7963577138bd4f97c2c429917f2b8bf7.png)
![924b0722a4da5eb655e035343f8cd367.png](https://img-blog.csdnimg.cn/img_convert/924b0722a4da5eb655e035343f8cd367.png)
![337dfdaf97537d35cf40bf24e4e1c150.png](https://img-blog.csdnimg.cn/img_convert/337dfdaf97537d35cf40bf24e4e1c150.png)
![f85eb2b6dc88da05d0247af1cc2b2370.jpeg](https://img-blog.csdnimg.cn/img_convert/f85eb2b6dc88da05d0247af1cc2b2370.jpeg)
![e25b167e9dc2959ccd7219d5ad6b05dc.jpeg](https://img-blog.csdnimg.cn/img_convert/e25b167e9dc2959ccd7219d5ad6b05dc.jpeg)
![5044b62babbaa52671dc9673beadcdb1.jpeg](https://img-blog.csdnimg.cn/img_convert/5044b62babbaa52671dc9673beadcdb1.jpeg)
Class start time
The course starts at 8 pm on Friday, July 28, 2023, and one new chapter will be released each week.
Course Q&A
Questions about this course are mainly answered in the course's Goose Circle. Students who run into problems while studying can ask there at any time.
![27c44dca3982b692c79710a5caf90368.png](https://img-blog.csdnimg.cn/img_convert/27c44dca3982b692c79710a5caf90368.png)
![f5321f706709dd8043eb00af608a7bae.jpeg](https://img-blog.csdnimg.cn/img_convert/f5321f706709dd8043eb00af608a7bae.jpeg)