Introduction to the CVPR 2024 Autonomous Driving Challenge! Six tracks including end-to-end, world model, Occupancy, and mapless driving

The public version of the CVPR 2024 Autonomous Driving Challenge is now soliciting comments! This round covers six areas: vision-language models, simulation, end-to-end driving, world models, Occupancy and Flow, and mapless driving. If you have any ideas, you can submit them via the comment function of the documents below until January 15th.

For details, please visit:

https://docs.google.com/document/d/1WCk1C2EngRyn4vK8djxSjNfryZx4Y7erqpxwg1HKisM/edit?usp=sharing

https://docs.qq.com/doc/DTkRUdGtvSE1WWUxQ

Just open either document to comment. We look forward to your feedback.


1) Driving with Language

DriveLM connects large language models (LLMs) and autonomous driving systems through language, introducing the reasoning capabilities of LLMs to make decisions and enable explainable planning. Given multi-view images as input, the model must answer questions covering various aspects of driving.

● Organizer
OpenDriveLab, University of Tübingen

● Dataset
DriveLM-nuScenes, https://github.com/OpenDriveLab/DriveLM

● Input
multi-view images from 6 cameras, questions in text

● Output
answers in text (perception, prediction, planning), planning behavior

● Evaluation Metric

● VQA metrics
CIDEr (Consensus-based Image Description Evaluation), GPT-Score

● Classification accuracy of multiple-choice questions and planning behavior (see the sketch below)
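
A minimal, hedged sketch of the two rule-based parts of this evaluation: exact-match accuracy for multiple-choice answers, and CIDEr computed with the pycocoevalcap package (assumed installed). The function names and input formats are illustrative, not the official DriveLM evaluation API.

```python
# Sketch of rule-based DriveLM-style metrics; names are illustrative.
from pycocoevalcap.cider.cider import Cider


def multiple_choice_accuracy(predictions, answers):
    """Fraction of questions whose predicted option matches the ground truth."""
    correct = sum(p.strip().lower() == a.strip().lower()
                  for p, a in zip(predictions, answers))
    return correct / len(answers)


def cider_score(predictions, references):
    """CIDEr over free-form answers; both inputs map question_id -> text."""
    gts = {qid: [ref] for qid, ref in references.items()}
    res = {qid: [predictions[qid]] for qid in references}
    score, _ = Cider().compute_score(gts, res)
    return score
```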

2) CARLA Autonomous Driving Challenge

To verify the effectiveness of an AD system, we ultimately need a planning framework with a closed-loop setup. The CARLA AD leaderboard requires agents to traverse a set of predefined routes. For each route, the agent is initialized at a starting point and directed to drive to a destination, with the route described by GPS coordinates, map coordinates, or route instructions. Routes span a variety of situations, including freeways, urban areas, residential districts, and rural settings. The leaderboard evaluates agents under various weather and lighting conditions, including daytime, sunset, rain, fog, and night.

● Organizer

CARLA Team, Intel/NVIDIA

● Dataset

CARLA Simulator, https://leaderboard.carla.org

● Input

data from GNSS, IMU, LiDAR, RADAR, RGB camera, speedometer (optional: OpenDRIVE map)

● Output

Vehicle Controls (Steer, Throttle, Brake)

● Evaluation Metric

Driving score, route completion, infraction penalty (the per-route driving score is route completion scaled by the infraction penalty; see the sketch below)
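
As a rough illustration of how these metrics combine, the sketch below computes a per-route driving score as route completion multiplied by a cumulative infraction penalty. The penalty coefficients mirror the public CARLA leaderboard rules as of this writing; treat them as illustrative assumptions, not the challenge's normative values.

```python
# Hedged sketch of CARLA-leaderboard-style scoring; coefficients assumed.
PENALTY = {
    "collision_pedestrian": 0.50,
    "collision_vehicle": 0.60,
    "collision_static": 0.65,
    "red_light": 0.70,
    "stop_sign": 0.80,
}


def driving_score(route_completion, infractions):
    """route_completion in [0, 1]; infractions is a list of event names."""
    penalty = 1.0
    for event in infractions:
        penalty *= PENALTY[event]
    return route_completion * penalty


# e.g. 90% completion with one vehicle collision and one red-light violation:
print(driving_score(0.9, ["collision_vehicle", "red_light"]))  # 0.378
```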

3) End-to-End Driving at Scale

Compared with the 2023 nuPlan Challenge, this track prohibits direct access to map information, and the newly designed PDM simulation and scoring go beyond open-loop evaluation of end-to-end autonomous driving frameworks.

● Organizer

University of Tübingen, NVIDIA, University of Toronto

● Dataset

OpenScene, https://github.com/OpenDriveLab/OpenScene

● Input

multi-view images from 8 cameras, LiDAR, ego state, navigation command

● Output

future trajectory (8 seconds, 10Hz)

● Evaluation Metric

PDM score
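
The PDM score aggregates several sub-metrics. The sketch below follows the formulation from the PDM and NAVSIM line of work, where hard penalties (no at-fault collision, drivable-area compliance) gate a weighted average of soft sub-scores (ego progress, time-to-collision, comfort); the 5/5/2 weights are assumptions taken from that literature, not the official challenge scoring code.

```python
# Sketch of a PDM-style score aggregation; weights are assumptions.
def pdm_score(nc, dac, ep, ttc, comfort):
    """All sub-scores lie in [0, 1]; nc and dac act as multiplicative gates."""
    soft = (5.0 * ep + 5.0 * ttc + 2.0 * comfort) / 12.0
    return nc * dac * soft


# A collision-free, on-road trajectory (nc = dac = 1) with modest progress:
print(pdm_score(nc=1.0, dac=1.0, ep=0.7, ttc=1.0, comfort=1.0))  # 0.875
```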

4) Predictive World Model

A world model predicts future states based on current states. Learning a world model has the potential to take pre-trained foundation models to the next level. Given only visual input, the neural network must output future point clouds to demonstrate its ability to predict how the world evolves.

● Organizer

OpenDriveLab

● Dataset

OpenScene, https://github.com/OpenDriveLab/OpenScene

● Input

multi-view images from 8 cameras

● Output

LiDAR points within the next 3.0s, evaluated at 2Hz

● Evaluation Metric

Chamfer distance
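
The Chamfer distance measures how well the predicted point cloud covers the ground-truth sweep and vice versa. A minimal NumPy sketch follows; published definitions vary (L2 vs. squared distances), so treat this symmetric L2 version as illustrative rather than the challenge's exact evaluation code.

```python
# Minimal sketch of a symmetric Chamfer distance between point clouds.
# Brute-force pairwise distances are fine for illustration; real evaluation
# code would use a KD-tree for the nearest-neighbour queries.
import numpy as np


def chamfer_distance(pred, gt):
    """pred: (N, 3), gt: (M, 3); mean nearest-neighbour distance both ways."""
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```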

5) Occupancy and Flow

3D bounding boxes are not sufficient to describe general objects (obstacles). Instead, inspired by concepts from robotics, we cast general object detection as occupancy prediction in order to cover irregularly shaped objects (e.g., protruding obstacles). The goal of this task is to predict the 3D occupancy of the complete scene and the flow of foreground objects, given input images from six cameras.

● Organizer

Motional

● Dataset

OpenOcc, https://github.com/OpenDriveLab/OccNet

● Input

multi-view images from 6 cameras

● Output

voxelized 3D space occupancy, with flow per foreground grid

● Evaluation Metric

Occupancy: mIoU excluding invalid grids; an invalid grid is one that is not visible from the camera perspective views.

Occupancy Flow: L2 distance, considering foreground grids only.

Warped Occupancy (main metric): mIoU computed against the occupancy ground truth of the next timestamp. A sketch of the first two quantities follows below.
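
A hedged sketch of the masked mIoU and foreground flow error, assuming boolean occupancy grids, a boolean camera-visibility mask, and per-voxel flow vectors; the array shapes are assumptions for illustration only.

```python
# Sketch of occupancy / flow metrics; shapes and names are assumptions.
import numpy as np


def masked_iou(pred_occ, gt_occ, valid_mask):
    """IoU over camera-visible voxels only; averaging this over semantic
    classes would give the mIoU reported above."""
    p, g = pred_occ[valid_mask], gt_occ[valid_mask]
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return inter / union if union else 1.0


def foreground_flow_l2(pred_flow, gt_flow, fg_mask):
    """Mean L2 error of flow vectors, foreground voxels only.
    pred_flow/gt_flow: (X, Y, Z, C); fg_mask: (X, Y, Z) boolean."""
    err = np.linalg.norm(pred_flow[fg_mask] - gt_flow[fg_mask], axis=-1)
    return err.mean()
```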

6) Mapless Driving

Autonomous driving without high-definition maps demands a higher level of scene understanding, and this challenge aims to explore the limits of scene reasoning. The neural network takes multi-view images and a standard-definition (SD) map as input, and must not only produce perception results for lanes and traffic elements but also simultaneously output the topological relationships between lanes and between lanes and traffic elements.

● Organizer

OpenDriveLab

● Dataset

OpenLane-V2 subset-A, https://github.com/OpenDriveLab/OpenLane-V2

● Input

multi-view images from 7 cameras, optional SD map

● Output

vectorized lane segments, 2D traffic elements, topology (ls-ls between lane segments, ls-te between lane segments and traffic elements)

other map elements

● Evaluation Metric

OpenLane-V2 Uni-Score

https://github.com/OpenDriveLab/OpenLane-V2
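
For reference, the published OpenLane-V2 OLS averages the two detection scores with the two topology scores, square-rooting the topology terms to rescale them. Whether the challenge's Uni-Score uses exactly these four terms is an assumption here; consult the OpenLane-V2 repository for the authoritative implementation.

```python
# Hedged sketch of an OpenLane-V2-style unified score; term set assumed.
import math


def uni_score(det_lane, det_te, top_ll, top_lte):
    """All sub-scores are mAP-style values in [0, 1]."""
    return (det_lane + det_te + math.sqrt(top_ll) + math.sqrt(top_lte)) / 4.0
```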
