I have summarized the papers I have read recently on dynamic SLAM, although I did not fully understand most of them.
Method classification:
- learning-based:
- visibility-based: Removert, ERASOR
  a. Incident angle problem, such as incident angle ambiguity.
  b. Dynamic points cannot be correctly filtered out when occluding obstacles appear.
- occupancy map-based: OctoMap (map), UFOMap (map), DUFOMap, Dynablox
  a. Occupancy-based methods are computationally expensive in 3D environments.
The above three categories can also be called, respectively:
- appearance-based segmentation
- scan-to-scan change detection
- map-based change detection
OctoMap:(2013)
● The octree structure allows for lazy initialization of the grid structure.
● Information can be stored in the octree at different resolutions.
Voxblox: (2016)
● SDF (signed distance field): each voxel stores the distance to the nearest obstacle surface, which speeds up trajectory optimization.
● Instead of allocating voxel sizes in advance, Voxblox allocates blocks of fixed size when needed.
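The on-demand block allocation idea can be sketched as follows. This is a minimal illustration, not the actual Voxblox API: fixed-size voxel blocks are created lazily in a hash map only when a point falls inside them, so memory grows with the observed surface rather than the bounding volume.

```python
import numpy as np

class BlockGrid:
    """Sketch of Voxblox-style on-demand allocation (names are
    illustrative, not the real Voxblox interface)."""

    def __init__(self, voxel_size=0.1, voxels_per_side=16):
        self.voxels_per_side = voxels_per_side
        self.block_size = voxel_size * voxels_per_side
        self.blocks = {}  # block index (i, j, k) -> dense voxel array

    def block_index(self, point):
        # Which fixed-size block does this point fall into?
        return tuple(np.floor(np.asarray(point) / self.block_size).astype(int))

    def allocate(self, point):
        idx = self.block_index(point)
        if idx not in self.blocks:
            # One dense per-block array, allocated only when first touched.
            n = self.voxels_per_side
            self.blocks[idx] = np.full((n, n, n), np.inf)
        return self.blocks[idx]

grid = BlockGrid()
grid.allocate([0.05, 0.05, 0.05])
grid.allocate([3.2, 0.0, 0.0])   # distant point -> a second block
print(len(grid.blocks))          # 2 blocks, not a pre-allocated full volume
```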
UFOMap:(2020)
This paper mainly targets environments with large unknown areas, so it introduces explicit mapping of unknown space. Compared with OctoMap it improves mapping performance, but it is not a dynamic SLAM method. (It is included here mainly to aid understanding of DUFOMap below.)
● Innovation:
○ Explicit representation of all three states in the map: occupied, free, and unknown.
○ Different ways of integrating data into the octree are introduced to improve efficiency.
○ OctoMap does not allow iterating over the map during insertions/deletions; UFOMap overcomes this defect and allows iteration.
○ The octree resolution of OctoMap is fixed, while the resolution of UFOMap's octree can be changed.
● How occupancy is stored: a log-odds occupancy value is stored in the corresponding octree nodes.
● How a node's occupancy is classified into free, unknown, or occupied (with thresholds t_f < t_o):
  free: occ(n) < t_f; unknown: t_f ≤ occ(n) ≤ t_o; occupied: occ(n) > t_o
● Morton codes, a spatial point encoding method, are used to speed up traversal of the tree.
● Integrating sensor point clouds into the map: Simple integrator, Discrete integrator, Fast discrete integrator. I don't understand the third method yet.
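The Morton code trick mentioned above can be sketched in a few lines (a minimal sketch, not UFOMap's actual implementation): the bits of the three voxel indices are interleaved into a single integer key, so tree traversal and neighbor lookups become cheap integer arithmetic along a Z-order curve.

```python
def part1by2(x):
    """Spread the low 10 bits of x so there are two zero bits between
    each original bit (standard masks for a 30-bit 3D Morton code)."""
    x &= 0x000003FF
    x = (x ^ (x << 16)) & 0xFF0000FF
    x = (x ^ (x << 8))  & 0x0300F00F
    x = (x ^ (x << 4))  & 0x030C30C3
    x = (x ^ (x << 2))  & 0x09249249
    return x

def morton3(ix, iy, iz):
    """Interleave the bits of three voxel indices into one key."""
    return part1by2(ix) | (part1by2(iy) << 1) | (part1by2(iz) << 2)

print(morton3(1, 0, 0))  # 1
print(morton3(0, 1, 0))  # 2
print(morton3(1, 1, 1))  # 7
```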
Offline methods: Removert, ERASOR, ERASOR2
Removert:(2020)
The main idea is to project both the map and the query frame (scan) into range images, then compare the query scan's range image against the map's under visibility constraints and use voting to predict dynamic points. At the same time, to reduce false removals, range images at reduced resolutions are used to revert mistakenly removed static points back to static, which is why it is called Removert (remove, then revert).
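The spherical projection step can be sketched as below (resolution and field-of-view parameters are illustrative, not Removert's settings): each point is mapped to a pixel by its yaw and elevation angles, and each pixel keeps the nearest range. Comparing the map's and the scan's images pixel by pixel is then what enables the visibility check.

```python
import numpy as np

def to_range_image(points, h=32, w=360, fov_up=15.0, fov_down=-15.0):
    """Project an (N, 3) point cloud into an h x w range image.
    Each pixel stores the smallest range falling into it."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                    # horizontal angle in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1, 1))  # elevation angle
    up, down = np.radians(fov_up), np.radians(fov_down)
    u = ((1.0 - (pitch - down) / (up - down)) * h).astype(int)   # row
    v = ((0.5 * (yaw / np.pi + 1.0)) * w).astype(int)            # column
    u, v = np.clip(u, 0, h - 1), np.clip(v, 0, w - 1)
    img = np.full((h, w), np.inf)
    for ui, vi, ri in zip(u, v, r):
        img[ui, vi] = min(img[ui, vi], ri)  # keep the nearest return
    return img

# Two points along the same ray land in the same pixel; the nearer wins.
pts = np.array([[5.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
img = to_range_image(pts)
print(img[16, 180])  # 5.0
```

A map point whose range at a pixel is larger than the scan's return there is occluded by something closer in the current scan, which is the visibility cue used for voting.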
ERASOR: (2021)
Premise: assume that dynamic objects touch the ground. I don't fully understand this article yet. The idea of associating dynamic objects with ground points is also mentioned in the article A Dynamic; they can be read together later.
Main idea: for each bin, compare the ratio of the z-axis span (maximum minus minimum) between the query scan and the map. If this ratio exceeds a threshold, the area contains dynamic points, and the bin is removed.
Disadvantages: the minimum and maximum z-axis values need to be specified separately for different scenarios, and some areas that are too tall may exceed the maximum z-axis range.
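The ratio test for one bin can be sketched as follows. This is a simplified illustration of the min/max z idea described above, with an arbitrary threshold value, not the paper's exact formulation:

```python
import numpy as np

def bin_is_dynamic(scan_z, map_z, ratio_threshold=0.2):
    """For one polar-grid bin, compare the z-span (max - min) of the
    query scan against that of the map. If the scan's span is much
    smaller, the map likely contains an object that has since moved
    away, so the bin is flagged for removal."""
    scan_span = scan_z.max() - scan_z.min()
    map_span = map_z.max() - map_z.min()
    return scan_span / map_span < ratio_threshold

# Map bin contains a former pedestrian up to 1.8 m; the current scan
# sees only ground points there -> flagged as containing dynamic points.
map_z = np.array([0.0, 0.5, 1.2, 1.8])
scan_z = np.array([0.0, 0.05, 0.1])
print(bin_is_dynamic(scan_z, map_z))  # True
```

This also shows the stated disadvantage: the test only works if the z range is tuned so that ground and object heights fall inside it for the given scenario.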
DynamicFilter:(2022)
Scan-to-map front end and map-to-map back end; combines visibility-based and map-based methods.
the front end:
After accumulating several frames, dynamic point removal is performed with a visibility-based method. Under large incident angles or occlusions, removal may degenerate and mislabel static points as dynamic, so static points must be recovered after removal. This front-end static point recovery method is relatively new and worth learning from the paper.
the back end:
The article explains that a front-end-only system removes dynamic points effectively over short sequences but performs poorly over long ones, so a back-end removal stage is designed to compensate for this defect.
The back-end module handles large incident angles: the incident angle of each point is computed, and points above a threshold are marked as pseudo-occupied. The nearest pseudo-occupied point serves as the visibility boundary, and any point beyond it is ignored, because ray casting for those points is unreliable. (This idea is very similar to another article.)
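The incident-angle marking step can be sketched as below. This is an illustrative reconstruction, not the paper's code; the 70-degree threshold and the surface normals are assumed inputs:

```python
import numpy as np

def incident_angles(points, normals, sensor_origin):
    """Angle between the sensor ray to each point and the local surface
    normal. Near-grazing rays give unreliable visibility checks."""
    rays = points - sensor_origin
    rays /= np.linalg.norm(rays, axis=1, keepdims=True)
    cosines = np.abs(np.sum(rays * normals, axis=1))
    return np.arccos(np.clip(cosines, 0.0, 1.0))

def mark_pseudo_occupied(points, normals, sensor_origin, max_angle_deg=70.0):
    """Points hit at more than max_angle_deg are marked pseudo-occupied;
    anything beyond the nearest pseudo-occupied point along a ray should
    then be ignored, since its ray-cast visibility is unreliable."""
    angles = incident_angles(points, normals, sensor_origin)
    return angles > np.radians(max_angle_deg)

# A wall seen head-on (reliable) vs. ground seen at a grazing angle.
points = np.array([[5.0, 0.0, 0.0], [5.0, 0.0, -1.0]])
normals = np.array([[-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
origin = np.zeros(3)
print(mark_pseudo_occupied(points, normals, origin))  # [False  True]
```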
Dynablox: (2023) Built on Voxblox as the underlying mapping solution.
- Main idea: incrementally estimate reduced but high-confidence free-space regions by modeling each constraint arising from perception, state estimation, and mapping during online robot operation. These high-confidence regions are then used to seed dynamic object clusters and to disambiguate points in low-confidence regions.
- Innovation: The incremental high-confidence estimation method considers factors such as input point cloud, modeling sensor noise (focus on comparing this with the DUFOMap method below), measurement sparsity, dynamic environment, and state estimation drift (this can also be compared with the following method).
- Sources of error: sensor noise, state drift, inaccuracy in the map of unexplored space boundaries.
This article is really confusing, and many concepts are not very clear to me.
DUFOMap: (2024) Based on UFOMap
Relevance: DUFOMap operates on the voxel structure that UFOMap builds from the point cloud. Ray casting is used to identify void regions (later used to distinguish static points from dynamic points).
There is a sentence in the article: the truncated signed distance field (TSDF) is an alternative to occupancy. Voxels whose TSDF values exceed a threshold are labeled "ever-free regions", and points that later fall into such regions are called dynamic points. The idea is similar to the occupancy-based approach.
● Innovation: How to identify dynamic points:
○ Based on the already identified void regions: if a void region is observed to be occupied again, those observations are dynamic. Sensor noise and localization error are handled at the same time by shrinking or expanding the extent of the void regions.
○ Comparison with state-of-the-art methods across multiple scenarios and multiple sensor types.
● Identification of void regions: This article mentions:
○ Occupancy-grid approaches estimate the probability of each voxel from all accumulated observations of the region, so a region may switch between free and occupied.
○ The method proposed in this article instead uses single-scan lidar observations with a new extension of ray casting to mark void regions. There is no need to accumulate observations before judging a void region; a quick judgment can be made directly from a single observation.
● Position and noise compensation method:
○ For pose error, Chebyshev distance is used, with dp = 2 set relative to the initially identified void region.
○ For noise compensation, a distance ds = 0.2 m is set: the voxel 0.2 m in front of the measured voxel is also counted as a hit. My understanding here is not very precise; interested readers should refer to the original text.
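The pose-error compensation with dp can be sketched as below. This is my illustrative reading, not DUFOMap's code: hit voxels are inflated by a Chebyshev radius of dp voxels, so a traversed voxel only counts as void if it is farther than dp from every hit.

```python
from itertools import product

def inflate_hits(hit_voxels, dp=2):
    """Expand every hit voxel by a Chebyshev radius of dp voxels, so
    small pose drift cannot turn a true surface voxel into a void
    (ever-free) voxel."""
    inflated = set()
    offsets = list(product(range(-dp, dp + 1), repeat=3))
    for v in hit_voxels:
        for off in offsets:
            inflated.add((v[0] + off[0], v[1] + off[1], v[2] + off[2]))
    return inflated

def void_voxels(ray_traversed, hit_voxels, dp=2):
    """A ray-traversed voxel is void only if it lies farther than dp
    (Chebyshev distance) from every hit voxel."""
    return ray_traversed - inflate_hits(hit_voxels, dp)

hits = {(10, 10, 10)}
traversed = {(10, 10, 11), (10, 10, 13)}  # 1 and 3 voxels above the hit
print(void_voxels(traversed, hits))        # {(10, 10, 13)}
```

The voxel one step above the hit is within Chebyshev distance dp = 2 of it and therefore not marked void; the one three steps above is.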
BeautyMap:(2024)
- Represents point clouds as 3D binary grids and further encodes them into 2D matrices for efficiency. Potential dynamic points are marked by comparing the current frame with the global point cloud map bit by bit.
- Dynamic point removal.
- Static point reconstruction:
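The bit-by-bit comparison idea can be sketched as below. This is a simplified illustration with made-up parameters, not BeautyMap's implementation (which also handles ground and occlusion): each (x, y) cell packs its z-column of binary occupancy into one integer, so comparing a scan against the map becomes bitwise operations.

```python
def encode_columns(points, voxel=1.0, z_bits=8):
    """Pack a 3D binary occupancy grid into a 2D integer matrix: each
    (x, y) cell stores its z-column as a bitmask."""
    grid = {}
    for x, y, z in points:
        key = (int(x // voxel), int(y // voxel))
        zi = int(z // voxel)
        if 0 <= zi < z_bits:
            grid[key] = grid.get(key, 0) | (1 << zi)
    return grid

def dynamic_candidates(map_grid, scan_grid):
    """Bits set in a map column but absent from the scan's column at
    the same (x, y) cell are potential dynamic points."""
    return {k: map_grid[k] & ~scan_grid.get(k, 0) for k in map_grid
            if map_grid[k] & ~scan_grid.get(k, 0)}

map_pts = [(0.5, 0.5, 0.5), (0.5, 0.5, 1.5)]  # ground + object in the map
scan_pts = [(0.5, 0.5, 0.5)]                   # only ground in the scan
cands = dynamic_candidates(encode_columns(map_pts), encode_columns(scan_pts))
print(cands)  # {(0, 0): 2} -> z-bin 1 of cell (0, 0) is a dynamic candidate
```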