"Paper Reading 19" Multisource forest point cloud registration with semantic-guided keypoints and robust RANSAC

   1. Paper

  • Research areas: Point cloud registration  
  • Paper: Multisource forest point cloud registration with semantic-guided keypoints
    and robust RANSAC mechanisms
  • International Journal of Applied Earth Observation and Geoinformation

  • Received 29 August 2022, Revised 20 October 2022, Accepted 9 November 2022, Available online 15 November 2022, Version of Record 15 November 2022.

  • Paper link
  • The team’s public data set WHU-TLS

2. Overview of the paper

This study proposes a unified, marker-free framework for TLS-TLS and ULS-TLS registration in forest scenes.

  • The framework does not rely on individual tree segmentation or ground filtering, which improves the robustness and reliability of forest registration. It supports registration in plantations with regular layouts and limited overlap, which makes field data collection more practical and efficient.
  • A novel semantic-guided keypoint is proposed for forest point cloud registration. The proposed keypoints remain reliably repeatable across different views. They are no longer confined to the ground level but are distributed throughout the 3D space, which improves the reliability and quality of the feature points fed into registration.

First, semantic-guided keypoints are detected from TLS and ULS point clouds using the Wood Response Index (WRI), which is motivated by wood-leaf separation and shows reliable repeatability between different views of forest point clouds.

Second, an initial correspondence set is generated by WRI filtering and Binary Shape Context (BSC) descriptor matching.

Finally, the initial correspondence set is pruned and optimized through a robust RANSAC mechanism, which includes geometric compatibility filters that guide and prune hypothesis generation and a modified hypothesis evaluation step that further refines the transformation.

3. Detailed description of the paper

  Multi-source forest point cloud registration based on semantic guided key points and robust RANSAC mechanism

  • Abstract

  The increasing availability of terrestrial laser scanning (TLS) and unmanned aerial vehicle laser scanning (ULS) facilitates accurate and detailed forest structure measurements. Registering multi-view observations from different platforms is a prerequisite for a comprehensive understanding of forest structure. Currently, forest point cloud registration usually relies on methods based on individual tree attributes (e.g., tree location, trunk diameter). These methods struggle with complex forest compositions and terrain, and may not be reliable for forests with regular tree layouts or too few common trees. Therefore, this study proposes a unified and marker-free framework for TLS-TLS and ULS-TLS point cloud registration in forested areas.

First, semantic-guided keypoints are detected from TLS and ULS point clouds using the Wood Response Index (WRI), which is motivated by wood-leaf separation and shows reliable repeatability between different views of forest point clouds.

Second, an initial correspondence set is generated by WRI filtering and Binary Shape Context (BSC) descriptor matching.

Finally, the initial correspondence set is pruned and optimized through a robust RANSAC mechanism, which includes geometric compatibility filters that guide and prune hypothesis generation and a modified hypothesis evaluation step that further refines the transformation. Experiments were conducted on six plots with different densities and tree species in Guangxi forest farms. The average registration residual and running time are 0.049 m and 93 s for the TLS-TLS scenario, and 0.299 m and 242 s for the ULS-TLS scenario. A comprehensive comparison shows that the proposed method outperforms the baselines. The results indicate that the framework can improve the practicality and efficiency of multi-source data collection and registration, thereby promoting the application of TLS and ULS in forest ecosystem science.


  • Introduction

  Terrestrial and unmanned aerial vehicle (UAV) laser scanning systems are valuable technologies for assessing forest structure and supporting a wide range of forest applications, including forest inventories (Wallace et al., 2012; Liang et al., 2018), sustainable forest management (Camarretta et al., 2020; Zhang et al., 2022; Xue et al., 2022) and carbon accounting (Rosenqvist et al., 2003; Brede et al., 2022). Over the past two decades, UAV laser scanning (ULS) and terrestrial laser scanning (TLS) have developed rapidly in forest applications, moving from scientific testing to operational use (Nelson, 2013). However, a main disadvantage of both ULS and TLS is that occlusion prevents them from capturing complete structural information about the forest, which is considered to limit their potential in forest applications (Tremblay and Béland, 2018; Polewski et al., 2019). ULS systems map forest canopies efficiently and flexibly but capture too few points in the understory. TLS systems can supplement the understory observations in the vertical direction. However, a single TLS scan also suffers from occlusion in the horizontal direction. These occlusions and incomplete structures are problematic for understanding forest structure (Korpela et al., 2012; Wang et al., 2019a) and hinder the application of these techniques over large areas. Therefore, registration of multi-source point clouds, including TLS-TLS registration from different ground views and ULS-TLS registration from ground-aerial views, is a prerequisite for an accurate understanding of horizontal and vertical forest structure.

  Manually placed markers provide one solution for registration in forest areas. However, manual marker placement and detection severely hinder the practicality and efficiency of these methods. A marker-free automatic registration method for multi-source point clouds is therefore needed. Extensive research has designed invariant features and matching strategies for specific tasks in robotics/mobile perception, medical imaging, and industrial applications. However, because forest ecosystems carry monotonous semantic information (mostly trees) and natural elements have irregular structure, relatively few automatic registration methods have been developed for forests. Typical techniques for automatic registration of forest scenes are based on features of individual tree attributes, such as tree position and trunk diameter. These techniques rely on accurate tree segmentation and ground filtering, which are non-trivial tasks in forest scenes, and their performance degrades in complex and changing forest and terrain conditions. For plantations with regular tree arrangements and similar tree attributes, the derived tree features may not be sufficient for registration even when the tree attributes are accurate. It is worth noting that at least three common trees are theoretically required to perform 3D registration. A distance of less than 15 m between two TLS scans is recommended during field campaigns to improve the reliability of the registration (Tremblay and Béland, 2018; Guan et al., 2020a). For dense and complex forest environments, even smaller distances are required. These limitations reduce the practicality and efficiency of field data collection. Other types of feature primitives have also been proposed for forest registration, such as pattern points (Dai et al., 2019) and visual occlusion points (Guan et al., 2020a).


  However, these primitives are also affected by challenging forest environments and require sufficient overlap. Pattern-based methods need multi-scan TLS data in ULS-TLS registration to increase the success rate (Dai et al., 2019). Registration of single-scan TLS and ULS can be problematic due to low overlap in the vertical direction. Furthermore, the feature points designed for forest registration mentioned above are mainly limited to the ground level, which means that quality control of the registration focuses only on the near-ground subspace and ignores the upper canopy region. For the matching strategy, Random Sample Consensus (RANSAC) is used as a practical solution for rigid transformation estimation. In forest registration, RANSAC is often combined with tree attributes, or descriptors derived from tree attributes, to prune false correspondences (Kelbe et al., 2016; Tremblay and Béland, 2018; Guan et al., 2020b). However, the results can be inaccurate, and the methods become computationally intensive and ineffective as the number of detected trees increases (Tremblay and Béland, 2018).

Based on the above data attributes and registration issues, this study proposes a unified and marker-free framework for multi-source forest point cloud registration. The main contributions are as follows:

  • We propose a unified framework for TLS-TLS and ULS-TLS registration in forest scenes. The framework does not rely on individual tree segmentation or ground filtering, which improves the robustness and reliability of forest registration. It supports registration in plantations with regular layouts and limited overlap, which makes field data collection more practical and efficient.
  • We propose a novel semantic-guided keypoint for forest point cloud registration. The proposed keypoints remain reliably repeatable across different views. They are no longer confined to the ground level but are distributed throughout the 3D space, which improves the reliability and quality of the feature points fed into registration.
  • We introduce a robust RANSAC mechanism to obtain accurate transformation results efficiently. During RANSAC hypothesis generation, a geometric compatibility filtering step improves registration efficiency. During hypothesis evaluation, a modified RANSAC metric that accounts for both the number of inliers and their accuracy further improves the transformation accuracy.

The remainder of this paper is organized as follows. Section 2 briefly reviews related registration techniques. Section 3 presents the study area, the datasets collected and the operational details of the field campaign. Section 4 describes the proposed registration framework and evaluation criteria. Experimental results and discussion are provided in Section 5. Section 6 gives the conclusions and outlook.

  • Related work

In this section, we briefly review techniques highly relevant to rigid 3D point cloud registration, including 3D keypoint matching, correspondence pruning and optimization methods, and deep learning-based registration methods. The review covers techniques applied in both forest and urban scenes.

3D keypoint matching

Many point cloud registration methods start with 3D keypoint matching to generate an initial correspondence set. 3D keypoint matching usually consists of three steps: 3D keypoint detection, description and matching. Keypoint detectors reduce the input point cloud to a smaller number of keypoints and improve registration efficiency. Researchers have attempted to develop distinctive keypoints that are highly repeatable. Representative keypoints include SIFT (Flitton et al., 2010), Local Surface Patch (LSP) (Chen and Bhanu, 2007), Intrinsic Shape Signatures (ISS) (Zhong, 2009), MeshDoG (Zaharescu et al., 2009), KeyPoint Quality (KPQ) (Mian et al., 2010), Harris3D (Sipiran and Bustos, 2011) and Histogram of Normal Orientations (HoNO) (Prakhya et al., 2016). Compared with 3D keypoint detectors developed for urban point clouds and 3D models, relatively few 3D keypoints have been designed for forest scenes. Most forest point cloud registration methods adopt tree locations (Hauglin et al., 2014; Kelbe et al., 2016; Kukko et al., 2017; Tremblay and Béland, 2018; Polewski et al., 2019; Guan et al., 2020b) or stem curves (Liu et al., 2017) as registration primitives. However, tree position-based methods have several limitations. First, the accuracy and repeatability of tree locations rely on the performance of tree segmentation and ground filtering, which are not always easy to achieve in forest environments. Second, when the tree top and trunk are not on the same vertical line, the tree positions derived from ULS (crown) and TLS (trunk) may be biased. Third, in plantations, tree matching can be ambiguous and unreliable because of regular placement and similar tree properties. Successful registration requires adequate ground sampling and at least three common trees. Some researchers suggest that the distance between scan centers should be less than 15 m to improve the reliability of TLS-TLS registration (Tremblay and Béland, 2018; Guan et al., 2020a). For dense forests and complex terrain, smaller distances may be required, which reduces the practicality and efficiency of field data collection. Finally, tree positions are largely restricted to ground level, so the registration has low errors near the ground but large errors in the upper canopy (Kelbe et al., 2016).

In addition to tree positions and stem curves, other types of keypoints have been proposed for registration of forest point clouds. Dai et al. (2019) minimized the difference between the probability density distributions of ULS and TLS and extracted pattern-based keypoints through the mean shift algorithm. Pattern-based keypoints require sufficient overlap in the crown to ensure stability. Guan et al. (2020a) extracted visual occlusion points for TLS registration, which are the starting points of radial occlusion regions and are a subset of tree locations. Visual occlusion points are not feasible for ULS and require accurate ground point identification. These keypoints are therefore also affected by ground filtering performance, and their z-coordinates are likewise constrained to the ground level.

A descriptor encodes the local surface around a keypoint into a feature vector so that keypoints can be matched. Representative descriptors include Fast Point Feature Histogram (FPFH) (Rusu et al., 2009), the ISS descriptor (Zhong, 2009), Rotational Projection Statistics (RoPS) (Guo et al., 2013), Signature of Histograms of Orientations (SHOT) (Tombari et al., 2010) and Binary Shape Context (BSC) (Dong et al., 2017). The feature matching step establishes correspondences through descriptor matching. In forest scenes, individual tree attributes, including diameter at breast height (DBH), tree height and the distance between tree locations, or features derived from these attributes, are often used as matching descriptors (Kelbe et al., 2016; Tremblay and Béland, 2018; Polewski et al., 2019; Guan et al., 2020b).
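
To make the matching step concrete, here is a minimal sketch of descriptor matching for a binary descriptor such as BSC. It is an illustration under assumptions rather than the paper's implementation: each descriptor is represented as a Python integer (a bit string), similarity is measured by Hamming distance, and only mutual nearest neighbours are kept. The function names `hamming`, `nearest` and `mutual_nearest_matches` are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def nearest(query: int, candidates: list[int]) -> int:
    """Index of the candidate with the smallest Hamming distance to query."""
    return min(range(len(candidates)), key=lambda j: hamming(query, candidates[j]))


def mutual_nearest_matches(src: list[int], tgt: list[int]) -> list[tuple[int, int]]:
    """Return (i, j) pairs where src[i] and tgt[j] are each other's nearest neighbour."""
    matches = []
    for i, descriptor in enumerate(src):
        j = nearest(descriptor, tgt)
        # Keep the pair only if the match also holds in the reverse direction.
        if nearest(tgt[j], src) == i:
            matches.append((i, j))
    return matches
```

The mutual check is a common way to discard some ambiguous matches early, although a large share of outliers usually remains in the resulting correspondence set.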

Correspondence pruning and optimization methods

The correspondences produced by 3D keypoint matching are known to contain outliers, which can be caused by repetitive patterns, limited overlap, noise, or density variations. In real point cloud scenes, the outlier rate of the initial correspondences is usually higher than 95% (Bustos and Chin, 2017). Robust correspondence pruning and optimization methods are therefore needed. Perhaps the most popular approach is based on RANSAC (Fischler and Bolles, 1981). It follows a "hypothesis generation and verification" scheme: it repeatedly draws a random minimal sample set to generate a hypothesis and evaluates each hypothesis by the number of correspondences that agree with it. Several RANSAC variants have been proposed to further improve registration performance. Yang et al. (2016) developed an optimized sample consensus (OSAC) algorithm to optimize the correspondences. The Euclidean distance between Local Feature Statistics Histogram (LFSH) descriptors is first used to remove major outliers, and an error metric based on point-to-surface distance is then introduced to optimize the transformation. Quan et al. (2018) first ranked all initial correspondences by the similarity of their local voxelized structure (LoVS) descriptors. A globally constrained 1-point-based sample consensus (GC1SAC) algorithm then estimates the transformation from keypoints paired with their local reference frames (LRFs), using the number of overlapping points as the criterion. These RANSAC methods with modified hypothesis generation and evaluation procedures have been explored on urban point clouds and simulated model data. However, to the best of our knowledge, it remains unclear how RANSAC variants perform on forest datasets from different platforms (including different ground views and ground-aerial views), which consist of highly irregular and unstructured natural elements.
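
The classic "hypothesis generation and verification" loop can be sketched as follows. This is a generic illustration, not the method of any paper cited here: the minimal sample size of three, the Kabsch/SVD solver, the inlier threshold and the iteration count are all assumptions, and `rigid_from_pairs` and `ransac_rigid` are hypothetical names.

```python
import numpy as np


def rigid_from_pairs(src, tgt):
    """Least-squares rotation R and translation t with R @ src_i + t ~= tgt_i (Kabsch/SVD)."""
    src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - src_c).T @ (tgt - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that det(R) = +1.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, tgt_c - R @ src_c


def ransac_rigid(src, tgt, threshold=0.1, iterations=500, seed=0):
    """Classic RANSAC over putative correspondences src[i] <-> tgt[i]."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        # Hypothesis generation from a random minimal sample of three pairs.
        sample = rng.choice(len(src), size=3, replace=False)
        R, t = rigid_from_pairs(src[sample], tgt[sample])
        # Hypothesis evaluation: the score is the number of inliers only.
        residuals = np.linalg.norm(src @ R.T + t - tgt, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("no hypothesis with at least three inliers was found")
    # Refit on all inliers of the best hypothesis.
    R, t = rigid_from_pairs(src[best_inliers], tgt[best_inliers])
    return R, t, best_inliers
```

Note that this baseline scores a hypothesis purely by its inlier count, so two hypotheses with the same count are indistinguishable even if one fits its inliers much more tightly.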

In forest scenes, the classic RANSAC algorithm is usually used to prune and optimize tree location correspondences, combined with tree attributes (including diameter at breast height, tree height, distance between tree locations, or variations thereof). Kelbe et al. (2016) first built triplets of stems with their DBH and pruned false pairs by DBH similarity. The filtered pairs were then sorted by intrinsic geometric similarity, and RANSAC used the number of matching tie points to evaluate each pair for TLS registration. Tremblay and Béland (2018) modified the algorithm of Kelbe et al. (2016) and presented a parallelized version that compares the side lengths of triangles. Guan et al. (2020b) converted the spatial pattern of the tree distribution into a triangulated irregular network (TIN) so that area and angle criteria could be used in match voting. The matched tree pairs were then filtered and optimized using the classic RANSAC algorithm. Although these methods achieve satisfactory results on their test datasets, their computation time and storage requirements grow rapidly with the number of trees (or tie points). Tremblay and Béland (2018) reported that execution time and memory usage become very large when the number of detected trees exceeds 50. Because the classic RANSAC scheme only counts inliers and treats all inliers equally, the results may also be less accurate. Polewski et al. (2019) encoded feature vectors based on the mutual distances between trees and embedded the similarity in a complete weighted bipartite graph for maximum weight matching. Dai et al. (2019) treated the registration of ULS and TLS as probability density estimation over pattern points. The source and target pattern points are regarded as the centroids of a Gaussian Mixture Model (GMM) and the data points generated by the GMM, respectively. The Coherent Point Drift (CPD) algorithm is then used to iteratively fit the GMM centroids to the data points. However, this approach requires sufficient overlap between the two sets of pattern points, and its performance is affected by the outlier ratio, which is usually not known in advance and cannot be determined automatically or analytically.

Deep learning-based registration methods

Progress in deep learning on 3D point clouds (e.g., PointNet (Qi et al., 2017) and DGCNN (Wang et al., 2019b)) has opened opportunities for learning-based registration methods. One of the earliest works is PointNetLK (Aoki et al., 2019), which embeds point clouds in a high-dimensional feature space through PointNet and iteratively aligns the feature representations with the IC-LK algorithm (Baker and Matthews, 2004). PCRNet (Sarode et al., 2019) extends PointNetLK by replacing the IC-LK algorithm with a deep neural network. Wang and Solomon (2019c) proposed the DCP registration network, which computes DGCNN features and adopts Horn's method for matching in an end-to-end manner. Although these methods report good results on test data, they can be problematic for partial-to-partial point cloud registration (Fu et al., 2021). Wang and Solomon (2019d) proposed PRNet, which extends DCP and attempts to solve partial registration in a self-supervised manner. Wang et al. (2021) proposed JoKDNet, the first registration network for large-scale outdoor TLS point clouds, in which keypoint detection and description are learned jointly. However, most of these methods train networks in a supervised manner, which limits their application to real-world unlabeled data (e.g., Dong, 2020; Deng et al., 2021). Moreover, deep learning-based registration methods still struggle to produce acceptable inlier ratios in real scenes (Yang et al., 2020).

  • Study area and data collection

Study areas and plots

Experiments were conducted at four forest locations in Guangxi, China: Guigang (23°7′N, 109°28′E), Qinzhou (22°2′N, 108°34′E), Laibin (23°6′N, 109°47′E) and Guilin (24°59′N, 110°35′E). The Guigang forest is a plantation dominated by Masson pine, with little understory and flat terrain. Trees are planted in a regular pattern with approximately 7 m between trees. The average tree height is 18 m, the average diameter at breast height is 44 cm, and the stem density is 133 trees/ha. The Qinzhou forest is a young natural forest composed of Eucalyptus urophylla, Pinus massoniana and Acacia leucophylla, with a stem density of 600 trees/ha. The average tree height is 12 m and the average diameter at breast height is 19 cm. The dominant tree species in the Laibin forest are Masson pine and Eucalyptus stout, with a stem density of 1311 trees/ha. The average tree height is 15 m and the average diameter at breast height is 19 cm. The Guilin forest is dominated by fir trees, with a stem density of 755 trees/ha. The average tree height is 12 m and the average diameter at breast height is 18 cm. There are two sample plots each in the Guigang and Qinzhou forests, and one sample plot each in the Guilin and Laibin forests (Table 1).

TLS data

TLS data were collected in May 2020 using a RIEGL VZ-400 laser scanning system, which provides a 360° horizontal field of view and a 100° vertical field of view (from −40° to 60°) with an angular scanning resolution of 0.03°. The scanning frequency is 120 kHz. The scanner has a maximum measuring range of 600 m and a measurement accuracy of 2 mm at 100 m. To account for the different stem densities of the test plots, different scan intervals (the distance between adjacent scan centers) were used to achieve full coverage of each plot. At least two scans were set up in each plot, with scan intervals ranging from a minimum of 14 m to a maximum of 53 m (Table 2).

ULS data

This study used the Genius UAV lidar developed by Beijing SureStar Company, with a DJI M200 as the platform. An RFans-16 laser scanner and an Advanced Navigation Spatial Dual coupled GNSS and IMU sensor are mounted on the UAV platform. The RFans-16 scanner has a horizontal field of view of 360° and a vertical field of view of 30°. Its measuring range is 200 m, its operating wavelength is 905 nm and its scanning frequency is 320 kHz. The nominal ranging error of the UAV lidar system is less than 0.15 m. In June 2020, three sets of ULS point clouds were acquired over Guigang, Laibin and Qinzhou. The flight altitude was 40 m above the ground and the flight speed was 5 m/s. In this study, ULS-TLS registration is performed using the ULS and TLS data of three plots.

  • Method

Overview

The proposed framework provides a general solution for automatically registering multi-source point clouds in forest scenes, including TLS-TLS registration (different ground views) and ULS-TLS registration (ground-aerial views). The framework consists of three main steps:

1) Detect semantically guided key points from forest ULS and TLS point clouds;

2) Generate an initial correspondence set by WRI filtering and BSC matching;

3) Eliminate outliers from the initial correspondence set and obtain an accurate transformation through the robust RANSAC mechanism.

The overall workflow for registering TLS-TLS/ULS-TLS forest point clouds is shown in Figure 2.

WRI key point detection

Inspired by the wood-leaf separation work of Wan et al. (2021), we propose a novel keypoint for forest TLS-TLS and ULS-TLS point cloud registration. A good 3D keypoint detector is expected to have two important properties:

1) The detected keypoints should be reliably repeatable between different viewing angles of the same forest area;

2) The local surface around each detected keypoint should carry enough descriptive information for the point to be uniquely characterized (Tombari et al., 2013; Prakhya et al., 2016).

The WRI keypoint detector consists of two steps: a pruning step and a non-minimum suppression (NMS) step. The first step prunes the input data using a saliency measure computed at each point. Inspired by the wood-leaf indicator of Wan et al. (2021) for TLS point clouds, we propose a new metric, WRI, which captures the similar characteristics and responses of wood components in TLS and ULS point clouds. The saliency measure is given by WRI:

where L, P and S denote the linearity, planarity and scattering features, respectively. Wood components are more likely than foliage and ground components to exhibit stable and similar geometric characteristics across different views and platforms. WRI is designed to have similar responses and distributions on TLS and ULS point clouds. The wood component has a higher ratio than other components (e.g., leaves and ground) in both TLS and ULS. Figure 3 shows two examples of the WRI distribution in TLS and ULS point clouds.
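
The WRI formula itself is not reproduced in this post, so the sketch below covers only its inputs. It assumes the commonly used eigenvalue-based definitions of the three features, computed from the covariance matrix of a local neighbourhood with eigenvalues λ1 ≥ λ2 ≥ λ3: L = (λ1 − λ2)/λ1, P = (λ2 − λ3)/λ1 and S = λ3/λ1. The paper may use a different normalization, and the function name `dimensionality_features` is hypothetical.

```python
import numpy as np


def dimensionality_features(neighbors):
    """Return (L, P, S) for an (n, 3) array of neighbouring points."""
    cov = np.cov(np.asarray(neighbors, dtype=float).T)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order;
    # clip tiny negative values caused by floating-point rounding.
    l3, l2, l1 = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    if l1 == 0.0:
        raise ValueError("degenerate neighbourhood: all points coincide")
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    scattering = l3 / l1
    return linearity, planarity, scattering
```

With these definitions the three values sum to one. A stem-like neighbourhood (points along a line) gives L close to 1, a flat patch such as bare ground gives P close to 1, and a volumetric cluster such as foliage gives a larger S.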

The main principle of the WRI-based saliency measure in the pruning step is to detect salient points of wood components with similar confidence levels. Specifically,

  • We first sort all points by WRI in ascending order.
  • The top nW ranked points, i.e., the points that are more likely to share a similar wood confidence, are then selected from the input point cloud as keypoint candidates and passed to the next stage.

To avoid having too many keypoints close together, the NMS step checks two conditions to decide whether a candidate is marked as a keypoint.

  • The first condition checks whether the candidate has the smallest WRI in its neighborhood.
  • The second condition checks whether the candidate has the strongest curvature in its neighborhood.

If either of these two conditions is met, the point is marked as a key point. A summary of the steps for WRI key point detection from forest point clouds is given in Algorithm 1. 
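
The two steps above can be sketched as follows. This is a minimal illustration, not Algorithm 1 from the paper: it takes precomputed per-point WRI and curvature values as input, uses a brute-force radius search, and restricts each neighbourhood to the candidate set. Those choices, the parameter names `n_w` and `radius`, and the function name `detect_keypoints` are all assumptions.

```python
import math


def detect_keypoints(points, wri, curvature, n_w, radius):
    """Return the indices of keypoints selected by pruning followed by NMS.

    points: list of (x, y, z) tuples; wri and curvature: per-point values.
    n_w and radius are hypothetical tuning parameters.
    """
    # Pruning step: rank by WRI in ascending order and keep the first n_w points.
    candidates = sorted(range(len(points)), key=lambda i: wri[i])[:n_w]

    keypoints = []
    for i in candidates:
        # Neighbourhood is searched among the candidates only (an assumption).
        neighbours = [j for j in candidates
                      if j != i and math.dist(points[i], points[j]) <= radius]
        smallest_wri = all(wri[i] < wri[j] for j in neighbours)
        strongest_curvature = all(curvature[i] > curvature[j] for j in neighbours)
        # NMS step: meeting either condition is enough to keep the point.
        if smallest_wri or strongest_curvature:
            keypoints.append(i)
    return sorted(keypoints)
```

For real point clouds a spatial index such as a k-d tree would replace the brute-force neighbour search, which is quadratic in the number of candidates.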

A robust outlier pruning RANSAC mechanism

Since pairwise TLS-TLS or ULS-TLS datasets in forest scenes are often subject to partial overlap, variable density, and repetitive structures (e.g., similar wood or leaf shapes among different trees), the initial correspondence set needs to be pruned and optimized to obtain an accurate transformation. In this study, we employ a two-step outlier removal strategy in RANSAC: a geometric compatibility filter in the hypothesis generation step and a modified metric in the hypothesis evaluation step. In the first step, the correspondence candidates in Cinit are filtered by geometric compatibility in order to reduce the number of hypotheses generated in RANSAC and improve efficiency.
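
The post does not give the exact compatibility criteria, so the following is only a sketch of one common form of geometric compatibility check. It assumes the test is distance preservation: a rigid transformation keeps point-to-point distances unchanged, so two correspondences can only both be correct if the distance between their source points matches the distance between their target points. The tolerance `tol` and the names `pair_compatible` and `compatible_samples` are hypothetical.

```python
import itertools
import math


def pair_compatible(c1, c2, tol):
    """Two correspondences (p, q) agree if they preserve the point-to-point distance."""
    (p1, q1), (p2, q2) = c1, c2
    return abs(math.dist(p1, p2) - math.dist(q1, q2)) <= tol


def compatible_samples(correspondences, tol=0.1, size=3):
    """Yield index tuples whose correspondences are pairwise compatible.

    Only these samples would go on to hypothesis generation; the rest are
    discarded without estimating a transformation.
    """
    for sample in itertools.combinations(range(len(correspondences)), size):
        if all(pair_compatible(correspondences[i], correspondences[j], tol)
               for i, j in itertools.combinations(sample, 2)):
            yield sample
```

The check costs only a few distance computations per sample, which is much cheaper than estimating a transformation and counting inliers for a hypothesis that cannot be correct. Enumerating all combinations is used here for clarity; a RANSAC loop would apply the same test to each randomly drawn sample instead.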
