[Point Cloud Processing Tutorial] 03 Ground Detection Using Python

1. Description

        This is the 3rd post in my "Point Cloud Processing" tutorial. The tutorial is beginner-friendly: we briefly introduce the point cloud processing pipeline, from data preparation to data segmentation and classification.

        In the previous tutorial, we computed a point cloud from depth data without using the Open3D library. In this tutorial, we will first describe the coordinate system. We will then carefully analyze the point cloud, using ground detection as an example. We'll also cover organized point clouds, an interesting 3D representation.

The author's camera field of view.

[Point Cloud Processing Tutorial] 00 Introduction to Open3D for Computer Vision

[Point Cloud Processing Tutorial] 01 How to Create and Visualize Point Clouds

[Point Cloud Processing Tutorial] 02 Estimating Point Clouds from Depth Images in Python

[Point Cloud Processing Tutorial] 03 Ground Detection Using Python

[Point Cloud Processing Tutorial] 04 Point Cloud Filtering in Python

[Point Cloud Processing Tutorial] 05 Point Cloud Segmentation in Python

 

2. Computer Vision Coordinate System

        Before getting started, it's important to understand the traditional coordinate systems used in computer vision. They are closely followed by Open3D [1] and the Microsoft Kinect sensor [2]. In computer vision, images are represented in an independent 2D coordinate system, where the X axis points from left to right and the Y axis points from top to bottom. For a camera, the origin of the 3D coordinate system is at the focal point of the camera, with the X axis pointing right, the Y axis pointing down, and the Z axis pointing forward.

        Computer Vision Coordinate System
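
        As a concrete illustration of these conventions, here is a minimal sketch of the pinhole back-projection used in the previous tutorial (pixel_to_camera is a hypothetical helper name; fx, fy, cx, cy stand for the camera intrinsics):

def pixel_to_camera(j, i, z, fx, fy, cx, cy):
    # Column index j drives x (pointing right), row index i drives y (pointing down),
    # and the depth z points forward from the camera:
    x = z * (j - cx) / fx
    y = z * (i - cy) / fy
    return x, y, z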

        We start by importing the required libraries:

import numpy as np
import open3d as o3d

        For a better understanding, let's load a point cloud from a PLY file, create the default 3D coordinate frame using Open3D, and display both:

# Read point cloud:
pcd = o3d.io.read_point_cloud("data/depth_2_pcd.ply")
# Create a 3D coordinate system:
origin = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.5)
# geometries to draw:
geometries = [pcd, origin]
# Visualize:
o3d.visualization.draw_geometries(geometries)

        The point cloud displayed with the origin of the coordinate system. The blue arrow is the Z axis, the red arrow is the X axis, and the green arrow is the Y axis.

        Knowing that the blue, red, and green arrows represent the Z, X, and Y axes respectively, you can see that the point cloud is represented in the same coordinate system as the Open3D coordinate system. Now, let's get the points with the minimum and maximum values for each axis:

# Get max and min points of each axis x, y and z:
x_max = max(pcd.points, key=lambda x: x[0])
y_max = max(pcd.points, key=lambda x: x[1])
z_max = max(pcd.points, key=lambda x: x[2])
x_min = min(pcd.points, key=lambda x: x[0])
y_min = min(pcd.points, key=lambda x: x[1])
z_min = min(pcd.points, key=lambda x: x[2])
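
        As a side note, pcd.points behaves like a Python sequence, which is why the built-in max and min work above; on large clouds, an equivalent vectorized NumPy version is typically much faster:

points = np.asarray(pcd.points)
x_max = points[points[:, 0].argmax()]  # point with the largest x
y_max = points[points[:, 1].argmax()]  # point with the largest y
z_max = points[points[:, 2].argmax()]  # point with the largest z
x_min = points[points[:, 0].argmin()]
y_min = points[points[:, 1].argmin()]
z_min = points[points[:, 2].argmin()]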

        We could print them, but for better visualization we create a sphere geometry at each point location. By default, Open3D creates 3D geometry at the origin:
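
# A minimal sketch of that default behavior (the radius is an arbitrary choice);
# the sphere appears centered at the origin:
sphere = o3d.geometry.TriangleMesh.create_sphere(radius=0.05)
sphere.compute_vertex_normals()
o3d.visualization.draw_geometries([sphere, origin])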

To move the sphere to a given position, a translation transformation is required. In the example below, the sphere is translated by the vector [1, 1, 1]:
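
# A minimal sketch, reusing the same hypothetical sphere (translate is the Open3D call):
sphere = o3d.geometry.TriangleMesh.create_sphere(radius=0.05)
sphere.translate(np.asarray([1., 1., 1.]))  # move the sphere to position [1, 1, 1]
sphere.compute_vertex_normals()
o3d.visualization.draw_geometries([sphere, origin])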

Let's go back to our example and assign each sphere a color. For each position, we create a sphere and translate it to that position. We then assign it the correct color and add it to the list of geometries to display.

# Colors:
RED = [1., 0., 0.]
GREEN = [0., 1., 0.]
BLUE = [0., 0., 1.]
YELLOW = [1., 1., 0.]
MAGENTA = [1., 0., 1.]
CYAN = [0., 1., 1.]

positions = [x_max, y_max, z_max, x_min, y_min, z_min]
colors = [RED, GREEN, BLUE, MAGENTA, YELLOW, CYAN]
for i in range(len(positions)):
   # Create a sphere mesh:
   sphere = o3d.geometry.TriangleMesh.create_sphere(radius=0.05)
   # move to the point position:
   sphere.translate(np.asarray(positions[i]))
   # add color:
   sphere.paint_uniform_color(np.asarray(colors[i]))
   # compute normals for vertices or faces:
   sphere.compute_vertex_normals()
   # add to geometry list to display later:
   geometries.append(sphere)

# Display:
o3d.visualization.draw_geometries(geometries)

        Well, we can see that the yellow sphere, corresponding to y_min, is on the wall, and the green sphere, corresponding to y_max, is on the ground. Indeed, the Y axis represents the height of the points: in the real world, the highest sphere is the yellow one and the lowest is the green one. However, since the Y axis points downward, the yellow sphere has the minimum y value and the green sphere has the maximum y value.

        Another interesting sphere is the cyan one at the origin. As mentioned in the previous tutorial, pixels with a depth value of 0 are noise points, so the point at the origin (where x = 0, y = 0 and z = 0) is the point computed from these noisy pixels.

3. Ground Detection

        Now that we've covered the essentials, how do we detect the ground? In the previous example, the green sphere is on the ground. Specifically, its center, which corresponds to the point with the maximum value along the Y axis (y_max), is a ground point. Suppose that, to detect the ground, we simply change the color of that point to green.

        If you display the point cloud, you'll notice that not all ground points are green. In fact, only one point corresponding to the center of the previous green sphere is green. This is due to the accuracy and noise level of the depth camera.

        To overcome this limitation, we add a threshold so that every point whose y coordinate lies in the interval [y_max - threshold, y_max] is considered a ground point. To do this, after computing y_max, we check for each point whether its y coordinate is in that interval, and if so, set its color to green. Finally, we update the colors attribute of the point cloud and display the result.

# Define a threshold:
THRESHOLD = 0.075

# Get the max value along the y-axis:
y_max = max(pcd.points, key=lambda x: x[1])[1]

# Get the original points color to be updated:
pcd_colors = np.asarray(pcd.colors)

# Number of points:
n_points = pcd_colors.shape[0]

# update color:
for i in range(n_points):
    # if the current point is a ground point:
    if pcd.points[i][1] >= y_max - THRESHOLD:
        pcd_colors[i] = GREEN  # color it green

pcd.colors = o3d.utility.Vector3dVector(pcd_colors)

# Display:
o3d.visualization.draw_geometries([pcd, origin])

        In this example, we only color the points representing the ground green. In practical applications, the ground is extracted either to define walkable areas, as in robotics or assistive systems for the visually impaired, or to place objects on it, as in interior design systems. The ground can also be removed, so that the remaining points can be segmented or classified, as in scene understanding and object detection systems; see the sketch below.
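
        A minimal sketch of that removal with recent Open3D versions (ground_indices reuses y_max and THRESHOLD from the snippet above; select_by_index keeps or drops points by index):

# Collect the indices of the detected ground points:
ground_indices = [i for i in range(n_points) if pcd.points[i][1] >= y_max - THRESHOLD]
# Keep everything except the ground (invert=True):
pcd_without_ground = pcd.select_by_index(ground_indices, invert=True)
o3d.visualization.draw_geometries([pcd_without_ground, origin])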

4. Organized Point Clouds

        In our first tutorial, we defined a point cloud as a set of 3D points. A set is an unordered structure, so a point cloud represented this way is called an unorganized point cloud. Similar to an RGB matrix, an organized point cloud is a 2D matrix with 3 channels representing the x, y, and z coordinates of the points. The matrix structure preserves the relationship between neighboring points, which reduces the time complexity of some algorithms, such as nearest-neighbor search; a small illustration follows.
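
        To make the complexity point concrete, here is a minimal sketch of how neighbor lookup reduces to index arithmetic in an organized point cloud (neighbors_3x3 is a hypothetical helper; organized_pcd is the (height, width, 3) matrix built below):

def neighbors_3x3(organized_pcd, i, j):
    # The neighbors of the point at pixel (i, j) are simply the adjacent matrix cells;
    # clamp the window at the image borders:
    return organized_pcd[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2].reshape(-1, 3)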

        For example, suppose we are writing a research paper and want to display the result of our ground detection algorithm as a figure. Unfortunately, an animated figure cannot be used in a paper. Therefore, we can either take a screenshot of the point cloud or display the result on the depth image, as shown in the image below. In my opinion, the second option is the better one; in that case, an organized point cloud is required in order to preserve the locations of the depth pixels.

Left: screenshot of the 3D visualization. Right: the result displayed on the depth image.

        Let's create an organized point cloud from the last depth image. We start by importing the camera parameters, as we did in the previous article. We also load the depth image and convert it into a 3-channel grayscale image so that we can color the ground pixels green:

import imageio.v3 as iio
import numpy as np
import matplotlib.pyplot as plt

# Camera parameters:
FX_DEPTH = 5.8262448167737955e+02
FY_DEPTH = 5.8269103270988637e+02
CX_DEPTH = 3.1304475870804731e+02
CY_DEPTH = 2.3844389626620386e+02

# Read depth image:
depth_image = iio.imread('../data/depth_2.png')
# Compute the grayscale image:
depth_grayscale = np.array(256 * depth_image / 0x0fff, dtype=np.uint8)
# Convert a grayscale image to a 3-channel image:
depth_grayscale = np.stack((depth_grayscale,) * 3, axis=-1)

        To compute an organized point cloud, we proceed in the same way as in the previous tutorial. Instead of flattening the depth image, we reshape the index matrices jj and ii so that they have the same shape as the depth image, like this:

# get depth image resolution:
height, width = depth_image.shape
# compute indices and reshape it to have the same shape as the depth image:
jj = np.tile(range(width), height).reshape((height, width))
ii = np.repeat(range(height), width).reshape((height, width))
# Compute constants:
xx = (jj - CX_DEPTH) / FX_DEPTH
yy = (ii - CY_DEPTH) / FY_DEPTH
# compute organised point cloud:
organized_pcd = np.dstack((xx * depth_image, yy * depth_image, depth_image))

        If you print the shape of the created point cloud, you can see that it is a (480, 640, 3) matrix, i.e. a 2D matrix with 3 channels. If you find this code difficult to understand, please go back to the previous tutorial; if it's still unclear, feel free to leave me your questions and I'll be happy to help.
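
        For example, a quick sanity check:

print(organized_pcd.shape)  # (480, 640, 3) for this image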

        Again, we detect the ground as above, but instead of updating the color of the points and displaying the point cloud, we update the pixels of the grayscale image and display it:

# Ground detection:
THRESHOLD = 0.075 * 1000  # same threshold as before, scaled by 1000 because the raw depth values are in millimeters
y_max = max(organized_pcd.reshape((height * width, 3)), key=lambda x: x[1])[
    1]  # Get the max value along the y-axis

# Set the ground pixels to green:
for i in range(height):
    for j in range(width):
        if organized_pcd[i][j][1] >= y_max - THRESHOLD:
            depth_grayscale[i][j] = [0, 255, 0]  # Update the depth image

# Display depth_grayscale:
plt.imshow(depth_grayscale)
plt.show()
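
        As a side note, the double loop above can be replaced by a single vectorized NumPy operation, which is considerably faster; an equivalent sketch:

# Boolean mask of ground pixels, computed over the whole y channel at once:
ground_mask = organized_pcd[:, :, 1] >= y_max - THRESHOLD
depth_grayscale[ground_mask] = [0, 255, 0]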

5. Conclusion

        In this tutorial, to become familiar with point clouds, we introduced the default coordinate system and implemented a simple ground detection algorithm. Ground detection is indeed an important task in applications such as navigation, and several algorithms have been proposed in the literature. The algorithm implemented here is simple: it considers the lowest points to be the ground. However, it is limited by the fact that the depth camera must be parallel to the ground, which is not the case in most practical applications.
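
        For readers who want to go further, one common alternative from the literature is RANSAC plane fitting, which does not require the camera to be parallel to the ground. A minimal sketch using Open3D's segment_plane (the threshold values here are arbitrary starting points, not tuned ones):

# Fit the dominant plane with RANSAC; the inliers are candidate ground points:
plane_model, inliers = pcd.segment_plane(distance_threshold=0.05,
                                         ransac_n=3,
                                         num_iterations=1000)
ground = pcd.select_by_index(inliers)
rest = pcd.select_by_index(inliers, invert=True)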
