Deep learning: a full record of the pitfalls of installing the RangeNet++ inference version under Ubuntu 18.04 + RTX 3080

1. Find the GitHub repository of the RangeNet++ inference version
https://github.com/PRBonn/rangenet_lib
2. Installation
2.1 TensorRT
TensorRT versions correspond strictly to CUDA versions. Although the authors state that their code and pretrained model are only validated on TensorRT 5.1, the CUDA version on my machine is 11.2, which does not correspond to TensorRT 5.1. Since other models are already installed locally, reinstalling CUDA was not an option, so I decided to install the TensorRT version that matches CUDA 11.2.
On the official download page https://developer.nvidia.com/nvidia-tensorrt-8x-download, the TensorRT version corresponding to CUDA 11.2 is 8.0.0.3. Find the matching release "TensorRT 8.0 EA for Linux and CUDA 11.1, 11.2 & 11.3" and click the "TensorRT 8.0 EA for Linux x86_64 and CUDA 11.1, 11.2 & 11.3 TAR package" (see screenshot). This downloads an installer in tar format named TensorRT-8.0.0.3.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz.
After downloading, I installed it under /usr/local:

cd ~/Downloads
sudo tar -xzvf TensorRT-8.0.0.3.Linux.x86_64-gnu.cuda-11.3.cudnn8.2.tar.gz -C /usr/local/


Of course, you can install it anywhere; just configure the environment variables accordingly:

gedit ~/.bashrc

Add the path to the last line of the .bashrc file

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/TensorRT-8.0.0.3/lib

Make the configuration take effect:

source ~/.bashrc

Compiling the TensorRT sample generates a sample_mnist executable for the MNIST dataset in the bin directory:

cd /usr/local/TensorRT-8.0.0.3/samples/sampleMNIST
sudo make
cd ../../bin/
./sample_mnist

A successful run prints:

&&&& RUNNING TensorRT.sample_mnist [TensorRT v8000] # ./sample_mnist
[10/14/2022-17:01:44] [I] Building and running a GPU inference engine for MNIST
[10/14/2022-17:01:44] [I] [TRT] [MemUsageChange] Init CUDA: CPU +146, GPU +0, now: CPU 151, GPU 407 (MiB)
[10/14/2022-17:01:44] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 153 MiB, GPU 407 MiB
[10/14/2022-17:01:45] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.4.1
[10/14/2022-17:01:45] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +182, GPU +76, now: CPU 335, GPU 483 (MiB)
[10/14/2022-17:01:45] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +128, GPU +80, now: CPU 463, GPU 563 (MiB)
[10/14/2022-17:01:45] [W] [TRT] TensorRT was linked against cuDNN 8.2.0 but loaded cuDNN 8.1.0
[10/14/2022-17:01:45] [W] [TRT] Detected invalid timing cache, setup a local cache instead
[10/14/2022-17:01:50] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[10/14/2022-17:01:50] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[10/14/2022-17:01:50] [I] [TRT] Total Host Persistent Memory: 5440
[10/14/2022-17:01:50] [I] [TRT] Total Device Persistent Memory: 0
[10/14/2022-17:01:50] [I] [TRT] Total Scratch Memory: 0
[10/14/2022-17:01:50] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 1 MiB, GPU 4 MiB
[10/14/2022-17:01:50] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.4.1
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 619, GPU 629 (MiB)
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 619, GPU 637 (MiB)
[10/14/2022-17:01:50] [W] [TRT] TensorRT was linked against cuDNN 8.2.0 but loaded cuDNN 8.1.0
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 619, GPU 621 (MiB)
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 619, GPU 605 (MiB)
[10/14/2022-17:01:50] [I] [TRT] [MemUsageSnapshot] Builder end: CPU 619 MiB, GPU 605 MiB
[10/14/2022-17:01:50] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.4.1
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 617, GPU 613 (MiB)
[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 617, GPU 621 (MiB)
[10/14/2022-17:01:50] [W] [TRT] TensorRT was linked against cuDNN 8.2.0 but loaded cuDNN 8.1.0
[10/14/2022-17:01:50] [I] Input:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@%.:@@@@@@@@@@@@
@@@@@@@@@@@@@: *@@@@@@@@@@@@
@@@@@@@@@@@@* =@@@@@@@@@@@@@
@@@@@@@@@@@% :@@@@@@@@@@@@@@
@@@@@@@@@@@- *@@@@@@@@@@@@@@
@@@@@@@@@@# .@@@@@@@@@@@@@@@
@@@@@@@@@@: #@@@@@@@@@@@@@@@
@@@@@@@@@+ -@@@@@@@@@@@@@@@@
@@@@@@@@@: %@@@@@@@@@@@@@@@@
@@@@@@@@+ +@@@@@@@@@@@@@@@@@
@@@@@@@@:.%@@@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@@@@@@@@@@@@@
@@@@@@@% -@@@@@@#..:@@@@@@@@
@@@@@@@% +@@@@@-    :@@@@@@@
@@@@@@@% =@@@@%.#@@- +@@@@@@
@@@@@@@@..%@@@*+@@@@ :@@@@@@
@@@@@@@@= -%@@@@@@@@ :@@@@@@
@@@@@@@@@- .*@@@@@@+ +@@@@@@
@@@@@@@@@@+  .:-+-: .@@@@@@@
@@@@@@@@@@@@+:    :*@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@@@@@@@@@@

[10/14/2022-17:01:50] [I] Output:
0: 
1: 
2: 
3: 
4: 
5: 
6: **********
7: 
8: 
9: 

[10/14/2022-17:01:50] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 617, GPU 605 (MiB)
&&&& PASSED TensorRT.sample_mnist [TensorRT v8000] # ./sample_mnist


Reference: http://www.yaotu.net/biancheng/26353.html

3. Compile

$ mkdir -p ~/catkin_ws/rangenet++/src
$ cd ~/catkin_ws/rangenet++/src
$ git clone https://github.com/PRBonn/rangenet_lib.git
$ cd .. 
$ catkin_make

Compilation reports many errors because the TensorRT API changed substantially between 5.1 and 8.0. While looking for a solution, I found that someone had already adapted rangenet++ to TensorRT 8.0 and open-sourced the code at https://github.com/vincenzo0603/rangenet_lib_forTensorRT8XX.
So I switched to compiling this TensorRT 8.0-adapted version of rangenet++ instead. Some minor TensorRT-related errors still come up during compilation; you need to add the TensorRT-8.0.0.3 paths to CMakeLists.txt, as shown below:

set(CMAKE_PREFIX_PATH "/usr/local/TensorRT-8.0.0.3/lib")
include_directories(/usr/local/TensorRT-8.0.0.3/include)
link_directories(/usr/local/TensorRT-8.0.0.3/lib/)

4. Run
After compilation succeeds, test a demo. First download a pretrained model, then create a model folder under ~/catkin_ws/rangenet++/src/rangenet_lib and put the model there. A single frame of point cloud data is located at ~/catkin_ws/rangenet++/src/rangenet_lib/example/000000.bin. The commands for inference and visualization are as follows:

$ cd ~/catkin_ws/rangenet++
$ ./devel/lib/rangenet_lib/infer -p ~/catkin_ws/rangenet++/src/rangenet_lib/model/darknet53 -s ~/catkin_ws/rangenet++/src/rangenet_lib/example/000000.bin --verbose
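For context, a KITTI .bin scan such as 000000.bin is simply a flat array of float32 values, four per point (x, y, z, remission). A minimal NumPy sketch for loading such a file (the path below is just an example, and a tiny synthetic scan is written first so the snippet is self-contained):

```python
import numpy as np

def load_kitti_bin(path):
    """Load a KITTI .bin scan: N points, 4 float32 values each (x, y, z, remission)."""
    scan = np.fromfile(path, dtype=np.float32)
    return scan.reshape(-1, 4)

# Write a tiny synthetic scan, then read it back.
points = np.array([[1.0, 2.0, 3.0, 0.5],
                   [4.0, 5.0, 6.0, 0.1]], dtype=np.float32)
points.tofile("/tmp/000000.bin")
scan = load_kitti_bin("/tmp/000000.bin")
print(scan.shape)  # (2, 4)
```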

The running result (see screenshot) is obviously wrong. Changing kFP16 to kTF32 in the builder configuration solves this problem; kTF32 is the TensorFloat-32 mode introduced for Ampere GPUs such as the RTX 3080, and the FP16 path apparently misbehaves here. Note that after this modification you must delete model.trt and rebuild for the change to take effect.
So far, the deployment of the rangenet++ inference version is complete. However, this version cannot read ROS bag data in real time, so it still does not meet my needs.
5. Installing the RangeNet++ version that subscribes to ROS messages in real time
While investigating how to make rangenet++ subscribe to ROS messages in real time, I found that someone had already implemented this and open-sourced the code at https://github.com/Natsu-Akatsuki/RangeNetTrt8
The author states that this project only supports TensorRT 8.2.3, but I figured it could be deployed with minor changes on my current TensorRT 8.0.0.3. The author provides two installation methods; since the docker environment requires a lot of disk space, I chose the second method, installing locally.
The onnx model was already downloaded earlier, so on top of the environment from the previous rangenet++ deployment, this project only needs libtorch in addition. The libtorch version also corresponds strictly to the CUDA version, and the libtorch version matches the pytorch version. Since I had already verified that pytorch 1.8.1+cu111 works, I chose libtorch 1.8.1; download link: https://download.pytorch.org/libtorch/cu111/libtorch-cxx11-abi-shared-with-deps-1.8.1%2Bcu111.zip
The installation command is as follows

unzip libtorch-cxx11-abi-shared-with-deps-1.8.1+cu111.zip 

Configure environment variables

gedit ~/.bashrc

Add the path to the last line of the .bashrc file

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/tools/libtorch/lib

Make the configuration take effect:

source ~/.bashrc

Reference: https://blog.csdn.net/agq358/article/details/121074551
https://blog.csdn.net/agq358/article/details/121078611

6. Compiling the real-time ROS-subscription version of RangeNet++
Modify CMakeLists.txt: update the paths of TensorRT, libtorch, and the other dependency libraries.
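As a rough sketch of what those CMakeLists.txt changes might look like (the paths below are my local install locations; adjust them to yours, and the commented target_link_libraries line is only illustrative):

```cmake
# Adjust these to your own TensorRT / libtorch install locations.
set(TENSORRT_ROOT /usr/local/TensorRT-8.0.0.3)
set(Torch_DIR ~/tools/libtorch/share/cmake/Torch)

include_directories(${TENSORRT_ROOT}/include)
link_directories(${TENSORRT_ROOT}/lib)

find_package(Torch REQUIRED)

# Later, link the target against TensorRT and libtorch, e.g.:
# target_link_libraries(${PROJECT_NAME} nvinfer ${TORCH_LIBRARIES})
```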

$ catkin_make

Compilation problems and solutions:
Problem 1:
math_functions.hpp cannot be found (see screenshot).

Solution:
There is no math_functions.hpp in /usr/local/cuda/include/; create a symbolic link:

sudo ln -s /usr/local/cuda/include/crt/math_functions.hpp /usr/local/cuda/include/math_functions.hpp

Problem 2:

[Error] In file included from /usr/local/cuda/include/cuda_runtime.h:115:0,
                 from <command-line>:0:
/usr/local/cuda/include/crt/common_functions.h:74:24: error: token ""__CUDACC_VER__ is no longer supported. Use __CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__, and __CUDACC_VER_BUILD__ instead."" is not valid in preprocessor expressions
 #define __CUDACC_VER__ "__CUDACC_VER__ is no longer supported. Use __CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__, and __CUDACC_VER_BUILD__ instead."
                        ^
/usr/local/cuda/include/crt/common_functions.h:74:24: note: in definition of macro '__CUDACC_VER__'

Solution:

cd /usr/local/cuda/include/crt
sudo gedit common_functions.h
Comment out the corresponding content on line 74 directly.

Problem 3:
constexpr cannot be used in project_kernel.cu.
Solution:
Delete constexpr, as shown below:

__device__ float means[] = {12.12, 10.88, 0.23, -1.04, 0.21};
__device__ float stds[] = {12.32, 11.47, 6.91, 0.86, 0.16};
// __device__ constexpr float means[] = {0.0, 0, 0.0, 0.0, 0.0};
// __device__ constexpr float stds[] = {1.0, 1.0, 1.0, 1.0, 1.0};
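For context, these constants are the per-channel means and standard deviations of the projected range-image input channels (range, x, y, z, remission); RangeNet++ normalizes each channel as (value - mean) / std before inference. A small NumPy sketch of that normalization, using synthetic all-zero data rather than an actual projected scan:

```python
import numpy as np

# Per-channel statistics for (range, x, y, z, remission), as in the CUDA snippet above.
means = np.array([12.12, 10.88, 0.23, -1.04, 0.21], dtype=np.float32)
stds = np.array([12.32, 11.47, 6.91, 0.86, 0.16], dtype=np.float32)

def normalize(img):
    """Normalize a (5, H, W) range image channel-wise: (x - mean) / std."""
    return (img - means[:, None, None]) / stds[:, None, None]

img = np.zeros((5, 64, 2048), dtype=np.float32)  # synthetic all-zero input
out = normalize(img)
print(out[0, 0, 0])  # (0 - 12.12) / 12.32, approximately -0.9838
```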

Problem 4:
The utils/pointcloud_io.h file reports an error about pcd_io.h.
Solution:
Replace

#include <pcl/io/pcd_io.h>

with

#include <pcl/point_cloud.h>
#include <fstream>

Problem 5:
(screenshot of the error)
Solution: modify CMakeLists.

Problem 6:
(screenshot of the error)
Solution: comment out line 596.

7. Converting KITTI data to a bag and running
After the real-time ROS-subscription version of rangenet++ compiles, convert the KITTI data to a bag. Since the kitti2bag tool requires a Python 2.7 interpreter and my local Python is 3.9, create a new Python 2.7 virtual environment and install kitti2bag there.
After generating the bag, modify the launch files and run the following commands:

$ cd ~/catkin_ws/rangenet++
$ roslaunch src/RangeNetTrt8/launch/rangenet.launch
$ roslaunch src/RangeNetTrt8/launch/rosbag.launch
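For reference, a bag-playing launch file like rosbag.launch typically boils down to something like the following sketch (the bag path is a placeholder; the actual file in the repository may differ):

```xml
<!-- Hypothetical minimal bag-play launch file; adjust the bag path. -->
<launch>
  <!-- Use simulated clock so nodes run on bag time. -->
  <param name="use_sim_time" value="true"/>
  <node pkg="rosbag" type="play" name="rosbag_play"
        args="--clock /path/to/kitti.bag"/>
</launch>
```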

The effect is shown in the screen recording.

(screen recording: 2022-11-13 23:59:11)

Origin blog.csdn.net/weixin_40826634/article/details/127836169