"Visual SLAM Fourteen Lectures" ch7 visual odometer 1 study notes (4) - practical part: solving PnP pose_estimation_3D2D

        In this blog, I mainly record my study notes for the practical part of the seventh lecture of "Visual SLAM Fourteen Lectures", the practical part of visual odometer 1, solving the code for solving PnP, 3d and 2d points to solve the camera pose.

0. Preparation:

        When running this code, the g2o library is required. The g2o download package can be downloaded from my gitcode:

sticker_ruan/g2o download package GitCode

The download steps of g2o and ceres can refer to:

g2o and ceres installation

1. Specific operation process:

(1) First, you need to modify the original CMakeLists.txt file, otherwise an error will be reported. The modified content is as follows:

cmake_minimum_required(VERSION 2.8)
 project(vo1)
set(CMAKE_BUILD_TYPE "Release")
add_definitions("-DENABLE_SSE")
# set(CMAKE_CXX_FLAGS "-std=c++11 -O2 ${SSE_FLAGS} -msse4")
set(CMAKE_CXX_FLAGS "-std=c++14 -O2 ${SSE_FLAGS} -msse4")
list(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake)

find_package(OpenCV 3 REQUIRED)
find_package(G2O REQUIRED)
find_package(Sophus REQUIRED)
include_directories(
${OpenCV_INCLUDE_DIRS}
${G2O_INCLUDE_DIRS}
${Sophus_INCLUDE_DIRS}
"/usr/local/include/eigen3/"
)

add_executable(orb_cv orb_cv.cpp)
target_link_libraries(orb_cv ${OpenCV_LIBS} fmt) #记住fmt前面一定要空格,不然会报错
add_executable(orb_self orb_self.cpp)
target_link_libraries(orb_self ${OpenCV_LIBS} fmt)

# #add_executable( pose_estimation_2d2d pose_estimation_2d2d.cpp extra.cpp ) # use this if in OpenCV2 
add_executable(pose_estimation_2d2d pose_estimation_2d2d.cpp)
target_link_libraries(pose_estimation_2d2d ${OpenCV_LIBS} fmt)

# # add_executable( triangulation triangulation.cpp extra.cpp) # use this if in opencv2
add_executable(triangulation triangulation.cpp)
target_link_libraries(triangulation ${OpenCV_LIBS} fmt)
add_executable(pose_estimation_3d2d pose_estimation_3d2d.cpp)
target_link_libraries(pose_estimation_3d2d
g2o_core g2o_stuff
${OpenCV_LIBS} fmt)
add_executable(pose_estimation_3d3d pose_estimation_3d3d.cpp)
target_link_libraries(pose_estimation_3d3d
g2o_core g2o_stuff
${OpenCV_LIBS} fmt)
# add_executable(pose_estimation_3d3d pose_estimation_3d3d.cpp)
# target_link_libraries(pose_estimation_3d3d
# g2o_core g2o_stuff
# ${OpenCV_LIBS})

 The main reason is that the address of eigen3 has been changed, the c++ standard has been changed from c++11 to c++14, and fmt is added after each link library. Note that a space must be added before fmt, otherwise, you don’t know what’s wrong ( Distressing myself for a minute...)

(2) Two color maps and two depth maps are used in this code (but in fact only two color maps and one depth map are used), you need to go to the build folder to check whether there are four Figure, because in most cases, these four pictures are stored in the folder of the previous level of the build, and then you enter the command in the terminal to display that the core has been dumped (because the picture entered by the terminal cannot be found). Next there are two methods:

a. Put the four pictures into the build folder , as shown below:

 Then enter in the terminal:

./pose_estimation_3d2d 1.png 2.png 1_depth.png 2_depth.png

b. Modify the code content

        In the pose_estimation_3d2d code, find the positions of argv[1], argv[2], and argv[3], and change these to the corresponding image addresses, for example:

argv[1]——>/home/rxz/slambook2/ch7/1.png

argv[2]——>/home/rxz/slambook2/ch7/2.png

argv[3]——>/home/rxz/slambook2/ch7/1_depth.png

Note: The addresses here are the locations where the pictures correspond.

2. Code explanation

2.1 Code flow

Source code (with detailed comments):

#include <iostream>
#include <opencv2/core/core.hpp>//opencv核心模块
#include <opencv2/features2d/features2d.hpp>//opencv特征点
#include <opencv2/highgui/highgui.hpp>//opencv gui
#include <opencv2/calib3d/calib3d.hpp>//求解器头文件
#include <Eigen/Core>//eigen核心模块
#include <g2o/core/base_vertex.h>//g2o顶点(Vertex)头文件 视觉slam十四讲p141用顶点表示优化变量,用边表示误差项
#include <g2o/core/base_unary_edge.h>//g2o边(edge)头文件
#include <g2o/core/sparse_optimizer.h>//稠密矩阵求解
#include <g2o/core/block_solver.h>//求解器头文件
#include <g2o/core/solver.h>
#include <g2o/core/optimization_algorithm_gauss_newton.h>//高斯牛顿算法头文件
#include <g2o/solvers/dense/linear_solver_dense.h>//线性求解
#include <sophus/se3.hpp>//李群李代数se3
#include <chrono>


using namespace std;
using namespace cv;

void find_feature_matches(
  const Mat &img_1, const Mat &img_2,
  std::vector<KeyPoint> &keypoints_1,
  std::vector<KeyPoint> &keypoints_2,
  std::vector<DMatch> &matches);

// 像素坐标转相机归一化坐标
Point2d pixel2cam(const Point2d &p, const Mat &K);

// BA by g2o
typedef vector<Eigen::Vector2d, Eigen::aligned_allocator<Eigen::Vector2d>> VecVector2d;
typedef vector<Eigen::Vector3d, Eigen::aligned_allocator<Eigen::Vector3d>> VecVector3d;

void bundleAdjustmentG2O(
  const VecVector3d &points_3d,
  const VecVector2d &points_2d,
  const Mat &K,
  Sophus::SE3d &pose
);

// BA by gauss-newton
void bundleAdjustmentGaussNewton(
  const VecVector3d &points_3d,
  const VecVector2d &points_2d,
  const Mat &K,
  Sophus::SE3d &pose
);

int main(int argc, char **argv) {
  if (argc != 4) {
    cout << "usage: pose_estimation_3d2d img1 img2 depth1 depth2" << endl;
    return 1;
  }
  //-- 读取图像
  Mat img_1 = imread(argv[1], CV_LOAD_IMAGE_COLOR);
  Mat img_2 = imread(argv[2], CV_LOAD_IMAGE_COLOR);
  assert(img_1.data && img_2.data && "Can not load images!");

  vector<KeyPoint> keypoints_1, keypoints_2;
  vector<DMatch> matches;
  find_feature_matches(img_1, img_2, keypoints_1, keypoints_2, matches);
  cout << "一共找到了" << matches.size() << "组匹配点" << endl;

  // 建立3D点 在第一张图的深度图中寻找它们的深度,并求出空间位置,得到3D点
  Mat d1 = imread(argv[3], CV_LOAD_IMAGE_UNCHANGED);       // 深度图为16位无符号数,单通道图像
  Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);  //相机内参矩阵
  vector<Point3f> pts_3d;
  vector<Point2f> pts_2d;
  for (DMatch m:matches) {
    ushort d = d1.ptr<unsigned short>(int(keypoints_1[m.queryIdx].pt.y))[int(keypoints_1[m.queryIdx].pt.x)];
    if (d == 0)   // bad depth 排除深度为0的关键点
      continue;
    float dd = d / 5000.0;
    Point2d p1 = pixel2cam(keypoints_1[m.queryIdx].pt, K); //像素坐标转为相机坐标
    pts_3d.push_back(Point3f(p1.x * dd, p1.y * dd, dd)); //得到像素点在相机坐标下的3D信息
    pts_2d.push_back(keypoints_2[m.trainIdx].pt);  //以第二个图像的特征点位置作为2D点
  }

  cout << "3d-2d pairs: " << pts_3d.size() << endl;

  chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
  Mat r, t;
  //1.PnP求解  opencv求解
  solvePnP(pts_3d, pts_2d, K, Mat(), r, t, false); // 调用OpenCV 的 PnP 求解,可选择EPNP,DLS等方法
  Mat R;
  cv::Rodrigues(r, R); // r为旋转向量形式,用Rodrigues公式转换为矩阵 见书上P53 式(3.15)
  chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
  chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "solve pnp in opencv cost time: " << time_used.count() << " seconds." << endl;

  cout << "R=" << endl << R << endl;
  cout << "t=" << endl << t << endl;
  //2.最小化重投影误差求最优解
  VecVector3d pts_3d_eigen;
  VecVector2d pts_2d_eigen;
  for (size_t i = 0; i < pts_3d.size(); ++i) {
    pts_3d_eigen.push_back(Eigen::Vector3d(pts_3d[i].x, pts_3d[i].y, pts_3d[i].z));
    pts_2d_eigen.push_back(Eigen::Vector2d(pts_2d[i].x, pts_2d[i].y));
  }
//(1)利用高斯牛顿法获取相机位资R和T
  cout << "calling bundle adjustment by gauss newton" << endl;
  Sophus::SE3d pose_gn;
  t1 = chrono::steady_clock::now();
  bundleAdjustmentGaussNewton(pts_3d_eigen, pts_2d_eigen, K, pose_gn);
  t2 = chrono::steady_clock::now();
  time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "solve pnp by gauss newton cost time: " << time_used.count() << " seconds." << endl;
//(2)通过g2o图优化算法
  cout << "calling bundle adjustment by g2o" << endl;
  Sophus::SE3d pose_g2o;
  t1 = chrono::steady_clock::now();
  bundleAdjustmentG2O(pts_3d_eigen, pts_2d_eigen, K, pose_g2o);
  t2 = chrono::steady_clock::now();
  time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "solve pnp by g2o cost time: " << time_used.count() << " seconds." << endl;
  return 0;
}

void find_feature_matches(const Mat &img_1, const Mat &img_2,
                          std::vector<KeyPoint> &keypoints_1,
                          std::vector<KeyPoint> &keypoints_2,
                          std::vector<DMatch> &matches) {
  //-- 初始化
  Mat descriptors_1, descriptors_2;
  // used in OpenCV3
  Ptr<FeatureDetector> detector = ORB::create();
  Ptr<DescriptorExtractor> descriptor = ORB::create();
  // use this if you are in OpenCV2
  // Ptr<FeatureDetector> detector = FeatureDetector::create ( "ORB" );
  // Ptr<DescriptorExtractor> descriptor = DescriptorExtractor::create ( "ORB" );
  Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
  //-- 第一步:检测 Oriented FAST 角点位置
  detector->detect(img_1, keypoints_1);
  detector->detect(img_2, keypoints_2);

  //-- 第二步:根据角点位置计算 BRIEF 描述子
  descriptor->compute(img_1, keypoints_1, descriptors_1);
  descriptor->compute(img_2, keypoints_2, descriptors_2);

  //-- 第三步:对两幅图像中的BRIEF描述子进行匹配,使用 Hamming 距离
  vector<DMatch> match;
  // BFMatcher matcher ( NORM_HAMMING );
  matcher->match(descriptors_1, descriptors_2, match);

  //-- 第四步:匹配点对筛选
  double min_dist = 10000, max_dist = 0;

  //找出所有匹配之间的最小距离和最大距离, 即是最相似的和最不相似的两组点之间的距离
  for (int i = 0; i < descriptors_1.rows; i++) {
    double dist = match[i].distance;
    if (dist < min_dist) min_dist = dist;
    if (dist > max_dist) max_dist = dist;
  }

  printf("-- Max dist : %f \n", max_dist);
  printf("-- Min dist : %f \n", min_dist);

  //当描述子之间的距离大于两倍的最小距离时,即认为匹配有误.但有时候最小距离会非常小,设置一个经验值30作为下限.
  for (int i = 0; i < descriptors_1.rows; i++) {
    if (match[i].distance <= max(2 * min_dist, 30.0)) {
      matches.push_back(match[i]);
    }
  }
}

Point2d pixel2cam(const Point2d &p, const Mat &K) {
  return Point2d
    (
      (p.x - K.at<double>(0, 2)) / K.at<double>(0, 0),
      (p.y - K.at<double>(1, 2)) / K.at<double>(1, 1)
    );
}

void bundleAdjustmentGaussNewton(
  const VecVector3d &points_3d,
  const VecVector2d &points_2d,
  const Mat &K,
  Sophus::SE3d &pose) {
  typedef Eigen::Matrix<double, 6, 1> Vector6d;
  const int iterations = 10;
  double cost = 0, lastCost = 0;
  //读取相机内参中的一些参数
  double fx = K.at<double>(0, 0);
  double fy = K.at<double>(1, 1);
  double cx = K.at<double>(0, 2);
  double cy = K.at<double>(1, 2);

  for (int iter = 0; iter < iterations; iter++) {
    Eigen::Matrix<double, 6, 6> H = Eigen::Matrix<double, 6, 6>::Zero();//定义一个6*6的零矩阵
    Vector6d b = Vector6d::Zero(); //定义6*1的列向量,元素全为0

    cost = 0;
    // compute cost
    for (int i = 0; i < points_3d.size(); i++) {
      Eigen::Vector3d pc = pose * points_3d[i]; //P' = TP =[X',Y',Z'] 视觉slam十四讲p186式7.38
      double inv_z = 1.0 / pc[2];//相当于1 / Z'
      double inv_z2 = inv_z * inv_z;//相当于( 1 / Z' ) * ( 1 / Z' )
      Eigen::Vector2d proj(fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy);//转化为像素坐标 u = fx(X' / z') +cx,v = fy(Y' / z') +cy 视觉slam十四讲p186式7.41
      Eigen::Vector2d e = points_2d[i] - proj;   //比较两个向量,一个是之前利用特征提取得出来的2d像素坐标,一个是现在将3D点投影到2D坐标上

      cost += e.squaredNorm(); //squaredNorm()矩阵/向量所有元素的平方和
      Eigen::Matrix<double, 2, 6> J;//2*6雅克比矩阵 视觉slam十四讲p186式7.46
      //雅克比矩阵表达式见 视觉slam十四讲p186式7.46  
       //导数矩阵 
      J << -fx * inv_z, // -fx * inv_z 相当于-fx / Z'
        0,//0
        fx * pc[0] * inv_z2, //fx * pc[0] * inv_z2相当于fx * X' / ( Z' * Z' )
        fx * pc[0] * pc[1] * inv_z2,
        -fx - fx * pc[0] * pc[0] * inv_z2,//-fx - fx * pc[0] * pc[0] * inv_z2相当于fx * X' * Y' / ( Z' * Z')
        fx * pc[1] * inv_z,    //fx * pc[1] * inv_z相当于fx * Y' / Z'
        0,//0
        -fy * inv_z,  //-fy * inv_z相当于-fy / Z'
        fy * pc[1] * inv_z2,    //fy * pc[1] * inv_z2相当于fy * Y' / (Z' * Z')
        fy + fy * pc[1] * pc[1] * inv_z2,    //fy + fy * pc[1] * pc[1] * inv_z2相当于fy + fy * Y'*Y' / (Z' * Z')
        -fy * pc[0] * pc[1] * inv_z2,    //-fy * pc[0] * pc[1] * inv_z2相当于fy * X' * Y' / ( Z' * Z')
        -fy * pc[0] * inv_z;//-fy * pc[0] * inv_z相当于-fy * X' / Z'
 
      H += J.transpose() * J;
      b += -J.transpose() * e;
    }

    Vector6d dx;
    dx = H.ldlt().solve(b);

    if (isnan(dx[0])) {
      cout << "result is nan!" << endl;
      break;
    }

    if (iter > 0 && cost >= lastCost) {
      // cost increase, update is not good
      cout << "cost: " << cost << ", last cost: " << lastCost << endl;
      break;
    }

    // update your estimation
    pose = Sophus::SE3d::exp(dx) * pose;
    lastCost = cost;

    cout << "iteration " << iter << " cost=" << std::setprecision(12) << cost << endl;
    if (dx.norm() < 1e-6) {
      // converge
      break;
    }
  }

  cout << "pose by g-n: \n" << pose.matrix() << endl;
}

/// vertex and edges used in g2o ba
class VertexPose : public g2o::BaseVertex<6, Sophus::SE3d> {
public:
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW;

  virtual void setToOriginImpl() override {
    _estimate = Sophus::SE3d();
  }

  /// left multiplication on SE3
  virtual void oplusImpl(const double *update) override {
    Eigen::Matrix<double, 6, 1> update_eigen;
    update_eigen << update[0], update[1], update[2], update[3], update[4], update[5];
    _estimate = Sophus::SE3d::exp(update_eigen) * _estimate;
  }

  virtual bool read(istream &in) override {}

  virtual bool write(ostream &out) const override {}
};

class EdgeProjection : public g2o::BaseUnaryEdge<2, Eigen::Vector2d, VertexPose> {
public:
  EIGEN_MAKE_ALIGNED_OPERATOR_NEW;

  EdgeProjection(const Eigen::Vector3d &pos, const Eigen::Matrix3d &K) : _pos3d(pos), _K(K) {}

  virtual void computeError() override {
    const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
    Sophus::SE3d T = v->estimate();
    Eigen::Vector3d pos_pixel = _K * (T * _pos3d);
    pos_pixel /= pos_pixel[2];
    _error = _measurement - pos_pixel.head<2>();
  }

  virtual void linearizeOplus() override {
    const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
    Sophus::SE3d T = v->estimate();
    Eigen::Vector3d pos_cam = T * _pos3d;
    double fx = _K(0, 0);
    double fy = _K(1, 1);
    double cx = _K(0, 2);
    double cy = _K(1, 2);
    double X = pos_cam[0];
    double Y = pos_cam[1];
    double Z = pos_cam[2];
    double Z2 = Z * Z;
    _jacobianOplusXi
      << -fx / Z, 0, fx * X / Z2, fx * X * Y / Z2, -fx - fx * X * X / Z2, fx * Y / Z,
      0, -fy / Z, fy * Y / (Z * Z), fy + fy * Y * Y / Z2, -fy * X * Y / Z2, -fy * X / Z;
  }

  virtual bool read(istream &in) override {}

  virtual bool write(ostream &out) const override {}

private:
  Eigen::Vector3d _pos3d;
  Eigen::Matrix3d _K;
};

void bundleAdjustmentG2O(
  const VecVector3d &points_3d,
  const VecVector2d &points_2d,
  const Mat &K,
  Sophus::SE3d &pose) {

  // 构建图优化,先设定g2o
  typedef g2o::BlockSolver<g2o::BlockSolverTraits<6, 3>> BlockSolverType;  // pose is 6, landmark is 3
  typedef g2o::LinearSolverDense<BlockSolverType::PoseMatrixType> LinearSolverType; // 线性求解器类型
  // 梯度下降方法,可以从GN, LM, DogLeg 中选
  auto solver = new g2o::OptimizationAlgorithmGaussNewton(
    g2o::make_unique<BlockSolverType>(g2o::make_unique<LinearSolverType>()));
  g2o::SparseOptimizer optimizer;     // 图模型
  optimizer.setAlgorithm(solver);   // 设置求解器
  optimizer.setVerbose(true);       // 打开调试输出

  // vertex
  VertexPose *vertex_pose = new VertexPose(); // camera vertex_pose
  vertex_pose->setId(0);
  vertex_pose->setEstimate(Sophus::SE3d());
  optimizer.addVertex(vertex_pose);

  // K
  Eigen::Matrix3d K_eigen;
  K_eigen <<
          K.at<double>(0, 0), K.at<double>(0, 1), K.at<double>(0, 2),
    K.at<double>(1, 0), K.at<double>(1, 1), K.at<double>(1, 2),
    K.at<double>(2, 0), K.at<double>(2, 1), K.at<double>(2, 2);

  // edges
  int index = 1;
  for (size_t i = 0; i < points_2d.size(); ++i) {
    auto p2d = points_2d[i];
    auto p3d = points_3d[i];
    EdgeProjection *edge = new EdgeProjection(p3d, K_eigen);
    edge->setId(index);
    edge->setVertex(0, vertex_pose);
    edge->setMeasurement(p2d);
    edge->setInformation(Eigen::Matrix2d::Identity());
    optimizer.addEdge(edge);
    index++;
  }

  chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
  optimizer.setVerbose(true);
  optimizer.initializeOptimization();
  optimizer.optimize(10);
  chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
  chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "optimization costs time: " << time_used.count() << " seconds." << endl;
  cout << "pose estimated by g2o =\n" << vertex_pose->estimate().matrix() << endl;
  pose = vertex_pose->estimate();
}

2.2 Function explanation

solvePnP(pts_3d, pts_2d, K, Mat(), r, t, false);//solve PNP problem

pts_3d: the world coordinates of the feature points, the coordinate values ​​must be float type, not double type, can be mat type, or directly input vector;

pts_2d: The pixel coordinates of the feature points in the image, which can be input in mat type or vector, but the order of the input points should be consistent with the world coordinates of the previous feature points;

K: camera internal parameter matrix

Mat(): Distortion parameters of the camera [mat_()]

r: the rotation vector of the output

t: translation matrix for the output

2.3 Running results

 

 

Guess you like

Origin blog.csdn.net/weixin_70026476/article/details/127919447