SURF feature extraction



Original text: https://blog.csdn.net/cxp2205455256/article/details/41311013


1. Integral image
    The concept of the integral image was proposed by Viola and Jones. The value of the integral image at any point (i, j) is the sum of the grayscale values of all pixels in the rectangle extending from the upper-left corner of the original image to (i, j). Its mathematical formula is shown in Figure 1:
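The formula in Figure 1 did not survive extraction; the standard definition, which is what the text describes, is:

```latex
ii(i, j) = \sum_{i' \le i,\; j' \le j} p(i', j')
```

where p(i', j') is the grayscale value of the original image at (i', j').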



Then, when we want to compute the sum over a rectangular region of the image, we only need to look up the values at the region's four corner points in the integral image: the sum is obtained with one addition and two subtractions. The mathematical formula is as follows:


2. Hessian matrix detector
1. Blob detection
    Blob: a region that differs in color or grayscale from its surroundings.
    For a one-dimensional signal, convolving it with the second derivative of a Gaussian (the Laplacian of Gaussian) produces a zero-crossing at the signal's edges, as shown in the following figure:


    The response value of the Laplacian of Gaussian (LoG) detector measures the similarity between the image structure and the filter. The following figure shows the 3D and grayscale plots of the Laplacian of Gaussian of an image. When the size of a blob in the image matches the shape of the LoG function, the Laplacian response of the image reaches its maximum.


    The Hessian matrix uses second-order derivatives for blob detection. The matrix is as follows:
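The figure with the matrix is missing; the scale-dependent Hessian as defined in the SURF paper is:

```latex
H(\mathbf{x}, \sigma) =
\begin{bmatrix}
L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\
L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma)
\end{bmatrix}
```

where L_{xx}(x, σ) is the convolution of the image at point x with the second-order Gaussian derivative ∂²g(σ)/∂x², and similarly for L_{xy} and L_{yy}.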


2. Hessian matrix and box filters

    The Hessian matrix of an image is as follows:


Their 3D and grayscale images are shown below:


From this, the convolution of the Gaussian second-derivative templates with the image can be approximated by box-filter operations, as shown in the following figure:


3. Simplification of the determinant of the Hessian matrix

When we filter the image with a Gaussian second-order derivative filter of sigma = 1.2, approximated by a 9×9 template (the smallest scale-space value), the determinant of the Hessian matrix can be simplified as follows:



The constant C does not affect the comparison of extreme points, so the final simplified formula is as follows. This is also the source of the Hessian response formula in the SURF paper:
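The simplified formula in the missing figure is, per the SURF paper:

```latex
\det(H_{\mathrm{approx}}) = D_{xx} D_{yy} - (0.9\, D_{xy})^2
```

where D_{xx}, D_{yy}, and D_{xy} are the box-filter approximations of the corresponding Gaussian second derivatives.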


    In addition, the response should be normalized by the filter size, to ensure that the Frobenius norm of a filter of any size is uniform. The factor 0.9² is the relative weight w of the filter response, which balances the terms of the Hessian determinant and keeps the approximate Gaussian kernels consistent with the true Gaussian kernels. In theory, the value of w differs for each σ and its corresponding template size, but for simplicity it can be treated as a constant. The approximate Hessian determinant represents the blob response at a point x in the image. Traversing all pixels produces a blob-response image at one scale; using templates of different sizes then yields a multi-scale pyramid of blob-response images, over which the search for blob-response extrema is carried out.

3. 3D non-maximum suppression

   1. Scale pyramid structure

    In SURF, Hessian-matrix responses are computed over the integral image with box-filter templates of increasing size, and 3D non-maximum suppression is then applied to the response images to obtain extrema at various scales. The following shows two different kinds of pyramid; SURF's pyramid is of the second kind:


    In SURF, a 9×9 filter is used as the starting filter; subsequent filter sizes can be computed by the following formula:


In the formula, both octave and interval start from 1; that is, for the 0th layer of the 0th group, octave = 1 and interval = 1.
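The formula itself is in a missing figure; a sketch consistent with the filter-size sequences quoted below (9, 15, 21, 27 for the first group, and so on), using the standard SURF relation size = 3 × (2^octave × interval + 1), is:

```cpp
#include <cassert>
#include <cmath>

// Filter side length for a given octave and interval (both 1-based), per
// the SURF sizing rule: size = 3 * (2^octave * interval + 1).
// octave = 1, interval = 1..4 gives 9, 15, 21, 27; octave = 2 gives
// 15, 27, 39, 51; octave = 3 gives 27, 51, 75, 99.
int filterSize(int octave, int interval) {
    return 3 * ((1 << octave) * interval + 1);
}

// Approximate Gaussian scale of a filter: the starting 9x9 template
// corresponds to sigma = 1.2, and scale grows linearly with template size.
double filterScale(int octave, int interval) {
    return 1.2 * filterSize(octave, interval) / 9.0;
}
```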

    The rationale for defining the filter size in this way is as follows:


    The relationship between the filter response length, the filter size, the group index o, the layer index s, and the scale sigma is as follows:


    A similar approach is used for the other groups of template sequences, doubling the filter-size increment each time (6, 12, 24, 48). This gives the filter sizes of the second group: 15, 27, 39, and 51. The filter sizes of the third group are 27, 51, 75, and 99. If the original image is still larger than the corresponding filter sizes, scale-space analysis can also be performed in a fourth group, whose template sizes are 51, 99, 147, and 195. The figure below shows the filter-size progression for groups 1 to 3:


    In normal scale analysis, the number of detected blobs decays rapidly as the scale increases, so 3-4 groups are generally enough. At the same time, to reduce the amount of computation and improve speed, the sampling interval can be set to 2 when filtering.
    2. To locate interest points in the image and across scales, 3×3×3 neighborhood non-maximum suppression is used: all values below a preset threshold are discarded, and raising the threshold reduces the number of detected feature points until only the few strongest remain. During detection, a filter of the size corresponding to the image resolution of the scale layer is used. Each candidate pixel is compared with the remaining 8 points in its own scale layer and the 9 points in each of the scale layers above and below it, 26 points in total. If the value of the pixel marked 'x' in the figure is greater than all of these surrounding pixels, it is identified as a feature point of the region.
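The 26-neighbor comparison above can be sketched as follows (a minimal illustration, not the OpenSURF implementation; names are hypothetical, and bounds checking at the pyramid borders is omitted):

```cpp
#include <cassert>
#include <vector>

// One response layer of the pyramid: a 2D grid of Hessian responses.
using Layer = std::vector<std::vector<float>>;

// 3D non-maximum suppression test: a response at (layer l, row r, col c)
// is a candidate feature point only if it exceeds the threshold and is
// strictly greater than all 26 neighbours in its 3x3x3 neighbourhood
// (8 in its own layer, 9 in each adjacent layer).
bool isLocalMax3D(const std::vector<Layer>& resp,
                  int l, int r, int c, float threshold) {
    float v = resp[l][r][c];
    if (v <= threshold) return false;           // discard weak responses
    for (int dl = -1; dl <= 1; ++dl)
        for (int dr = -1; dr <= 1; ++dr)
            for (int dc = -1; dc <= 1; ++dc) {
                if (dl == 0 && dr == 0 && dc == 0) continue;
                if (resp[l + dl][r + dr][c + dc] >= v) return false;
            }
    return true;
}
```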


    3. Precise localization of local maxima

    A 3-dimensional linear interpolation method is used to obtain sub-pixel feature points, and points whose values are below a certain threshold are also removed.


4. Feature point descriptor

1. Feature point orientation assignment

Taking the feature point as the center, compute the Haar wavelet responses (Haar wavelet side length 4s) in the x and y directions for all points in a neighborhood of radius 6s, where s is the scale value of the feature point. The Haar wavelet template is shown in the figure:


    After computing the Haar wavelet responses of the image in the x and y directions, the two values are weighted by a Gaussian with sigma = 2s; the weighted values represent the direction components in the horizontal and vertical directions respectively.

The Haar responses reflect the grayscale changes of the image, so the main orientation describes the direction in which the grayscale changes most strongly.

    Next, slide a fan-shaped window with an opening angle of π/3 around the feature point, and accumulate the Haar wavelet responses dx and dy within the window:


    The sliding of the fan-shaped window is shown in the figure:


    The C++ code implementation in OpenSURF is as follows:

  // Walk a 13x13 grid of step s centred on the feature point (r, c).
  for(int i = -6; i <= 6; ++i) 
  {
    for(int j = -6; j <= 6; ++j) 
    {
      // Keep only samples that fall inside the circle of radius 6s.
      if(i*i + j*j < 36) 
      {
        // Precomputed Gaussian weight for this sample position.
        gauss = static_cast<float>(gauss25[id[i+6]][id[j+6]]);
        resX[idx] = gauss * haarX(r+j*s, c+i*s, 4*s);
        resY[idx] = gauss * haarY(r+j*s, c+i*s, 4*s);
        Ang[idx] = getAngle(resX[idx], resY[idx]);
        ++idx;
      }
    }
  }

    The indices i and j sweep a 13×13 range centered on the feature point; the condition (i*i + j*j < 36) keeps only the samples that fall inside the circle of radius 6s around the feature point. HaarX and HaarY are then computed and weighted with the precomputed Gaussian, and the angle of each sample is computed.

    Finally, the direction of the sector with the maximum value is taken as the main orientation of the feature point. The OpenSURF code for finding the sector direction with the maximum value is as follows:

 for(ang1 = 0; ang1 < 2*pi;  ang1+=0.15f) {
    ang2 = ( ang1+pi/3.0f > 2*pi ? ang1-5.0f*pi/3.0f : ang1+pi/3.0f);
    sumX = sumY = 0.f; 
    for(unsigned int k = 0; k < Ang.size(); ++k) 
    {
      // get angle from the x-axis of the sample point
      const float & ang = Ang[k];
 
      // determine whether the point is within the window
      if (ang1 < ang2 && ang1 < ang && ang < ang2) 
      {
        sumX+=resX[k];  
        sumY+=resY[k];
      } 
      else if (ang2 < ang1 && 
        ((ang > 0 && ang < ang2) || (ang > ang1 && ang < 2*pi) )) 
      {
        sumX+=resX[k];  
        sumY+=resY[k];
      }
    }
 
    // if the vector produced from this window is longer than all 
    // previous vectors then this forms the new dominant direction
    if (sumX*sumX + sumY*sumY > max) 
    {
      // store largest orientation
      max = sumX*sumX + sumY*sumY;
      orientation = getAngle(sumX, sumY);
    }
  }

2. Generation of the feature point descriptor vector

    Centered on the feature point and aligned with the main orientation, a 20s×20s region is divided into 4×4 sub-blocks. Each sub-block is sampled with a Haar template of size 2s to compute the responses, and within each sub-block the statistics Σdx, Σ|dx|, Σdy, and Σ|dy| are accumulated. This gives 4×4×4 = 64 dimensions of feature data, as shown in the figure below:

When computing this rectangular region, it is not first rotated to the main orientation. Instead, the Haar responses dx and dy are computed at each point and Gaussian-weighted, and then dx and dy are rotated. The computation formula is as follows:
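The rotation formula did not survive extraction; one common form, a plain 2D rotation of the response vector by the main orientation θ, is:

```latex
dx' = dx \cos\theta + dy \sin\theta, \qquad
dy' = -dx \sin\theta + dy \cos\theta
```

The exact sign convention depends on the implementation's axis orientation.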


    The OpenSURF implementation uses a different approach: via the point-rotation formula, each sample point is rotated to the main orientation and mapped to its corresponding point by nearest-neighbor interpolation. The formula is as follows:
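The point-rotation formula is likewise missing; for a sample at grid offset (i, j) in units of the scale s, rotating by the main orientation θ about the feature point (x₀, y₀) gives, up to the implementation's sign convention:

```latex
x' = x_0 + s\,(i \cos\theta - j \sin\theta), \qquad
y' = y_0 + s\,(i \sin\theta + j \cos\theta)
```

followed by rounding (x', y') to the nearest pixel.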



5. Matching

     To speed up the matching process, SURF uses the sign of the Laplacian (obtainable as a by-product of the earlier Hessian computation at little extra cost) to index matching candidates. This distinguishes the two cases below before descriptor matching is performed:


