OpenCV: Use python-cv2 + HOG features + SVM to implement lion recognition

SVM

A Support Vector Machine (SVM) seeks an optimal hyperplane that separates the samples into classes.

Below we use an SVM to classify students as male or female from their height and weight.

import cv2
import numpy as np
import matplotlib.pyplot as plt
# Prepare the data: each row is [height, weight]
rand1 = np.array([[155,48],[159,50],[164,53],[168,56],[172,60]]) # female students
rand2 = np.array([[152,53],[156,55],[160,56],[172,64],[176,65]]) # male students
# Labels: 0 = female, 1 = male (OpenCV expects int32 class labels)
label = np.array([[0],[0],[0],[0],[0],[1],[1],[1],[1],[1]], dtype = np.int32)

data = np.vstack((rand1,rand2)) # stack the two groups into one array
data = np.array(data, dtype = 'float32')

svm = cv2.ml.SVM_create() # create the SVM model
svm.setType(cv2.ml.SVM_C_SVC) # C-support vector classification
svm.setKernel(cv2.ml.SVM_LINEAR) # linear kernel
svm.setC(0.01)
# Train
svm.train(data,cv2.ml.ROW_SAMPLE,label)
# Predict
pt_data = np.array([[167,55],[162,57]])
pt_data = np.array(pt_data, dtype = 'float32')
# expected labels: [[0],[1]]
predict = svm.predict(pt_data)
predict[1]
array([[0.],
       [1.]], dtype=float32)
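
To make the idea of a separating hyperplane concrete, here is a minimal sketch in plain Python. The line weight - 0.7 * height + 59.3 = 0 is a hand-picked assumption that happens to separate the ten training samples above; it is not the hyperplane that cv2.ml.SVM learns, and the names W, B, decision_value and classify are made up for this illustration.

```python
# Training data from the example above: [height, weight]
female = [[155, 48], [159, 50], [164, 53], [168, 56], [172, 60]]
male = [[152, 53], [156, 55], [160, 56], [172, 64], [176, 65]]

# Hypothetical hyperplane w . x + b = 0, chosen by hand (NOT learned by the SVM)
W = (-0.7, 1.0)
B = 59.3


def decision_value(sample):
    """Signed value w . x + b; its sign tells on which side of the line the sample lies."""
    return W[0] * sample[0] + W[1] * sample[1] + B


def classify(sample):
    """Return 1 (male) on the positive side of the line, 0 (female) otherwise."""
    return 1 if decision_value(sample) > 0 else 0


print([classify(s) for s in female])                  # [0, 0, 0, 0, 0]
print([classify(s) for s in male])                    # [1, 1, 1, 1, 1]
print([classify(s) for s in [[167, 55], [162, 57]]])  # [0, 1]
```

An SVM does the same kind of test, except that it chooses the hyperplane itself, picking the one with the largest margin to the nearest training samples.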

HOG feature

Consider an image img (drawn as the entire white area in the original figure). win, the window (blue area), is the largest template over which the HOG feature is computed in the image; the standard window size is 64*128. A block (red area) is a smaller template inside the window, generally 16*16. Each block in turn contains several even smaller templates called cells (green areas), generally 8*8.

Bins of a cell: computing the gradient of each pixel gives its magnitude and direction. The direction lies between 0 and 360 degrees. Dividing this range into steps of 40 degrees gives 9 intervals; each interval is one bin, so every cell has 9 bins.

HOG feature dimension: number of block templates in the win window * number of cell templates per block * number of bins
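
As a quick check of this product, the following sketch counts blocks, cells and bins for the sizes mentioned above (64*128 window, 16*16 block, 8*8 cell, 9 bins). The block stride of 8 pixels is an assumption made here, and hog_feature_length is a made-up helper, not an OpenCV function.

```python
def hog_feature_length(win, block, stride, cell, bins):
    """Return (blocks per window, cells per block, total feature length).

    All sizes are (width, height) tuples in pixels.
    """
    # Number of block positions when sliding the block over the window
    blocks_x = (win[0] - block[0]) // stride[0] + 1
    blocks_y = (win[1] - block[1]) // stride[1] + 1
    n_blocks = blocks_x * blocks_y
    # Number of cells that fit into one block
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])
    return n_blocks, cells_per_block, n_blocks * cells_per_block * bins


print(hog_feature_length((64, 128), (16, 16), (8, 8), (8, 8), 9))  # (105, 4, 3780)
```

With these sizes there are 7 * 15 = 105 block positions, 4 cells in each block and 9 bins in each cell.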

HOG feature: every pixel has a gradient, and the gradients of all pixels in the win window together constitute the HOG feature.

How to calculate the gradient:

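We use two templates: a horizontal gradient template [1 0 -1] and a vertical gradient template [[1],[0],[-1]]. Each one takes the difference between the two pixels on either side of the current pixel.
Writing x for the horizontal gradient and y for the vertical gradient, the magnitude is f = sqrt(x^2 + y^2) and the angle is angle = arctan(y / x).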
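
The two templates and the magnitude and angle formulas can be sketched in plain Python as follows. This is only an illustration: the 3*3 image values and the helper name gradient_at are invented, and math.atan2 is used instead of a plain arctan so that the angle covers the full 0-360 degree range.

```python
import math


def gradient_at(img, r, c):
    """Gradient of an interior pixel (r, c) of a 2D list of grey values."""
    # Horizontal template [1 0 -1]: left neighbour minus right neighbour
    gx = img[r][c - 1] - img[r][c + 1]
    # Vertical template [[1],[0],[-1]]: upper neighbour minus lower neighbour
    gy = img[r - 1][c] - img[r + 1][c]
    magnitude = math.sqrt(gx * gx + gy * gy)
    # atan2 keeps the quadrant; the result is mapped to [0, 360) degrees
    angle = math.degrees(math.atan2(gy, gx)) % 360.0
    return gx, gy, magnitude, angle


img = [
    [10, 10, 10],
    [13, 10, 10],
    [10, 6, 10],
]
gx, gy, magnitude, angle = gradient_at(img, 1, 1)
print(gx, gy, magnitude)  # 3 4 5.0
print(round(angle, 2))    # 53.13
```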

Division of bins: dividing by 40 degrees gives 9 bins. bin1 covers 0-20 degrees together with 180-200 degrees, two ranges that are symmetric under a rotation of 180 degrees.

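If the angle of a gradient lies exactly at the center of a bin's angle range, for example 10 degrees, the whole gradient is assigned to that bin (bin1 here). Otherwise the gradient is split between the two adjacent bins: d1 = d * d(angle), d2 = d * (1 - d(angle)), where d is the gradient magnitude and d(angle) is a weight determined by the included angle.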
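
One possible way to implement this splitting is sketched below. It rests on two assumptions that the text does not spell out: opposite directions share a bin (so angles are folded into 0-180 degrees, which gives 9 bins of 20 degrees with centers at 10, 30, ..., 170), and the weight is a linear interpolation based on the distance to the two nearest bin centers. The function name vote is invented for this example.

```python
import math


def vote(angle_deg, magnitude, n_bins=9):
    """Distribute one gradient over a histogram of n_bins orientation bins."""
    width = 180.0 / n_bins           # 20 degrees per bin
    folded = angle_deg % 180.0       # 190 degrees falls into the same bin as 10
    # Position measured in bins, relative to the center of bin index 0
    pos = folded / width - 0.5
    lower = math.floor(pos)
    frac = pos - lower               # 0.0 means exactly on the lower bin's center
    hist = [0.0] * n_bins
    hist[lower % n_bins] += magnitude * (1.0 - frac)
    hist[(lower + 1) % n_bins] += magnitude * frac
    return hist


print(vote(10.0, 2.0)[:2])   # [2.0, 0.0]  center of the first bin: no splitting
print(vote(20.0, 2.0)[:2])   # [1.0, 1.0]  halfway between two centers
print(vote(190.0, 1.0)[:2])  # [1.0, 0.0]  same bin as 10 degrees
```

The value of a bin in a cell is then the sum of these contributions over all pixels of the cell.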

Calculate the overall HOG feature

1. First compute the values of all bins in each cell. The value of a bin is the sum of the magnitudes of all gradients assigned to it, sum(d).

2. Get the feature dimension of the image. For the example above: number of block templates in the win window * number of cell templates per block * number of bins = 105 * 4 * 9 = 3780.

3. The SVM is trained on these features, which yields a 3780-dimensional weight vector. Multiplying it with the HOG feature (hog * svm) gives a value f, which is compared with our decision threshold. If f is greater than the threshold, the window is considered to contain the target.
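
Step 3 amounts to a dot product followed by a comparison. The sketch below uses a toy 4-dimensional feature instead of the real 3780 dimensions; the weights, bias and threshold are invented numbers, and hog_svm_score and is_target are hypothetical helper names.

```python
def hog_svm_score(feature, weights, bias):
    """f = hog * svm: dot product of the feature with the SVM weights, plus a bias."""
    return sum(f * w for f, w in zip(feature, weights)) + bias


def is_target(feature, weights, bias, threshold=0.0):
    """A window counts as the target when its score exceeds the threshold."""
    return hog_svm_score(feature, weights, bias) > threshold


weights = [1.0, -2.0, 2.0, 0.0]  # toy stand-in for the trained weight vector
bias = -0.5

window_a = [0.5, 0.0, 0.25, 0.25]
window_b = [0.0, 0.5, 0.0, 0.5]
print(hog_svm_score(window_a, weights, bias), is_target(window_a, weights, bias))  # 0.5 True
print(hog_svm_score(window_b, weights, bias), is_target(window_b, weights, bias))  # -1.5 False
```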

HOG feature + SVM for lion recognition

Here 820 positive samples (PosNum) and 1931 negative samples (NegNum) are used to train the model. Once training is complete, a picture containing the little lion is used for testing.

The positive and negative samples used here can be downloaded from the following Baidu cloud link:
link: https://pan.baidu.com/s/1jNpN8ecMKhOHLiy1KlEj4w
Extraction code: 61hr

The training steps are as follows:

1. Set parameters

2. Create the HOG descriptor: we use the cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,Bin) function to create it

3. Create the SVM: we use the cv2.ml.SVM_create() function to create it, then set its properties

4. Calculate the HOG features and prepare the labels

5. Train

6. Detect

7. Draw

import cv2
import numpy as np
import matplotlib.pyplot as plt

# 1. Set parameters
PosNum = 820   # number of positive samples
NegNum = 1931  # number of negative samples
winSize = (64,128)
blockSize = (16,16) # 105 blocks per window
blockStride = (8,8)
cellSize = (8,8)
Bin = 9 # 105 * 4 * 9 = 3780 features

# 2. Create the HOG descriptor
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,Bin)

# 3. Create the SVM
svm = cv2.ml.SVM_create()
# Set the SVM properties
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setC(0.01) # regularization parameter C


# 4. Compute the HOG features and prepare the labels
featureNum = int(((128 - 16) / 8 + 1) * ((64 - 16) / 8 + 1) * 4 * 9) # 3780
featureArray = np.zeros((PosNum + NegNum, featureNum),np.float32)
labelArray = np.zeros((PosNum + NegNum, 1),np.int32)
# Positive samples (label 1), files pos/1.jpg ... pos/820.jpg
for i in range(PosNum):
    filename = 'pos/' + str(i + 1) + '.jpg'

    img = cv2.imread(filename)
    # HOG feature of the image, 3780 values (each sample must be 64x128)
    hist = hog.compute(img, (8,8)) # second argument: winStride, the window stride
    # Store the feature in row i of featureArray
    featureArray[i] = hist.reshape(-1)
    labelArray[i] = 1
# Negative samples (label -1), files neg/1.jpg ... neg/1931.jpg
for i in range(PosNum, PosNum + NegNum):
    filename = 'neg/' + str(i + 1 - PosNum) + '.jpg'

    img = cv2.imread(filename)
    # HOG feature of the image, 3780 values (each sample must be 64x128)
    hist = hog.compute(img, (8,8)) # second argument: winStride, the window stride
    # Store the feature in row i of featureArray
    featureArray[i] = hist.reshape(-1)
    labelArray[i] = -1
# 5. Train
svm.train(featureArray,cv2.ml.ROW_SAMPLE, labelArray)

# 6. Detect
# getDecisionFunction returns (rho, alpha, svidx); rho is the decision threshold
decision = svm.getDecisionFunction(0)
print(decision)
rho, alpha, svidx = decision
# With a linear kernel OpenCV stores a single compressed support vector, shape (1, featureNum)
supportVArray = svm.getSupportVectors()
resultArray = -1 * alpha * supportVArray

# 7. Draw
# The detector is the weight vector followed by rho
myDetect = np.zeros((featureNum + 1), np.float32)
myDetect[:featureNum] = resultArray[0]
myDetect[featureNum] = rho
# Build the HOG detector (the default HOGDescriptor uses the same 64x128 window layout)
myHog = cv2.HOGDescriptor()
myHog.setSVMDetector(myDetect)


# Load the image to be tested
imageSrc = cv2.imread('test.jpg', 1)
cv2.imshow('img', imageSrc)

# Arguments: hitThreshold 0, winStride (8,8), padding (32,32), scale 1.05, groupThreshold 2
objects = myHog.detectMultiScale(imageSrc, 0, (8,8), (32,32), 1.05, 2)
# objects is (rectangles, weights); draw every detected rectangle
for (x, y, w, h) in objects[0]:
    cv2.rectangle(imageSrc, (int(x),int(y)),(int(x+w),int(y+h)),(255,0,0),2)
cv2.imshow('img', imageSrc)
print(objects)
cv2.waitKey(0)
(0.2555259476741386, array([[1.]]), array([[0]], dtype=int32))


Origin blog.csdn.net/qq_43328040/article/details/109299478