OpenCV video manipulation

1. Video reading and writing

1. Read the video from the file and play it

To read a video in OpenCV, we need to create a VideoCapture object and specify the video file to read:

(1) Create an object to read video

cap = cv.VideoCapture(filepath)

parameter:

  • filepath: video file path

(2) Attribute information of the video

Get some properties of the video

retval = cap.get(propId)

parameter:

  • propId: a number from 0 to 18, each number represents the property of the video

Common attributes are:

  • cv.CAP_PROP_POS_MSEC (0): current position in the video, in milliseconds
  • cv.CAP_PROP_POS_FRAMES (1): index of the frame to be decoded next
  • cv.CAP_PROP_FRAME_WIDTH (3): frame width
  • cv.CAP_PROP_FRAME_HEIGHT (4): frame height
  • cv.CAP_PROP_FPS (5): frame rate
  • cv.CAP_PROP_FOURCC (6): 4-character codec code
  • cv.CAP_PROP_FRAME_COUNT (7): total number of frames

Modify the attribute information of the video

cap.set(propId,value)

parameter:

  • propId: the index of the property, corresponding to the list above
  • value: the modified attribute value

(3) Determine whether the video was opened successfully

isornot = cap.isOpened()

Returns True if the video was opened successfully, otherwise False

(4) Get a frame image of the video

ret, frame = cap.read()

parameter:

  • ret: True if the frame was grabbed successfully, False otherwise (for example, at the end of the video)
  • frame: the image of the frame that was obtained

(5) display image

Call cv.imshow() to display the frame, and use cv.waitKey() to set an appropriate delay between frames: if the delay is too low the video plays too fast, and if it is too high the video plays too slowly. 25 ms is usually a reasonable choice.
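Rather than hard-coding 25 ms, the delay can be derived from the source frame rate (read beforehand with cap.get(cv.CAP_PROP_FPS)). A minimal sketch of the arithmetic; the helper name is illustrative, not an OpenCV function:

```python
def playback_delay_ms(fps):
    """Delay to pass to cv.waitKey() so playback roughly matches the source frame rate."""
    if fps <= 0:          # some containers report 0; fall back to a sane default
        return 25
    return max(1, int(round(1000.0 / fps)))

print(playback_delay_ms(40))   # a 40 fps video -> 25 ms per frame
print(playback_delay_ms(30))   # a 30 fps video -> 33 ms per frame
```

Note that cv.waitKey() counts only the waiting time, not decoding and display time, so real playback is slightly slower than the nominal rate.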

(6) release video

Finally, call cap.release() to release the video capture object

Example:

import cv2 as cv
# 1. Open the video file
cap = cv.VideoCapture('DOG.wmv')
# 2. Loop while the capture is open
while cap.isOpened():
    # 3. Read the next frame
    ret, frame = cap.read()
    # 4. Display the frame if it was read successfully, stop at the end of the video
    if ret:
        cv.imshow('frame', frame)
    else:
        break
    # 5. Wait 25 ms between frames; press 'q' to quit early
    if cv.waitKey(25) & 0xFF == ord('q'):
        break
# 6. Release the capture object and close the windows
cap.release()
cv.destroyAllWindows()

2. Save the video

In OpenCV, we use the VideoWriter object to save a video, specifying the name of the output file, as follows:

(1) Create an object for video writing

out = cv2.VideoWriter(filename,fourcc, fps, frameSize)

parameter:

  • filename: the location where the video is saved
  • fourcc: A 4-byte code specifying the video codec
  • fps: frame rate
  • frameSize: frame size

(2) Set the video codec as shown below:

retval = cv2.VideoWriter_fourcc( c1, c2, c3, c4 )

parameter:

  • c1,c2,c3,c4: the four characters of the FourCC codec code. The list of available codes can be found at fourcc.org; the choice is platform-dependent. Commonly used combinations are:
  • On Windows: DIVX (.avi)
  • On macOS: MJPG (.mp4), DIVX (.avi), X264 (.mkv)
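A FourCC code is just the four characters packed into a 32-bit integer, least significant byte first. A small sketch in pure Python (no OpenCV required; the function name is illustrative) of how the value is built:

```python
def fourcc(c1, c2, c3, c4):
    """Pack four codec characters into a 32-bit FourCC integer (little-endian)."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)

code = fourcc('M', 'J', 'P', 'G')
print(code)  # the same integer cv2.VideoWriter_fourcc('M', 'J', 'P', 'G') returns
```

Unpacking the integer's four little-endian bytes recovers the string 'MJPG', which is how a reported cv.CAP_PROP_FOURCC value can be decoded.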

(3) Use cap.read() to get each frame of image in the video, and use out.write() to write a frame of image into the video.

(4) Use cap.release() and out.release() to release resources.

Example:

import cv2 as cv

# 1. Open the source video
cap = cv.VideoCapture("DOG.wmv")

# 2. Get the frame width and height and convert them to integers
frame_width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))

# 3. Create the writer object: output file, codec, frame rate, frame size
out = cv.VideoWriter('outpy.avi', cv.VideoWriter_fourcc('M','J','P','G'), 10, (frame_width, frame_height))
while True:
    # 4. Read each frame of the source video
    ret, frame = cap.read()
    if ret:
        # 5. Write the frame to the output file
        out.write(frame)
    else:
        break

# 6. Release resources
cap.release()
out.release()
cv.destroyAllWindows()

2. Video Tracking

1. meanshift

1.1 Principle

The principle of the meanshift algorithm is very simple. Suppose you have a set of points and a small window, possibly circular. Now you want to move this window to the area with the highest density of points.

As shown below:

(figure: a meanshift window moving from the initial circle C1 toward the densest region C2)

The initial window is the blue circle, named C1. Its center, C1_o, is marked with a blue rectangle.

The centroid of the points that fall inside the window is the blue circular point C1_r, and clearly the window center and the centroid do not coincide. So the window is moved so that its center coincides with the computed centroid. Then the centroid of the points enclosed by the newly moved window is computed again, and the window is moved again; usually the center and the new centroid still do not coincide. This process is repeated until the window center and the centroid roughly coincide. In this way, the window finally settles where the point density is highest, the green circle in the figure, named C2.

Beyond video tracking, the meanshift algorithm is widely used wherever data is involved, for example in clustering, smoothing, and other unsupervised-learning settings.
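The window-moving idea above can be sketched on a plain 2-D point set with NumPy. The function and parameter names below are illustrative, not OpenCV's API:

```python
import numpy as np

def mean_shift(points, start, radius, max_iter=100, eps=1e-3):
    """Move a circular window to the densest region of a 2-D point set."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        dists = np.linalg.norm(points - center, axis=1)
        in_window = points[dists < radius]
        if len(in_window) == 0:
            break
        new_center = in_window.mean(axis=0)   # centroid of the points in the window
        if np.linalg.norm(new_center - center) < eps:
            break                             # center and centroid roughly coincide
        center = new_center
    return center

# A dense cluster around (5, 5) plus some uniform background noise
rng = np.random.default_rng(0)
cluster = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(200, 2))
noise = rng.uniform(0.0, 10.0, size=(50, 2))
points = np.vstack([cluster, noise])

center = mean_shift(points, start=(4.0, 4.0), radius=2.0)
print(center)  # ends up near (5, 5)
```

Each iteration replaces the window center with the centroid of the enclosed points, exactly the C1_o-to-C1_r move described above, so the window drifts uphill toward higher density.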

An image is a matrix of information. How can the meanshift algorithm be used to track a moving object in a video? The general process is as follows:

  1. First, select a target region in the image.

  2. Compute the histogram distribution of the selected region, usually a histogram in the HSV color space.

  3. Compute the same histogram distribution for the next frame, image b.

  4. Find the region in image b whose histogram distribution is most similar to that of the selected region: use the meanshift algorithm to move the window toward the most similar part until it is found, which completes the target tracking in image b.

  5. Repeat steps 3 and 4 to track the target through the entire video.

Usually we use the histogram back projection image together with the starting position of the target in the first frame. As the target's motion is reflected in the back projection image, the meanshift algorithm moves our window to the area of highest gray-level density in that image. As shown below:

(figure: meanshift tracking driven by the histogram back projection image)

The process of histogram backprojection is:

Suppose we have a 100x100 input image and a 10x10 template image, the search process is as follows:

  1. Starting from the upper left corner (0,0) of the input image, cut a temporary image from (0,0) to (10,10);
  2. Generate a histogram of the temporary image;
  3. Use the histogram of the temporary image to compare with the histogram of the template image, and record the comparison result as c;
  4. The histogram comparison result c is the pixel value at (0,0) of the result image;
  5. Cut the temporary image of the input image from (0,1) to (10,11), compare the histogram, and record it to the result image;
  6. Slide the window this way until the lower right corner of the input image is reached; the resulting image of comparison scores is the histogram back projection.
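The sliding-window comparison described above can be sketched in pure NumPy, using histogram intersection as the comparison score c. All names and sizes here are illustrative (a 20x20 image and 5x5 template instead of 100x100 and 10x10):

```python
import numpy as np

def norm_hist(img, bins=8):
    """Normalized gray-level histogram of an image patch."""
    h, _ = np.histogram(img, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def backproject_search(image, template, bins=8):
    """Slide a template-sized window over the image and score each position
    by histogram intersection with the template's histogram (the score c)."""
    th, tw = template.shape
    t_hist = norm_hist(template, bins)
    out_h, out_w = image.shape[0] - th + 1, image.shape[1] - tw + 1
    result = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            p_hist = norm_hist(image[y:y + th, x:x + tw], bins)
            result[y, x] = np.minimum(p_hist, t_hist).sum()
    return result

# A dark 20x20 image with a bright 5x5 patch at (10, 10)
image = np.zeros((20, 20), dtype=np.uint8)
image[10:15, 10:15] = 200
template = np.full((5, 5), 200, dtype=np.uint8)

scores = backproject_search(image, template)
best = np.unravel_index(scores.argmax(), scores.shape)
print(best)  # the window position whose histogram best matches the template
```

In practice OpenCV's cv.calcBackProject computes this far more efficiently per pixel, but the sketch shows why the brightest region of the back projection marks the most histogram-similar area.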

1.2 Implementation

The API for implementing Meanshift in OpenCV is:

cv.meanShift(probImage, window, criteria)

parameter:

  • probImage: the ROI, i.e. the histogram back projection of the target
  • window: the initial search window, the rect that defines the ROI
  • criteria: the criteria for stopping the search, mainly reaching the maximum number of iterations or the window center shifting by less than a set limit

The main process of implementing Meanshift is:

  1. Read the video file: cv.VideoCapture()
  2. Set the region of interest: get the first frame and set the target region, i.e. the region of interest
  3. Compute the histogram: compute the HSV histogram of the region of interest and normalize it
  4. Track the target: set the stop condition for the window search, back-project the histogram, perform the tracking, and draw a rectangle at the target position.

Example:

import numpy as np
import cv2 as cv
# 1. Open the video
cap = cv.VideoCapture('DOG.wmv')

# 2. Get the first frame and specify the target location
ret, frame = cap.read()
# 2.1 Target location (row, height, column, width)
r, h, c, w = 197, 141, 0, 208
track_window = (c, r, w, h)
# 2.2 Region of interest around the target
roi = frame[r:r+h, c:c+w]

# 3. Compute the histogram
# 3.1 Convert to the HSV color space
hsv_roi = cv.cvtColor(roi, cv.COLOR_BGR2HSV)
# 3.2 Optionally mask out low-light values
# mask = cv.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
# 3.3 Compute the hue histogram
roi_hist = cv.calcHist([hsv_roi], [0], None, [180], [0, 180])
# 3.4 Normalize it to [0, 255]
cv.normalize(roi_hist, roi_hist, 0, 255, cv.NORM_MINMAX)

# 4. Target tracking
# 4.1 Stop condition for the window search: maximum iterations or minimum center shift
term_crit = (cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1)

while True:
    # 4.2 Read each frame
    ret, frame = cap.read()
    if ret:
        # 4.3 Compute the back projection of the histogram
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

        # 4.4 Apply meanshift tracking
        ret, track_window = cv.meanShift(dst, track_window, term_crit)

        # 4.5 Draw the tracked window on the frame and display it
        x, y, w, h = track_window
        img2 = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
        cv.imshow('frame', img2)

        if cv.waitKey(60) & 0xFF == ord('q'):
            break
    else:
        break
# 5. Release resources
cap.release()
cv.destroyAllWindows()

The following are the tracking results for three of the frames:

(figure: meanshift tracking results on three frames)

2. Camshift

Look carefully at the results above and you will notice a problem: the size of the detection window is fixed, while the dog gradually shrinks as it moves from near to far, so a fixed window is not suitable. We need to adjust the size and angle of the window according to the size and orientation of the target. CamShift can help us solve this problem.

The full name of the CamShift algorithm is "Continuously Adaptive Mean-Shift". It is an improvement of the MeanShift algorithm that adjusts the size of the search window in real time as the size of the tracked target changes, giving a better tracking result.

The Camshift algorithm first applies a meanshift, and once the meanshift converges, it updates the size of the window and also calculates the orientation of the best-fit ellipse, thereby updating the search window according to the location and size of the target. As shown below:

(figure: CamShift adapting the search window size and orientation after meanshift converges)

To use Camshift in OpenCV, simply replace the meanShift call in the example above with cv.CamShift.

The meanshift version:

        # 4.4 Apply meanshift tracking
        ret, track_window = cv.meanShift(dst, track_window, term_crit)

        # 4.5 Draw the tracked window on the frame and display it
        x, y, w, h = track_window
        img2 = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)

Change to:

        # 4.4 Apply camshift tracking
        ret, track_window = cv.CamShift(dst, track_window, term_crit)

        # 4.5 Draw the tracked rotated rectangle on the frame
        pts = cv.boxPoints(ret)
        pts = np.int32(pts)
        img2 = cv.polylines(frame, [pts], True, 255, 2)

3. Algorithm summary

Both the meanshift and camshift algorithms have their own advantages, and naturally also disadvantages:

  • Meanshift: simple and needs few iterations, but it cannot handle occlusion of the target and cannot adapt to changes in the target's shape and size.

  • Camshift: adapts to changes in the size and shape of the moving target and tracks well, but when the background color is close to the target color it tends to enlarge the target region, which can eventually cause the tracker to lose the target.


Origin blog.csdn.net/mengxianglong123/article/details/125933386