OpenCV from beginner to proficient - reading and saving images, videos and camera streams

Introduction

OpenCV is a popular open-source computer vision library originally developed by Intel. It provides more than 2500 optimized algorithms and many tools for image processing and computer vision tasks such as grayscale and color conversion, depth estimation, feature detection and motion tracking. OpenCV is written mainly in C++ and also offers interfaces for Python, Java and other languages. Because it is open source and widely adopted, it is used extensively in computer vision and machine learning.

1. Image in the eyes of a computer

An RGB image is a color image made up of three color channels: red (R), green (G), and blue (B). Each pixel holds three values, one per channel, giving its intensity in that channel. In an 8-bit image each value ranges from 0 to 255, and the combination of the three values determines the color of the pixel. Note that OpenCV stores the channels in B, G, R order, so a pixel read with cv2.imread() prints as [B G R].

Together, the three channels make up the image. Changing the values in any channel changes the color and appearance of the image.
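To make the channel layout concrete, here is a minimal sketch that builds a tiny 2x2 image with NumPy instead of reading a file. The array name tiny and its pixel values are made up for illustration; the only assumption carried over from OpenCV is the B, G, R channel order.

```python
import numpy as np

# A hypothetical 2x2 image in the (height, width, channels) layout.
# Channel order is B, G, R, as in arrays returned by cv2.imread().
tiny = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

# Split into single-channel 2D arrays by indexing the last axis.
blue = tiny[:, :, 0]
green = tiny[:, :, 1]
red = tiny[:, :, 2]

# The pixel at row 0, column 0 is pure blue: B=255, G=0, R=0.
print(tiny[0, 0])      # [255   0   0]

# Reversing the last axis turns B, G, R into R, G, B.
rgb = tiny[:, :, ::-1]
print(rgb[0, 0])       # [  0   0 255]

# Zeroing the blue channel of a copy changes every pixel that contained blue.
no_blue = tiny.copy()
no_blue[:, :, 0] = 0
print(no_blue[1, 1])   # [  0 255 255]
```

The white pixel at row 1, column 1 becomes yellow once its blue value is set to 0, which is the kind of per-channel adjustment described above.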

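import cv2
import numpy as np
 
# Read the image (cv2.imread returns None if the file cannot be read)
image = cv2.imread('image/1.jpg')
if image is None:
    raise FileNotFoundError('Could not read image/1.jpg')
# Print the image shape: height, width, and number of channels
h, w, c = image.shape
print(h, w, c)
 
# Print the pixel at row 60, column 60 (values are in B, G, R order)
pixel = image[60, 60]
print(pixel)
 
# Create an empty array with the same shape and type as the image
pixels = np.zeros((h, w, c), dtype=np.uint8)
# Loop over every pixel
for y in range(h):
    for x in range(w):
        # Get the value of the pixel
        pixel = image[y, x]
        # Store the value in the new array
        pixels[y, x] = pixel
 
# Print the result
print(pixels)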

The printed result shows the structure of a picture in the eyes of the computer: an array of numbers. When running the code, you can set breakpoints to inspect the values step by step.

The loop above is only meant to expose the essence of the picture. The object returned by cv2.imread() is already a NumPy array, so the same result can be obtained directly, without a loop.

import cv2
import numpy as np
 
# Read the image
image = cv2.imread('image.jpg')
 
# cv2.imread already returns a NumPy array; np.array() makes a copy of it
pixels = np.array(image)
 
# Print the result
print(pixels)
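As a quick check that the loop and the whole-array approach give the same data, the sketch below compares them on a small synthetic array. The array stands in for a real image so that the example runs without an image file; its size and values are arbitrary.

```python
import numpy as np

# Hypothetical stand-in for an image returned by cv2.imread():
# 4 rows, 5 columns, 3 channels, 8-bit values.
image = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
h, w, c = image.shape

# Pixel-by-pixel copy, as in the first example.
looped = np.zeros((h, w, c), dtype=np.uint8)
for y in range(h):
    for x in range(w):
        looped[y, x] = image[y, x]

# Whole-array copy in one call.
copied = image.copy()

print(np.array_equal(looped, copied))  # True
print(copied is image)                 # False: it is a separate array

# Changing the copy leaves the original untouched.
copied[0, 0] = (9, 9, 9)
print(image[0, 0])                     # [0 1 2]
```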

2. Reading, displaying and saving pictures

import cv2
 
# Read the image as grayscale
# (use cv2.imread('image/1.jpg') to read it in color instead)
image = cv2.imread('image/1.jpg', cv2.IMREAD_GRAYSCALE)
# Show the image in a window named 'IMG'
cv2.imshow('IMG', image)
 
# Save it in the image directory as jujingyi.jpg
cv2.imwrite('image/jujingyi.jpg', image)
# Wait for keyboard input; 0 means wait indefinitely until any key is pressed
cv2.waitKey(0)
 
# Close all open windows
cv2.destroyAllWindows()

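The cv2.imread() function reads an image from a file. It returns None if the file cannot be read.

The cv2.imwrite() function saves an image to a file. The file format is chosen from the file extension.

cv2.waitKey(0) waits indefinitely until any key is pressed. cv2.waitKey(1000) waits at most 1000 milliseconds, so in this script the window would close after one second if no key is pressed.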

3. Video reading and display

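import cv2
 
# Open the default camera (index 0)
cap = cv2.VideoCapture(0)
 
while True:
    success, image = cap.read()
    # Stop if no frame could be read
    if not success:
        break
    cv2.imshow('IMG', image)
    
    # Wait 1 millisecond and check for keyboard input
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
 
# After the loop ends, release the camera and close the window
cap.release()
cv2.destroyAllWindows()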

Passing 0 to cv2.VideoCapture() selects the computer's built-in (default) camera. For an external camera, try 1 or 2, depending on the index the system assigns to the device. You can also pass the path of a video file to read that video instead.

A while loop reads the camera frame by frame. Each call to cap.read() returns a success flag and the frame, which is stored in image.

cv2.waitKey(1) waits 1 millisecond for keyboard input and returns the code of the pressed key, or -1 if no key was pressed. The bitwise operator & with 0xFF keeps only the lowest 8 bits of that code, which are then compared with ord('q'), the ASCII code of the character 'q'. If they are equal, the break statement exits the loop.

After the loop exits, release the camera with cap.release(), then call cv2.destroyAllWindows() to close the display window.

In this way, when the "q" key on the keyboard is pressed, the program exits the loop, releases the camera and closes the window.
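The key comparison can be tried in isolation, without a camera or window. The helper is_quit_key below is a hypothetical name, and the integers passed to it stand in for values returned by cv2.waitKey(). In Python, & binds more tightly than ==, so the expression in the loop is evaluated as (cv2.waitKey(1) & 0xFF) == ord('q').

```python
def is_quit_key(key_code, quit_char='q'):
    """Return True if the low 8 bits of key_code match quit_char."""
    return (key_code & 0xFF) == ord(quit_char)


print(ord('q'))               # 113
print(is_quit_key(113))       # True
print(is_quit_key(-1))        # False: -1 & 0xFF is 255 (no key pressed)
print(is_quit_key(0x100071))  # True: bits above the low 8 are masked away
```

The last call illustrates why the mask is commonly used: if a platform reports a key code with additional high bits set, masking still recovers the character code.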

To save the video recorded by the camera, add a cv2.VideoWriter:

import cv2
 
cap = cv2.VideoCapture(0)
 
# Set the parameters for saving the video
save_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
save_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (save_width, save_height))
 
while True:
    success, image = cap.read()
    # Stop if no frame could be read
    if not success:
        break
    cv2.imshow('IMG', image)
    
    # Write each frame to the video file
    out.write(image)
    
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
 
# Release the camera, close the video file, and close the window
cap.release()
out.release()
cv2.destroyAllWindows()

The video-saving parameters are taken from the camera: the width and height of the saved video match the camera's frame size, the four-character code (fourcc) selects the XVID codec, and the frame rate is set to 20.0.

Before entering the loop, cv2.VideoWriter() creates the object that saves the video. The first parameter is the output file name, the second is the video codec, the third is the frame rate, and the fourth is the frame size.

Inside the loop, out.write(image) writes each frame to the video file.

Finally, after the loop exits, cap.release() releases the camera and out.release() closes the video file.

In this way, when the "q" key on the keyboard is pressed, the program exits the loop and the frames read from the camera are saved in the video file "output.avi".

The cv2.VideoWriter() function creates an object for saving videos. Its parameters are explained below:

filename: the name of the saved video file. Here it is 'output.avi', which can be changed as needed.

fourcc: the video codec, given as a four-character code. Common codes include MP4V, XVID and MJPG; choose one according to your needs. The sample code passes *'XVID' to select the XVID codec.

fps: frames per second, that is, the frame rate stored in the saved video and used during playback. The sample code sets it to 20.0, which can be adjusted as needed.

frameSize: the size of each frame of the saved video, given as (width, height). The sample code uses (save_width, save_height), which are read from the camera parameters. Frames passed to out.write() should have this size.

To save the video as an MP4 file, change the fourcc parameter to a codec suitable for the MP4 format:

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('output.mp4', fourcc, 20.0, (save_width, save_height))

The code above passes *'mp4v' as the fourcc parameter, which selects the MPEG-4 codec, and changes the saved file name to 'output.mp4'.

After this modification, the frames read from the camera are saved in MP4 format. Make sure your version of OpenCV supports this codec; otherwise the writer may fail to open and no valid video will be saved.