Learn OAK Step by Step, Part 8: Video Frame Tiling with the OAK Camera

Frame tiling is very useful in some scenarios, such as when feeding a large frame into a neural network with a small input size. The larger frame can be split into multiple smaller tiles, and each tile fed into the neural network.

Here we use two ImageManip nodes to split the raw preview frame into two tiles.

The nodes involved

ColorCamera node

The ColorCamera node is the source of image frames. Its input and output are shown in the figure below:
[Figure: ColorCamera node inputs and outputs]

ImageManip node

ImageManip nodes can be used to crop rectangular regions, resize frames, and perform various image transformations: rotate, mirror, flip, perspective transform. Its inputs and outputs are shown in the figure below:
[Figure: ImageManip node inputs and outputs]
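
Beyond cropping, here is a minimal sketch of other ImageManip transformations (an illustration only, assuming a pipeline created as in Step 4 below; setResize() and setHorizontalFlip() are ImageManipConfig methods):

manipDemo = pipeline.createImageManip()
manipDemo.initialConfig.setResize(300, 300)      # scale output frames to 300x300
manipDemo.initialConfig.setHorizontalFlip(True)  # mirror the frame horizontally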

XLinkOut node

The XLinkOut node is used to send data from the device to the host via XLink. Its input and output are shown in the figure below:
[Figure: XLinkOut node inputs and outputs]

Implementation steps

Step 1: Create the file

  • Create a new 10-imageManip-tiling folder
  • Open the folder with VS Code
  • Create a new main.py file

Step 2: Install dependencies

Before installing dependencies, you need to create and activate a virtual environment. I have already created a virtual environment named OAKenv. In the terminal, use cd to return to the root directory of OAKenv, then run OAKenv\Scripts\activate to activate the virtual environment.

Install pip dependencies:

pip install numpy opencv-python depthai blobconverter --user
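
Optionally, verify the installation by printing the DepthAI version (a quick sanity check, not required by the tutorial):

python -c "import depthai; print(depthai.__version__)"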

Step 3: Import required packages

Import the packages required by the project in main.py:

import cv2
import depthai as dai

Step 4: Create the pipeline

pipeline = dai.Pipeline()

Step 5: Create the nodes

Create a camera node

camRgb = pipeline.createColorCamera()
camRgb.setPreviewSize(1000, 500)
camRgb.setInterleaved(False)
maxFrameSize = camRgb.getPreviewHeight() * camRgb.getPreviewWidth() * 3
  • Create a ColorCamera node (camRgb) and set its preview size to 1000x500 pixels. Set the camera's output format to non-interleaved (planar): each color channel is stored in its own contiguous plane, rather than pixel-by-pixel in BGR order.

  • maxFrameSize = camRgb.getPreviewHeight() * camRgb.getPreviewWidth() * 3
    Calculate the maximum frame size: multiply the preview height of the camRgb node by its preview width, then by 3 (three bytes per pixel, one per color channel). The result is stored in the variable maxFrameSize; a worked check follows this list.
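
As a quick check with the 1000x500 preview configured above (plain arithmetic, not extra pipeline code):

maxFrameSize = 500 * 1000 * 3  # height * width * 3 bytes per pixel = 1,500,000 bytes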

Create an ImageManip node and establish a connection

manip1 = pipeline.createImageManip()
manip1.initialConfig.setCropRect(0, 0, 0.5, 1)
manip1.setMaxOutputFrameSize(maxFrameSize)
camRgb.preview.link(manip1.inputImage)
  • Create an image processing node (manip1).

  • Set the crop region of the image processing node by calling the initialConfig.setCropRect() method. Its four parameters are normalized coordinates between 0 and 1: (xmin, ymin) is the top-left corner of the crop region and (xmax, ymax) is its bottom-right corner. Here (0, 0, 0.5, 1) selects from the top-left corner of the frame out to half its width and its full height, i.e. the left half of the image (a pixel-to-normalized helper is sketched after this list).

  • Call the manip1.setMaxOutputFrameSize() method to set the maximum output frame size of the image processing node to the previously calculated maxFrameSize.

  • Connect the preview output of the camera node to the input of the image processing node by calling the camRgb.preview.link(manip1.inputImage) method. In this way, the image processing node receives the preview frames from the camera and applies the configured crop.
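
Because setCropRect() takes normalized coordinates, it can be handy to convert from pixel values. The helper below is hypothetical (norm_rect is not part of the DepthAI API) and assumes the 1000x500 preview configured above:

def norm_rect(x, y, w, h, frame_w, frame_h):
    # Convert a pixel-space rectangle into the normalized
    # (xmin, ymin, xmax, ymax) tuple that setCropRect() expects.
    return (x / frame_w, y / frame_h, (x + w) / frame_w, (y + h) / frame_h)

# Left half of the 1000x500 preview -> (0.0, 0.0, 0.5, 1.0)
manip1.initialConfig.setCropRect(*norm_rect(0, 0, 500, 500, 1000, 500))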

manip2 = pipeline.createImageManip()
manip2.initialConfig.setCropRect(0.5, 0, 1, 1)
manip2.setMaxOutputFrameSize(maxFrameSize)
camRgb.preview.link(manip2.inputImage)

Create another image processing node (manip2).

  • Set the crop region of this node by calling the manip2.initialConfig.setCropRect() method. Here the normalized corners (0.5, 0) and (1, 1) select from the horizontal midpoint of the frame out to its right edge, over the full height, i.e. the right half of the image.

  • Call the manip2.setMaxOutputFrameSize() method to set the maximum output frame size of the image processing node to the previously calculated maxFrameSize.

  • Connect the preview output of the camera node to the input of the second image processing node by calling the camRgb.preview.link(manip2.inputImage) method. In this way, the second image processing node also receives the camera's preview frames and applies its configured crop. The same pattern generalizes to any number of tiles, as sketched below.
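
Since the crop rectangles are normalized, splitting into more tiles follows the same pattern. A sketch under that assumption (reusing pipeline, camRgb, and maxFrameSize from above; each tile would still need its own XLinkOut, created as in the next step):

N = 2  # number of vertical tiles
for i in range(N):
    manip = pipeline.createImageManip()
    manip.initialConfig.setCropRect(i / N, 0, (i + 1) / N, 1)  # the i-th vertical slice
    manip.setMaxOutputFrameSize(maxFrameSize)
    camRgb.preview.link(manip.inputImage)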

Create XLinkOut nodes and establish connections

xout1 = pipeline.create(dai.node.XLinkOut)
xout1.setStreamName('out1')
manip1.out.link(xout1.input)

Create an XLinkOut node, name it xout1, and set its output stream name to 'out1'. Then, link the output of the manip1 node to the input of the xout1 node. In this way, the image data processed by the manip1 node will be output through the xout1 node.

xout2 = pipeline.create(dai.node.XLinkOut)
xout2.setStreamName('out2')
manip2.out.link(xout2.input)

Create a second XLinkOut node, name it xout2, and set its output stream name to 'out2'. Then, link the output of the manip2 node to the input of the xout2 node. In this way, the image data processed by the manip2 node will be output through the xout2 node.

Step 6: Connect the device and start the pipeline

with dai.Device(pipeline) as device:

Step 7: Create output queues to communicate with the DepthAI device

    q1 = device.getOutputQueue(name="out1", maxSize=4, blocking=False)
    q2 = device.getOutputQueue(name="out2", maxSize=4, blocking=False)

Create two output queues q1 and q2.

The first output queue q1 is created with the device's getOutputQueue method; the parameter name="out1" requests the output stream named "out1". The queue's maximum size is set to 4, and blocking=False means that when the queue is full, the oldest frames are discarded to make room for new ones rather than blocking the pipeline.

The second output queue q2 is created the same way, with name="out2" requesting the output stream named "out2". Again the maximum queue size is 4, and blocking=False means a full queue drops its oldest frames instead of blocking.
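
For contrast, a blocking queue (a sketch, not used in this tutorial) would make the device wait for the host to consume frames instead of dropping the oldest ones, which can matter when no frame may be lost:

    q1 = device.getOutputQueue(name="out1", maxSize=4, blocking=True)  # blocks when full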

Step 8: Main loop

    while True:

Get frame data from output queues q1 and q2 and display it in windows

        if q1.has():
            cv2.imshow("Tile 1", q1.get().getCvFrame())

        if q2.has():
            cv2.imshow("Tile 2", q2.get().getCvFrame())

        if cv2.waitKey(1) == ord('q'):
            break

Use q1.has() and q2.has() to check whether output queues q1 and q2 have frame data available.

If the output queue q1 has frame data available, use q1.get().getCvFrame() to get the latest frame and display it with OpenCV's cv2.imshow() in a window named "Tile 1".

If the output queue q2 has frame data available, use q2.get().getCvFrame() to get the latest frame and display it with cv2.imshow() in a window named "Tile 2".

cv2.waitKey(1) waits briefly for a keyboard press; if the detected key is the character 'q', exit the loop and end the program.
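
An equivalent polling pattern (a sketch) uses tryGet(), which returns None instead of blocking when no frame is waiting, making the has() check unnecessary:

        inFrame = q1.tryGet()  # returns None immediately if the queue is empty
        if inFrame is not None:
            cv2.imshow("Tile 1", inFrame.getCvFrame())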

Step 9: Run the program

Enter the following command in the terminal to run the program:

python main.py

The effect after running is as follows:
[Figure: the two cropped halves displayed in the "Tile 1" and "Tile 2" windows]

Source: blog.csdn.net/w137160164/article/details/131456402