openCV practice project: drag and drop virtual blocks

1. Project effect:

The school dormitory moved today, I was tired, and suddenly I found that the display handling was also very rough, so that's it, hehe~~~

Second, the core process:

1. openCV reads the video stream and draws a rectangle on each frame of the picture.

2. Use mediapipe to get the coordinates of the key points of the finger.

3. According to the coordinate position of the finger and the coordinate position of the rectangle, determine whether the finger point is on the rectangle, and if so, the rectangle moves with the finger.

3. Code process:

Environment preparation:

  • python: 3.8.8
  • opencv: 4.2.0.32
  • mediapipe: 0.8.10.1

Note:

1. If the opencv version is too high or too low, there may be some problems such as the camera cannot be turned on, flashback and so on. The python version affects the selectable version of opencv.

2. After pip install mediapipe, openCV may not be used normally. Uninstall it and download it again, just get used to it.

1. Read the camera video and draw a rectangle:

import cv2
import time
import numpy as np


# 调用摄像头 0 默认摄像头 
cap = cv2.VideoCapture(0)

# 初始方块数据
x = 100
y = 100
w = 100
h = 100

# 读取一帧帧照片
while True:
    # 返回frame图片
    rec,frame = cap.read()
    
    # 镜像
    frame = cv2.flip(frame,1)
    
    # 画矩形 
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 255), -1)

    # 显示画面
    cv2.imshow('frame',frame)
    
    # 退出条件
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    
cap.release()
cv2.destroyAllWindows() 

This is a very basic one-step operation. At this time, we run this code, the camera is turned on, and we will be surprised to see his handsome face, and there is a 100*100 purple rectangle in the upper left corner.

2. Import mediapipe to process finger coordinates

pip install mediapipe

There may be some problems at this time, such as openCV suddenly can not be used, it doesn't matter, uninstall it and download it again.

mediapipe details: Hands - mediapipe (google.github.io)

In short, it will return us the coordinates of 21 key points of the finger, that is, its position ratio in the video screen (0~1), we multiply the width and height of the corresponding screen to get the coordinates corresponding to the finger.

This time I use the tips of my index and middle fingers, which are 8 and 12.

2.1 Configure some basic information:

import cv2
import time
import numpy as np
import mediapipe as mp


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands

hands =  mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=2,
    min_detection_confidence=0.5)

2.2 When processing each frame of image, add:

    frame.flags.writeable = False
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # 返回结果
    results = hands.process(frame)

    frame.flags.writeable = True
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)

When we read each frame of picture in the video stream, convert it from BGR to RGB for the hands object generated by mediapipe to read, it will return the information of the key points of the finger in this picture, we just need to continue painting it , drawn on each frame of the picture.

    # 如果结果不为空
    if results.multi_hand_landmarks:

        # 遍历双手(根据读取顺序,一只只手遍历、画画)
        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(
                frame,
                hand_landmarks,
                mp_hands.HAND_CONNECTIONS,
                mp_drawing_styles.get_default_hand_landmarks_style(),
                mp_drawing_styles.get_default_hand_connections_style())

2.3 Complete code for this step

import cv2
import time
import numpy as np
import mediapipe as mp


mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands

hands =  mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=2,
    min_detection_confidence=0.5)


# 调用摄像头 0 默认摄像头 
cap = cv2.VideoCapture(0)

# 方块初始数组
x = 100
y = 100
w = 100
h = 100


# 读取一帧帧照片
while True:
    # 返回frame图片
    rec,frame = cap.read()
    
    # 镜像
    frame = cv2.flip(frame,1)
    
    
    
    frame.flags.writeable = False
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # 返回结果
    results = hands.process(frame)

    frame.flags.writeable = True
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    
    
    # 如果结果不为空
    if results.multi_hand_landmarks:

        # 遍历双手(根据读取顺序,一只只手遍历、画画)
        # results.multi_hand_landmarks n双手
        # hand_landmarks 每只手上21个点信息
        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(
                frame,
                hand_landmarks,
                mp_hands.HAND_CONNECTIONS,
                mp_drawing_styles.get_default_hand_landmarks_style(),
                mp_drawing_styles.get_default_hand_connections_style())
    
    
    # 画矩形 
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 255), -1)

    # 显示画面
    cv2.imshow('frame',frame)
    
    # 退出条件
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    
cap.release()
cv2.destroyAllWindows() 

At this point, it's interesting to run and see:

3. Position calculation

Our experiment requires dragging the square, and there must be times when it is not dragged. Therefore, we might as well obtain the position of the index finger (8) and the middle finger (12) according to the previous step . If the two are close, we will be between him and the square. When overlapping, the coordinates of the block are changed according to the position of the finger.

Full code:

import cv2
import time
import math
import numpy as np
import mediapipe as mp

# mediapipe配置
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_hands = mp.solutions.hands
hands =  mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=2,
    min_detection_confidence=0.5)


# 调用摄像头 0 默认摄像头 
cap = cv2.VideoCapture(0)

# cv2.namedWindow("frame", 0)
# cv2.resizeWindow("frame", 960, 640)


# 获取画面宽度、高度
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))


# 方块初始数组
x = 100
y = 100
w = 100
h = 100

L1 = 0
L2 = 0

on_square = False
square_color = (0, 255, 0)

# 读取一帧帧照片
while True:
    # 返回frame图片
    rec,frame = cap.read()
    
    # 镜像
    frame = cv2.flip(frame,1)
    
    
    
    frame.flags.writeable = False
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # 返回结果
    results = hands.process(frame)

    frame.flags.writeable = True
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    
    
    # 如果结果不为空
    if results.multi_hand_landmarks:


        # 遍历双手(根据读取顺序,一只只手遍历、画画)
        # results.multi_hand_landmarks n双手
        # hand_landmarks 每只手上21个点信息
        for hand_landmarks in results.multi_hand_landmarks:
            mp_drawing.draw_landmarks(
                frame,
                hand_landmarks,
                mp_hands.HAND_CONNECTIONS,
                mp_drawing_styles.get_default_hand_landmarks_style(),
                mp_drawing_styles.get_default_hand_connections_style())
            
            # 记录手指每个点的x y 坐标
            x_list = []
            y_list = []
            for landmark in hand_landmarks.landmark:
                x_list.append(landmark.x)
                y_list.append(landmark.y)
                
            
            # 获取食指指尖
            index_finger_x, index_finger_y = int(x_list[8] * width),int(y_list[8] * height)

            # 获取中指
            middle_finger_x,middle_finger_y = int(x_list[12] * width), int(y_list[12] * height)


            # 计算两指尖距离
            finger_distance = math.hypot((middle_finger_x - index_finger_x), (middle_finger_y - index_finger_y))

            # 如果双指合并(两之间距离近)
            if finger_distance < 60:

                # X坐标范围 Y坐标范围
                if (index_finger_x > x and index_finger_x < (x + w)) and (
                        index_finger_y > y and index_finger_y < (y + h)):

                    if on_square == False:
                        L1 = index_finger_x - x
                        L2 = index_finger_y - y
                        square_color = (255, 0, 255)
                        on_square = True

            else:
                # 双指不合并/分开
                on_square = False
                square_color = (0, 255, 0)

            # 更新坐标
            if on_square:
                x = index_finger_x - L1
                y = index_finger_y - L2
            
            

    # 图像融合 使方块不遮挡视频图片
    overlay = frame.copy()
    cv2.rectangle(frame, (x, y), (x + w, y + h), square_color, -1)
    frame = cv2.addWeighted(overlay, 0.5, frame, 1 - 0.5, 0)
    

    # 显示画面
    cv2.imshow('frame',frame)
    
    # 退出条件
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    
cap.release()
cv2.destroyAllWindows() 

Guess you like

Origin blog.csdn.net/suic009/article/details/126534975