Supervision + YOLOv8: Tackling a Wide Range of Computer Vision Tasks

1. Introduction to Supervision

Supervision is a framework for a wide range of computer vision tasks. It provides convenient, efficient utilities for visual processing, making it easy to work with datasets and visualize detection results. It offers multiple ways to draw detection results, plus ready-made building blocks for common CV tasks such as counting detections inside a specified zone, line-crossing counting, sliced inference, and track smoothing. In short, with Supervision you can avoid writing a lot of visualization and bookkeeping code in CV projects.


GitHub: https://github.com/roboflow/supervision

Official documentation: https://supervision.roboflow.com/latest/

YOLOv8 supports the full range of vision AI tasks, including detection, segmentation, pose estimation, tracking, and classification, with strong performance in both speed and accuracy, which makes it a good fit for applications with demanding speed and precision requirements.

For an introduction to using YOLOv8, see this article:

Training a Custom Object Detection Model by Fine-Tuning YOLOv8

The experiments below use YOLOv8 as the underlying detection algorithm and Supervision for visualization and task-level processing on top of it.

Installing Supervision

Prerequisite: Supervision requires Python >= 3.8, so check that your environment's Python version qualifies.
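A quick way to verify this at runtime is a one-off version check; a minimal sketch using only the standard library:

```python
import sys

# Supervision requires Python >= 3.8; fail fast with a clear message if not.
if sys.version_info < (3, 8):
    raise RuntimeError(f"Python 3.8+ required, found {sys.version.split()[0]}")
print("Python version OK:", sys.version.split()[0])
```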

For a lightweight install suitable for headless server-side applications, install the base package directly:

pip install supervision -i https://pypi.tuna.tsinghua.edu.cn/simple

If you need the full version with GUI support, specify supervision[desktop], which bundles the OpenCV components:

pip install "supervision[desktop]" -i https://pypi.tuna.tsinghua.edu.cn/simple

2. Object Tracking

Object tracking means continuously identifying and localizing one or more targets across a video sequence, along with their motion trajectories. Supervision provides the ByteTrack class, which makes this quick to implement.

First, download a test video. Supervision ships with several sample videos so you can experiment quickly:

from supervision.assets import download_assets, VideoAssets
video_name = download_assets(VideoAssets.PEOPLE_WALKING)
print(video_name)

Running this prints the path of the downloaded video:


The implementation:

import cv2
import supervision as sv
from ultralytics import YOLO
from supervision.assets import download_assets, VideoAssets

# Download the video if it is not already present
video_name = download_assets(VideoAssets.PEOPLE_WALKING)
# Load the YOLOv8n model (downloaded automatically if missing)
model = YOLO("yolov8n.pt")
# Tracker
tracker = sv.ByteTrack()
# Initialize the annotators
corner_annotator = sv.BoxCornerAnnotator(
    corner_length=15,
    thickness=2,
    color=sv.Color(r=255, g=255, b=0)
)
label_annotator = sv.LabelAnnotator()
trace_annotator = sv.TraceAnnotator(
    trace_length=100,
    color=sv.Color(r=255, g=0, b=0)
)
# Open the video
cap = cv2.VideoCapture(video_name)
# Initialize the timer
prev_tick = cv2.getTickCount()
while True:
    # Read the next frame
    ret, frame = cap.read()
    if not ret:
        break
    # The source frames are large; downscale for processing and display
    frame = cv2.resize(frame, (1280, 720))
    result = model(
        frame,
        conf=0.5,
        device=[0]  # use device='cpu' for CPU inference
    )[0]
    detections = sv.Detections.from_ultralytics(result)
    # Update the tracker
    detections = tracker.update_with_detections(detections)
    labels = [f"{tracker_id} {result.names[class_id]}"
              for class_id, tracker_id
              in zip(detections.class_id, detections.tracker_id)]
    # Draw the corner boxes
    frame = corner_annotator.annotate(frame, detections=detections)
    # Draw the labels
    frame = label_annotator.annotate(frame, detections=detections, labels=labels)
    # Draw the traces
    frame = trace_annotator.annotate(frame, detections=detections)

    # Compute the frame rate
    current_tick = cv2.getTickCount()
    fps = cv2.getTickFrequency() / (current_tick - prev_tick)
    prev_tick = current_tick
    cv2.putText(frame, "FPS: {:.2f}".format(fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    cv2.imshow('video', frame)
    cv2.waitKey(1)

cap.release()
cv2.destroyAllWindows()

Result:


3. Track Smoothing

Raw tracker output often contains noise and discontinuities, which can make trajectories jittery and unnatural. Supervision provides the DetectionsSmoother class to address this.

The implementation:

import cv2
import supervision as sv
from ultralytics import YOLO
from supervision.assets import download_assets, VideoAssets

# Download the video if it is not already present
video_name = download_assets(VideoAssets.PEOPLE_WALKING)
# Load the YOLOv8n model (downloaded automatically if missing)
model = YOLO("yolov8n.pt")
# Tracker
tracker = sv.ByteTrack()
# Track smoother
smoother = sv.DetectionsSmoother(length=4)
# Initialize the annotators
corner_annotator = sv.BoxCornerAnnotator(
    corner_length=15,
    thickness=2,
    color=sv.Color(r=255, g=255, b=0)
)
label_annotator = sv.LabelAnnotator()
trace_annotator = sv.TraceAnnotator(
    trace_length=100,
    color=sv.Color(r=255, g=0, b=0)
)
# Open the video
cap = cv2.VideoCapture(video_name)
# Initialize the timer
prev_tick = cv2.getTickCount()
while True:
    # Read the next frame
    ret, frame = cap.read()
    if not ret:
        break
    # The source frames are large; downscale for processing and display
    frame = cv2.resize(frame, (1280, 720))
    result = model(
        frame,
        device=[0]  # use device='cpu' for CPU inference
    )[0]
    detections = sv.Detections.from_ultralytics(result)
    # Update the tracker
    detections = tracker.update_with_detections(detections)
    # Smooth the tracks
    detections = smoother.update_with_detections(detections)

    labels = [f"{tracker_id} {result.names[class_id]}"
              for class_id, tracker_id
              in zip(detections.class_id, detections.tracker_id)]
    # Draw the corner boxes
    frame = corner_annotator.annotate(frame, detections=detections)
    # Draw the labels
    frame = label_annotator.annotate(frame, detections=detections, labels=labels)
    # Draw the traces
    frame = trace_annotator.annotate(frame, detections=detections)

    # Compute the frame rate
    current_tick = cv2.getTickCount()
    fps = cv2.getTickFrequency() / (current_tick - prev_tick)
    prev_tick = current_tick
    cv2.putText(frame, "FPS: {:.2f}".format(fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    cv2.imshow('video', frame)
    cv2.waitKey(1)

cap.release()
cv2.destroyAllWindows()

Result:


4. Line-Crossing Counting

By defining a line in the video and incrementing a counter each time a target crosses it, you get the crossing statistics commonly needed in traffic and security monitoring. Supervision provides the LineZone class for this.

The implementation:

import cv2
import supervision as sv
from ultralytics import YOLO
from supervision.assets import download_assets, VideoAssets

# Download the video if it is not already present
video_name = download_assets(VideoAssets.VEHICLES)
# Load the YOLOv8n model (downloaded automatically if missing)
model = YOLO("yolov8n.pt")

# Define the counting line
start = sv.Point(0, 400)
end = sv.Point(1280, 400)
# Initialize the line-zone counter
line_zone = sv.LineZone(
    start=start,
    end=end
)
# Tracker
tracker = sv.ByteTrack()
# Initialize the annotators
trace_annotator = sv.TraceAnnotator()
label_annotator = sv.LabelAnnotator(
    text_scale=1
)
line_zone_annotator = sv.LineZoneAnnotator(
    thickness=1,
    text_thickness=1,
    text_scale=1
)

# Open the video
cap = cv2.VideoCapture(video_name)
# Initialize the timer
prev_tick = cv2.getTickCount()
while True:
    # Read the next frame
    ret, frame = cap.read()
    if not ret:
        break
    # The source frames are large; downscale for processing and display
    frame = cv2.resize(frame, (1280, 720))
    result = model(
        frame,
        device=[0]  # use device='cpu' for CPU inference
    )[0]
    detections = sv.Detections.from_ultralytics(result)
    # Update the tracker
    detections = tracker.update_with_detections(detections)
    # Update the line-zone counter
    crossed_in, crossed_out = line_zone.trigger(detections)
    print(f'in:{line_zone.in_count}', f'out:{line_zone.out_count}')
    # Build a label for each bounding box
    labels = [f"{result.names[class_id]}" for class_id in detections.class_id]
    # Draw the traces
    frame = trace_annotator.annotate(frame, detections=detections)
    # Draw the labels
    frame = label_annotator.annotate(frame, detections=detections, labels=labels)
    # Draw the counting line
    frame = line_zone_annotator.annotate(frame, line_counter=line_zone)

    # Compute the frame rate
    current_tick = cv2.getTickCount()
    fps = cv2.getTickFrequency() / (current_tick - prev_tick)
    prev_tick = current_tick
    cv2.putText(frame, "FPS: {:.2f}".format(fps), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    cv2.imshow('video', frame)
    cv2.waitKey(1)

cap.release()
cv2.destroyAllWindows()

Result:


5. Sliced Inference

When the image is very large, or the targets are very small, detection quality suffers. One remedy is to slice the image into several regions, run detection on each slice, then merge the results and filter them with NMS. Supervision wraps this workflow up too, in the InferenceSlicer class.

The implementation:

import cv2
import supervision as sv
from ultralytics import YOLO
import numpy as np

# Load the YOLOv8n model
model = YOLO("yolov8n.pt")

# Detection callback run on each slice
def callback(image_slice: np.ndarray) -> sv.Detections:
    result = model(image_slice, verbose=False)[0]
    return sv.Detections.from_ultralytics(result)

# Initialize the slicer
slicer = sv.InferenceSlicer(
    callback=callback,            # callback applied to each slice
    slice_wh=(320, 320),          # slice size
    overlap_ratio_wh=(0.3, 0.3),  # overlap ratio between adjacent slices
    iou_threshold=0.5,            # IoU threshold for the merge NMS
    thread_workers=4              # number of worker threads
)
# Annotators
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

# Read the image
image = cv2.imread("img/3.png")
result = model(
    image,
    device=[0]  # use device='cpu' for CPU inference
)[0]
detections = sv.Detections.from_ultralytics(result)
image1 = box_annotator.annotate(image.copy(), detections=detections)
# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image1 = label_annotator.annotate(image1, detections=detections, labels=labels)
# Show the result without slicing
cv2.imshow('img', image1)
cv2.waitKey(0)

# Sliced inference
detections = slicer(image)
image2 = box_annotator.annotate(image.copy(), detections=detections)
# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image2 = label_annotator.annotate(image2, detections=detections, labels=labels)
# Show the sliced result
cv2.imshow('img', image2)
cv2.waitKey(0)

Result without slicing:


Result with slicing:


Notice that the car in the upper-right corner is now detected, but a false positive also appears on the left; it can be removed with a confidence-threshold filter.

6. Annotation Helpers

As the tasks above show, Supervision wraps up the result-rendering work for you. Here are a few more of its annotation styles.

6.1 Object Detection

Bounding-box annotation:

import cv2
import supervision as sv
from ultralytics import YOLO

# Load the YOLOv8n model
model = YOLO("yolov8n.pt")

# Box annotator
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

# Read the image
image = cv2.imread("img/img.png")
result = model(
    image,
    device=[0]  # use device='cpu' for CPU inference
)[0]
detections = sv.Detections.from_ultralytics(result)
image = box_annotator.annotate(image, detections=detections)
# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image = label_annotator.annotate(image, detections=detections, labels=labels)
cv2.imshow('img', image)
cv2.waitKey(0)

Corner-box annotation:

import cv2
import supervision as sv
from ultralytics import YOLO

# Load the YOLOv8n model
model = YOLO("yolov8n.pt")

# Corner-box annotator
corner_annotator = sv.BoxCornerAnnotator(
    corner_length=15,
    thickness=2,
    color=sv.Color(r=255, g=0, b=0)
)
label_annotator = sv.LabelAnnotator()

# Read the image
image = cv2.imread("img/img.png")
result = model(
    image,
    device=[0]  # use device='cpu' for CPU inference
)[0]
detections = sv.Detections.from_ultralytics(result)
image = corner_annotator.annotate(image, detections=detections)
# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image = label_annotator.annotate(image, detections=detections, labels=labels)
cv2.imshow('img', image)
cv2.waitKey(0)


Triangle-marker annotation:

import cv2
import supervision as sv
from ultralytics import YOLO

# Load the YOLOv8n model
model = YOLO("yolov8n.pt")

# Triangle-marker annotator
triangle_annotator = sv.TriangleAnnotator(
    base=30,
    height=30,
    position=sv.Position.TOP_CENTER,
    color=sv.Color(r=0, g=255, b=0)
)

label_annotator = sv.LabelAnnotator()

# Read the image
image = cv2.imread("img/img.png")
result = model(
    image,
    device=[0]  # use device='cpu' for CPU inference
)[0]
detections = sv.Detections.from_ultralytics(result)

# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image = label_annotator.annotate(image, detections=detections, labels=labels)

# Draw the triangle markers
image = triangle_annotator.annotate(image, detections=detections)
cv2.imshow('img', image)
cv2.waitKey(0)


6.2 Segmentation

import cv2
import supervision as sv
from ultralytics import YOLO

# Load the YOLOv8n segmentation model
model = YOLO("yolov8n-seg.pt")

# Mask annotator for segmentation results
mask_annotator = sv.MaskAnnotator()
label_annotator = sv.LabelAnnotator()

# Read the image
image = cv2.imread("img/img.png")
result = model(
    image,
    device=[0]  # use device='cpu' for CPU inference
)[0]
detections = sv.Detections.from_ultralytics(result)

# Draw the labels
labels = [f"{class_name} {confidence:.2f}"
          for class_name, confidence
          in zip(detections['class_name'], detections.confidence)]
image = label_annotator.annotate(image, detections=detections, labels=labels)

# Draw the masks
image = mask_annotator.annotate(image, detections=detections)
cv2.imshow('img', image)
cv2.waitKey(0)
