Deploying YOLOv8-seg on RK3588: "Failed to call RockChipRga interface" error

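Setup: Ubuntu 22.04 and an RK3588 development board. Following the official procedure, I converted my own trained model to .rknn format, cross-compiled the program, and deployed it to the board. Inference ran and produced the output image and segmentation results, but an error was reported along the way: Failed to call RockChipRga interface, please use 'dmesg' command to view driver error log.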

1. Full error output

The complete output is as follows:

load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 13
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=587, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=19, scale=0.103071
  index=1, name=onnx::ReduceSum_595, n_dims=4, dims=[1, 2, 80, 80], n_elems=12800, size=12800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003492
  index=2, name=600, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003492
  index=3, name=566, n_dims=4, dims=[1, 32, 80, 80], n_elems=204800, size=204800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=11, scale=0.019777
  index=4, name=607, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=11, scale=0.102898
  index=5, name=onnx::ReduceSum_615, n_dims=4, dims=[1, 2, 40, 40], n_elems=3200, size=3200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003167
  index=6, name=619, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003167
  index=7, name=573, n_dims=4, dims=[1, 32, 40, 40], n_elems=51200, size=51200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=47, scale=0.027842
  index=8, name=626, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=32, scale=0.081311
  index=9, name=onnx::ReduceSum_634, n_dims=4, dims=[1, 2, 20, 20], n_elems=800, size=800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.002693
  index=10, name=638, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.002693
  index=11, name=580, n_dims=4, dims=[1, 32, 20, 20], n_elems=12800, size=12800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-2, scale=0.048666
  index=12, name=559, n_dims=4, dims=[1, 32, 160, 160], n_elems=819200, size=819200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-119, scale=0.031364
model is NHWC input fmt
model input height=640, width=640, channel=3
scale=1.000000 dst_box=(0 140 639 499) allow_slight_change=1 _left_offset=0 _top_offset=140 padding_w=0 padding_h=280
rga_api version 1.10.1_[0]
fill dst image (x y w h)=(0 0 640 640) with color=0x72727272
 RgaCollorFill(1819) RGA_COLORFILL fail: Invalid argument
 RgaCollorFill(1820) RGA_COLORFILL fail: Invalid argument
374 im2d_rga_impl rga_task_submit(2171): Failed to call RockChipRga interface, please use 'dmesg' command to view driver error log.
374 im2d_rga_impl rga_dump_channel_info(1500): src_channel: 
  rect[x,y,w,h] = [0, 0, 0, 0]
  image[w,h,ws,hs,f] = [0, 0, 0, 0, rgba8888]
  buffer[handle,fd,va,pa] = [0, 0, 0, 0]
  color_space = 0x0, global_alpha = 0x0, rd_mode = 0x0

374 im2d_rga_impl rga_dump_channel_info(1500): dst_channel: 
  rect[x,y,w,h] = [0, 0, 640, 640]
  image[w,h,ws,hs,f] = [640, 640, 640, 640, rgb888]
  buffer[handle,fd,va,pa] = [10, 0, 0, 0]
  color_space = 0x0, global_alpha = 0xff, rd_mode = 0x1

374 im2d_rga_impl rga_dump_opt(1550): opt version[0x0]:

374 im2d_rga_impl rga_dump_opt(1551): set_core[0x0], priority[0]

374 im2d_rga_impl rga_dump_opt(1554): color[0x72727272] 
374 im2d_rga_impl rga_dump_opt(1563): 

374 im2d_rga_impl rga_task_submit(2180): acquir_fence[-1], release_fence_ptr[0x0], usage[0x280000]

rknn_run
-- matmul_by_cpu_uint8 use: 43.733002 ms
-- resize_by_opencv_uint8 use: 7.195000 ms
-- crop_mask_uint8 use: 2.451000 ms
-- seg_reverse use: 1.316000 ms
stem @ (239 50 251 90) 0.869
stem @ (408 90 414 123) 0.814
fruits @ (373 124 464 335) 0.808
fruits @ (219 90 279 231) 0.808
write_image path: out.png width=640 height=360 channel=3 data=0x7f890f7010

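2. Troubleshooting

  • Following the hint in the error message, I watched the kernel log with dmesg in one terminal and re-ran the failing program in another: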
dmesg -w
[ 4626.900161] rga_policy: invalid function policy
[ 4626.900171] rga_job: job assign failed
[ 4626.900173] rga_job: failed to get scheduler, rga_job_commit(407)
[ 4626.900178] rga_job: request[13] task[0] job_commit failed.
[ 4626.900181] rga_job: rga request commit failed!
[ 4626.900183] rga: request[13] submit failed!
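
When the kernel log is busy, it can help to pull out only the RGA driver lines. The following is a minimal sketch, assuming the log lines have the `[ timestamp] source: message` shape seen above. The function name `extract_rga_messages` and the final non-RGA sample line are made up for illustration.

```python
import re

# Matches lines such as "[ 4626.900161] rga_policy: invalid function policy".
RGA_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(rga\w*):\s+(.*)$")


def extract_rga_messages(dmesg_text):
    """Return (timestamp, source, message) tuples for RGA driver lines only."""
    messages = []
    for line in dmesg_text.splitlines():
        match = RGA_LINE.match(line.strip())
        if match:
            timestamp, source, message = match.groups()
            messages.append((float(timestamp), source, message))
    return messages


# The first three lines are copied from the dmesg output above;
# the last line is invented filler to show that unrelated lines are skipped.
sample = """\
[ 4626.900161] rga_policy: invalid function policy
[ 4626.900171] rga_job: job assign failed
[ 4626.900183] rga: request[13] submit failed!
[ 4627.000000] usb 1-1: new high-speed USB device
"""

for timestamp, source, message in extract_rga_messages(sample):
    print(f"{timestamp:.6f} {source}: {message}")
```

On the board, the text could come from saving the output of `dmesg` to a file and reading it back.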

  • From what I read, the cause seems to be that the requested memory is too large. The official advice is to refer to: https://gitee.com/hihope-rk3588/rk3588-librga/blob/master/samples/allocator_demo/src/rga_allocator_dma32_demo.cpp

  • Looking at the source pulled from Rockchip, rknn_model_zoo/examples/yolov8_seg/cpp, the allocation there is indeed the 4G one, so that should not be the problem.

3. Comparison tests

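  • After the checks above turned up nothing wrong, I began to suspect the official code itself. For comparison, I downloaded the official YOLOv8 model file yolov8x-seg.pt, converted it to .onnx and then to .rknn, and ran it on the official test image bus.jpg. No error occurred.
  • This shows the official source code is fine. I then ran the official model on one of my own images (bus.jpg is 640x640, while my test image is 640x360) and got the same Failed to call RockChipRga interface error.
  • So is the difference in test image resolution the cause? I changed my test image to 640x640 by padding it with black and tested again. The error was gone.
  • Running my own trained model on the resized image also worked without the error.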
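
The link between image size and the error can be illustrated with the letterbox arithmetic. The sketch below uses the standard aspect-preserving letterbox calculation, which is an assumption here and not the rknn_model_zoo code itself (the real code may also adjust sizes for alignment, as `allow_slight_change=1` in the log suggests). The function name `letterbox_geometry` is hypothetical. For a 640x360 input it reproduces the `dst_box`, `_top_offset` and `padding_h` values printed in the log, and it shows that a 640x640 input needs no padding, so presumably no fill is needed in that case.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Compute scale, padding and destination box for an aspect-preserving fit."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = int(src_w * scale)
    new_h = int(src_h * scale)
    padding_w = dst_w - new_w
    padding_h = dst_h - new_h
    left = padding_w // 2
    top = padding_h // 2
    return {
        "scale": scale,
        "padding_w": padding_w,
        "padding_h": padding_h,
        "left_offset": left,
        "top_offset": top,
        # Inclusive corners: (x1, y1, x2, y2), as in the log's dst_box.
        "dst_box": (left, top, left + new_w - 1, top + new_h - 1),
        # Padding means the border must be filled with a background color.
        "needs_fill": padding_w > 0 or padding_h > 0,
    }


for size in [(640, 360), (640, 640)]:
    geometry = letterbox_geometry(size[0], size[1], 640, 640)
    print(size, geometry["dst_box"], "needs_fill =", geometry["needs_fill"])
```

The 640x360 case is the only one of the two that requires a background fill, which is the step where the log shows `RGA_COLORFILL fail: Invalid argument`.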

Conclusion

My guess is that the post-processing code does not handle the input image size, so that part of the code will need to be modified.

My comparison results

PS: bus.jpg is the official 640x640 image, ExperimentData0003.png is my 640x360 test image, and output_image.jpg is my test image padded to 640x640.

Using Rockchip's official code, cross-compiled:
(√)                                                               ./rknn_yolov8_seg_demo model/yolov8x-seg.rknn model/bus.jpg
(inference OK, but reports Failed to call RockChipRga interface)  ./rknn_yolov8_seg_demo model/yolov8x-seg.rknn model/ExperimentData0003.png
(√)                                                               ./rknn_yolov8_seg_demo model/yolov8x-seg.rknn model/output_image.jpg
(√)                                                               ./rknn_yolov8_seg_demo model/yolov8x-seg.rknn model/output_image.png

(X, Segmentation fault)                                           ./rknn_yolov8_seg_demo model/tomato_seg.rknn model/bus.jpg
(inference OK, but reports Failed to call RockChipRga interface)  ./rknn_yolov8_seg_demo model/tomato_seg.rknn model/ExperimentData0003.png
(X, Segmentation fault)                                           ./rknn_yolov8_seg_demo model/tomato_seg.rknn model/output_image.jpg
(√)                                                               ./rknn_yolov8_seg_demo model/tomato_seg.rknn model/output_image.png

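1. The error depends on the input image size. Keep width and height equal where possible (for example 640x640), otherwise RGA reports the error.
2. With my own trained model, use PNG images for inference. JPG input causes a Segmentation fault.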
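
As a workaround along the lines of point 1, the image can be padded to a square before it is passed to the demo. The sketch below is a minimal pure-Python illustration on a packed RGB888 byte buffer. The function name `pad_rgb_to_canvas` is hypothetical, and centering the image on a black canvas is an assumption about where the padding goes. In practice an image library (for example OpenCV's `cv2.copyMakeBorder`) would usually do this.

```python
def pad_rgb_to_canvas(pixels, width, height, canvas_w, canvas_h, fill=(0, 0, 0)):
    """Center a packed RGB888 image on a canvas_w x canvas_h canvas of `fill` color."""
    if len(pixels) != width * height * 3:
        raise ValueError("pixel buffer does not match width * height * 3")
    if width > canvas_w or height > canvas_h:
        raise ValueError("image is larger than the canvas")

    left = (canvas_w - width) // 2
    top = (canvas_h - height) // 2

    # Start from a canvas that is entirely the fill color.
    canvas = bytearray(bytes(fill) * (canvas_w * canvas_h))

    # Copy the source image into the canvas one row at a time.
    row_bytes = width * 3
    for y in range(height):
        src = y * row_bytes
        dst = ((top + y) * canvas_w + left) * 3
        canvas[dst:dst + row_bytes] = pixels[src:src + row_bytes]
    return bytes(canvas)


# A synthetic all-white 640x360 image stands in for the real test image.
source = bytes([255, 255, 255]) * (640 * 360)
padded = pad_rgb_to_canvas(source, 640, 360, 640, 640)
print(len(padded) == 640 * 640 * 3)
```

The padded buffer has 640 * 640 * 3 bytes, which matches the `n_elems=1228800` of the model's input tensor in the log.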

https://github.com/airockchip/librga/blob/main/docs/Rockchip_FAQ_RGA_CN.md