SPDK Troubleshooting, Part 1

Symptom

Running the SPDK program produces the following error, repeated over and over:


starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status
starting write I/O failed, push back, reback to previous status

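As a result, the program cannot make progress. What is the cause?

Analysis

Restrictions on using an NVMe hardware queue

According to the NVMe specification, a hardware queue consists of one submission queue and one completion queue. The two work together to process an I/O request: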

[Figure: NVMe command processing flow, steps 1 to 8]

The specification describes submission as follows:

When host software builds a command for the controller to execute, it first checks to make sure that the appropriate Submission Queue (SQx) is not full. The Submission Queue is full when the number of entries in the queue is one less than the queue size. Once an empty slot (pFreeSlot) is available:
1. Host software builds a command at SQx[pFreeSlot] with:
   a. CDW0.OPC is set to the appropriate command to be executed by the controller;
   b. CDW0.FUSE is set to the appropriate value, depending on whether the command is a fused operation;
   c. CDW0.CID is set to a unique identifier for the command when combined with the Submission Queue identifier;
   d. The Namespace Identifier, CDW1.NSID, is set to the namespace the command applies to;
   e. MPTR shall be filled in with the offset to the beginning of the Metadata Region, if there is a data transfer and the namespace format contains metadata as a separate buffer;
   f. PRP1 and/or PRP2 (or SGL Entry 1 if SGLs are used) are set to the source/destination of data transfer, if there is a data transfer; and
   g. CDW10 – CDW15 are set to any command specific information; and
2. Host software writes the corresponding Submission Queue doorbell register (SQxTDBL) to submit one or more commands for processing.

The write to the Submission Queue doorbell register triggers the controller to consume one or more new commands contained in the Submission Queue entry. The controller indicates the most recent SQ entry that has been consumed as part of reporting completions. Host software may use this information to determine when SQ slots may be re-used for new commands.
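The host-side submission rules quoted above can be sketched as a small ring buffer in C. This is only an illustrative model, not SPDK or driver code: the names `sim_sq`, `sq_entry`, `sq_is_full`, `sq_submit`, and the queue size of 8 are all hypothetical, and the doorbell is an ordinary struct field standing in for the SQxTDBL register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SQ_SIZE 8u /* hypothetical queue size, in entries */

/* Simplified submission queue entry; only a few fields are modeled. */
struct sq_entry {
    uint8_t  opc;  /* CDW0.OPC  */
    uint16_t cid;  /* CDW0.CID  */
    uint32_t nsid; /* CDW1.NSID */
    uint64_t prp1; /* PRP1      */
};

struct sim_sq {
    struct sq_entry slots[SQ_SIZE];
    uint32_t head;     /* most recent SQ head reported by the controller */
    uint32_t tail;     /* next free slot (pFreeSlot), owned by the host  */
    uint32_t doorbell; /* stands in for the SQxTDBL register             */
};

/* The queue is full when it holds SQ_SIZE - 1 entries. */
bool sq_is_full(const struct sim_sq *sq)
{
    return ((sq->tail + 1) % SQ_SIZE) == sq->head;
}

/* Returns 0 on success, -1 if the queue is full. */
int sq_submit(struct sim_sq *sq, const struct sq_entry *cmd)
{
    if (sq_is_full(sq)) {
        return -1;
    }
    /* Step 1: build the command in the free slot. */
    sq->slots[sq->tail] = *cmd;
    sq->tail = (sq->tail + 1) % SQ_SIZE;
    /* Step 2: only after the slot is filled, write the new tail to the doorbell. */
    sq->doorbell = sq->tail;
    return 0;
}
```

In this model nothing protects `tail` and `doorbell` from concurrent callers, so two threads calling `sq_submit` on the same `sim_sq` at once could overwrite the same slot or publish a tail value for a slot that has not been filled yet.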

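In the figure above, steps 3, 4, 5, and 6 are carried out by the NVMe controller hardware, while steps 1, 2, 7, and 8 are carried out by software on the host. Steps 1 and 2 (building the command in the submission queue, then writing the SQ tail doorbell) must happen strictly in that order, and so must steps 7 and 8 (processing the completion entry, then writing the CQ head doorbell).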
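The host-side completion steps (7 and 8) can be sketched in the same spirit. The model below is an assumption-laden illustration rather than real driver code: `sim_cq`, `cq_init`, `ctrl_post_completion`, and `host_process_completions` are hypothetical names, the controller is simulated by an ordinary function, and completion queue overflow is not modeled. New entries are detected with a phase tag that the simulated controller flips on every pass through the queue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CQ_SIZE 4u /* hypothetical queue size, in entries */

struct cq_entry {
    uint16_t sq_head; /* SQ head pointer reported by the controller */
    uint16_t cid;     /* identifier of the completed command        */
    uint8_t  phase;   /* phase tag written by the controller        */
};

struct sim_cq {
    struct cq_entry slots[CQ_SIZE];
    uint32_t head;           /* host-owned: next entry to examine     */
    uint8_t  expected_phase; /* phase value that marks a new entry    */
    uint32_t head_doorbell;  /* stands in for the CQ head doorbell    */
    uint32_t ctrl_tail;      /* simulated controller: next write slot */
    uint8_t  ctrl_phase;     /* simulated controller: current phase   */
};

void cq_init(struct sim_cq *cq)
{
    memset(cq, 0, sizeof(*cq));
    cq->expected_phase = 1;
    cq->ctrl_phase = 1;
}

/* Simulated controller side: post one completion entry. */
void ctrl_post_completion(struct sim_cq *cq, uint16_t cid, uint16_t sq_head)
{
    struct cq_entry *e = &cq->slots[cq->ctrl_tail];
    e->cid = cid;
    e->sq_head = sq_head;
    e->phase = cq->ctrl_phase;
    if (++cq->ctrl_tail == CQ_SIZE) {
        cq->ctrl_tail = 0;
        cq->ctrl_phase ^= 1;
    }
}

/* Host side: step 7 consumes every new entry, then step 8 rings the doorbell.
 * Returns the number of completions processed. */
int host_process_completions(struct sim_cq *cq, uint16_t *last_sq_head)
{
    int n = 0;
    while (cq->slots[cq->head].phase == cq->expected_phase) {
        *last_sq_head = cq->slots[cq->head].sq_head; /* step 7 */
        if (++cq->head == CQ_SIZE) {
            cq->head = 0;
            cq->expected_phase ^= 1;
        }
        n++;
    }
    if (n > 0) {
        cq->head_doorbell = cq->head; /* step 8 */
    }
    return n;
}
```

As with submission, `head` and `expected_phase` are plain unsynchronized state, so the sketch is only correct when a single thread calls `host_process_completions` for a given queue.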

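SPDK's default core binding

Based on this flow, SPDK wraps steps 1, 2, 7, and 8 in APIs, each exposed as a function call. If several threads call these APIs at the same time to drive the same hardware queue pair, the ordering constraints described above can be violated. For this reason, an SPDK thread is by default bound to a specific processor core during initialization.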

@@ -448,7 +448,7 @@ int init(const char * dev_name) {
     spdk_env_opts_init(&opts);
     opts.name = "append_demo";
     opts.shm_id = 0;
     opts.core_mask = "0x8";
     if (spdk_env_init(&opts) < 0) {
         fprintf(stderr, "Unable to initialize Spdk env\n");
         return -1;
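The `core_mask` value in the excerpt above is a hexadecimal bitmask in which bit N selects core N, so "0x8" selects core 3 only. A minimal sketch of that decoding, using a hypothetical helper `cores_from_mask` that is not part of SPDK:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hex mask such as "0x8" and store the indices
 * of the set bits in cores[]. Returns how many cores were found. */
int cores_from_mask(const char *mask, int *cores, int max_cores)
{
    /* Base 16 accepts an optional "0x" prefix. */
    uint64_t bits = strtoull(mask, NULL, 16);
    int n = 0;

    for (int core = 0; core < 64 && n < max_cores; core++) {
        if (bits & (1ULL << core)) {
            cores[n++] = core;
        }
    }
    return n;
}
```

With this helper, "0x8" yields the single core index 3, while a mask such as "0x5" would yield cores 0 and 2.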

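Notes on SPDK threads

The analysis above shows that a single hardware queue pair must not be used by several threads at the same time, but different hardware queue pairs can be used concurrently by different threads.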
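One way to follow this rule is to give every thread its own queue pair and never share it. The sketch below models that pattern with pthreads and a simulated queue pair. `sim_qpair`, `worker`, `worker_main`, and `run_workers` are hypothetical names, and the worker count and ring size are arbitrary. In an actual SPDK program this would roughly correspond to each thread allocating its own queue pair with `spdk_nvme_ctrlr_alloc_io_qpair` and submitting I/O only on that one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_WORKERS     4
#define CMDS_PER_WORKER 100000
#define SIM_SQ_SIZE     128u

/* Simulated queue pair: plain fields, no locking. */
struct sim_qpair {
    uint32_t sq_tail;
    uint64_t submitted;
};

struct worker {
    pthread_t tid;
    struct sim_qpair qpair; /* owned exclusively by this worker */
};

/* Each worker touches only its own queue pair, so no lock is needed. */
void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (int i = 0; i < CMDS_PER_WORKER; i++) {
        w->qpair.sq_tail = (w->qpair.sq_tail + 1) % SIM_SQ_SIZE;
        w->qpair.submitted++;
    }
    return NULL;
}

/* Starts n workers, waits for them, and returns the total submitted count. */
uint64_t run_workers(struct worker *workers, int n)
{
    uint64_t total = 0;

    for (int i = 0; i < n; i++) {
        workers[i].qpair = (struct sim_qpair){0};
        int rc = pthread_create(&workers[i].tid, NULL, worker_main, &workers[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < n; i++) {
        pthread_join(workers[i].tid, NULL);
        total += workers[i].qpair.submitted;
    }
    return total;
}
```

Because no queue pair is shared, every worker's counter ends up exact without any locking. If all workers were pointed at one `sim_qpair` instead, the unsynchronized updates would race, which is the situation the rule above is meant to prevent.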

Verification

After the program was modified in line with the analysis above, the error disappeared immediately.

Reposted from blog.51cto.com/xiamachao/2425054