Scatter/gather I/O

Overview

Scatter/gather I/O is a method of input and output in which a single system call operates on multiple buffers. It can write the data from several buffers to a single data stream, or read a single data stream into several buffers. The name comes from the fact that data is scattered into, or gathered from, the given vector of buffers. This approach is also called vector I/O. By comparison, the standard read and write system calls (read(), write()) provide linear I/O.

Scatter/gather I/O has several advantages over linear I/O:

More natural coding pattern

If the data is naturally segmented (for example, the fields of a predefined structure), vector I/O provides an intuitive way of handling it.

Higher efficiency

A single vector I/O operation can replace multiple linear I/O operations.

Better performance

In addition to reducing the number of system calls made, vector I/O can provide better performance than linear I/O through internal optimizations.

Atomicity

Unlike a sequence of linear I/O operations, a single vector I/O operation lets a process transfer all of its buffers in one call, with no risk of the data being interleaved with I/O from another process.
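As a rough illustration of the first and last points, the sketch below sends a fixed-size header and a separate payload with one call to writev(), the gather call described in the next section. The struct msg_header layout and the send_record() helper are hypothetical names invented for this example; they do not come from any library.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical record header, invented for this example. */
struct msg_header {
    uint32_t type;
    uint32_t length;   /* payload length in bytes */
};

/*
 * Write a header and its payload with one writev() call, so the two
 * parts do not have to be copied into a temporary buffer first.
 * Returns the number of bytes written, or -1 on error (errno is set).
 */
ssize_t send_record(int fd, uint32_t type, const void *payload, uint32_t len)
{
    struct msg_header hdr = { .type = type, .length = len };
    struct iovec iov[2];

    iov[0].iov_base = &hdr;
    iov[0].iov_len  = sizeof(hdr);
    iov[1].iov_base = (void *)payload;   /* writev() only reads from it */
    iov[1].iov_len  = len;

    return writev(fd, iov, 2);
}
```

With two separate write() calls, another writer on the same descriptor could slip its own data in between the header and the payload; a single writev() call avoids that window.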

readv() and writev()

Linux implements a set of system calls that implement the scatter/gather I/O mechanism defined in POSIX 1003.1-2001. This implementation satisfies all the properties described above.

The readv() function reads count segments (a segment is described by an iovec structure) from the file descriptor fd into the buffers described by the parameter iov:

#include <sys/uio.h>

ssize_t readv(int fd,
              const struct iovec *iov,
              int count);

The writev() function gathers count segments of data from the buffers described by the parameter iov and writes them to fd:

#include <sys/uio.h>

ssize_t writev(int fd,
               const struct iovec *iov,
               int count);

Apart from operating on multiple buffers at once, readv() and writev() behave the same as read() and write(), respectively.

Each iovec structure describes an independent, disjoint buffer, which is called a segment:

#include <sys/uio.h>

struct iovec {
        void    *iov_base;  /* pointer to start of buffer */
        size_t  iov_len;    /* size of buffer in bytes */
};

A collection of segments is called a vector. Each segment describes the address and length of a buffer in memory that data is read into or written from. The readv() function fills all iov_len bytes of the current buffer before moving on to the next one. The writev() function writes out all iov_len bytes of the current buffer before moving on to the next one. Both functions process the segments in order, starting with iov[0], then iov[1], and so on up to iov[count - 1].
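The ordering rule can be seen in a small sketch. The helper name read_two_fields() and the 4-byte field size are assumptions made for illustration only.

```c
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Read up to 8 bytes from fd into two 4-byte buffers.
 * readv() fills `first` completely before it stores anything
 * in `second`.
 * Returns the total number of bytes read, or -1 on error.
 */
ssize_t read_two_fields(int fd, char first[4], char second[4])
{
    struct iovec iov[2];

    iov[0].iov_base = first;
    iov[0].iov_len  = 4;
    iov[1].iov_base = second;
    iov[1].iov_len  = 4;

    return readv(fd, iov, 2);
}
```

If only six bytes are available, for example "ABCDEF" on a pipe, the first buffer receives "ABCD" and the second receives "EF"; the remaining two bytes of the second buffer are left untouched.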

Return value

On success, readv() and writev() return the number of bytes read or written, respectively. This number should equal the sum of all count iov_len values. On error, -1 is returned and errno is set accordingly. These system calls can fail with any of the errors that read() and write() can return, and they set the same errno values when they do. In addition, the standard defines two other error scenarios.
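In practice a call can transfer fewer bytes than the vector describes, for example on a pipe, on a non-blocking socket, or when a signal interrupts it, just as write() can. The following is a minimal sketch of a retry loop for that case; writev_all() is a hypothetical helper, not a standard function, and it modifies the caller's iov array as it goes.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Call writev() until every byte described by iov[0..count-1] has
 * been written. The iov array is modified as segments are consumed.
 * Returns 0 on success, -1 on error (errno is set by writev()).
 */
int writev_all(int fd, struct iovec *iov, int count)
{
    while (count > 0) {
        ssize_t nr = writev(fd, iov, count);
        if (nr == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: retry */
            return -1;
        }

        /* Skip the segments that were written completely. */
        size_t done = (size_t)nr;
        while (count > 0 && done >= iov->iov_len) {
            done -= iov->iov_len;
            ++iov;
            --count;
        }

        /* Advance inside a partially written segment. */
        if (count > 0) {
            iov->iov_base = (char *)iov->iov_base + done;
            iov->iov_len -= done;
        }
    }
    return 0;
}
```

Note that retrying gives up the single-call atomicity described in the overview: the data may reach the file in more than one piece.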

 

In the first scenario, because the return type is ssize_t, if the sum of all count iov_len values is greater than SSIZE_MAX, no data is transferred, -1 is returned, and errno is set to EINVAL.

 

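In the second scenario, POSIX states that count must be greater than 0 and less than or equal to IOV_MAX, which is defined in <limits.h>. On Linux, IOV_MAX is currently 1024. If count is 0, the system call returns 0. If count is greater than IOV_MAX, no data is transferred, -1 is returned, and errno is set to EINVAL.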

 

Optimizing the count value

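During a vector I/O operation, the Linux kernel must allocate internal data structures to represent each segment. Normally this allocation is done dynamically, based on the size of count. As an optimization, however, if count is small enough, the kernel creates a small array of segments on the stack, which avoids the dynamic allocation and gives a modest performance gain. The threshold is currently eight, so if count is less than or equal to 8, the vector I/O operation runs efficiently off the process's kernel stack.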

 

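In most cases you will not have a choice about how many segments to pass in a given vector I/O operation. When you do have some flexibility and are deciding on a small value, choosing eight or fewer improves efficiency.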

 

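The Linux kernel implements readv() and writev() as system calls and uses scatter/gather I/O internally. In fact, all I/O inside the Linux kernel is vector I/O: read() and write() are implemented as vector I/O with a vector that contains only one segment.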
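The same relationship can be mimicked in user space. The sketch below is an illustration only, not kernel code, and read_via_readv() is a hypothetical name: it expresses read() as a readv() call whose vector holds a single segment.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * User-space illustration: read() expressed as readv() with a
 * one-segment vector. Returns what readv() returns.
 */
ssize_t read_via_readv(int fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    return readv(fd, &iov, 1);
}
```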

 

Examples

writev() example:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>   /* close() */

int main(void)
{
    struct iovec iov[3];
    ssize_t nr;
    int fd, i;

    char *buf[] = {
        "Just because you can do it, doesn't mean that you have to.\n",
        "Just because you can do it, doesn't mean that you have to.\n",
        "Just because you can do it, doesn't mean that you have to.\n" };

    /* O_CREAT requires a mode argument; 0644 is rw-r--r-- */
    fd = open("c++.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /*
     * fill out three iovec structures; the + 1 means each string's
     * terminating '\0' is written to the file as well
     */
    for (i = 0; i < 3; ++i) {
        iov[i].iov_base = buf[i];
        iov[i].iov_len  = strlen(buf[i]) + 1;
    }

    /* with a single call, write them all out */
    nr = writev(fd, iov, 3);
    if (nr == -1) {
        perror("writev");
        return 1;
    }

    if (close(fd)) {
        perror("close");
        return 1;
    }

    return 0;
}

 

readv() example:

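#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>   /* close() */

int main(void)
{
    /*
     * Each segment written by the writev() example is 60 bytes:
     * 59 characters plus the terminating '\0'. The buffers are one
     * byte larger and zero-filled, so they stay NUL-terminated even
     * if the file holds different data.
     */
    char foo[61] = "", bar[61] = "", baz[61] = "";
    struct iovec iov[3];
    ssize_t nr;
    int fd, i;

    fd = open("c++.txt", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /* set up our iovec structures, leaving the last byte of each buffer as '\0' */
    iov[0].iov_base = foo;
    iov[0].iov_len = sizeof(foo) - 1;
    iov[1].iov_base = bar;
    iov[1].iov_len = sizeof(bar) - 1;
    iov[2].iov_base = baz;
    iov[2].iov_len = sizeof(baz) - 1;

    /* read into the structures with a single call */
    nr = readv(fd, iov, 3);
    if (nr == -1) {
        perror("readv");
        return 1;
    }

    /* print each buffer; each one already ends with "\n" from the file */
    for (i = 0; i < 3; ++i) {
        printf("%d: %s", i, (char *) iov[i].iov_base);
    }

    if (close(fd)) {
        perror("close");
        return 1;
    }

    return 0;
}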
