Linux Network Programming: A Discussion of Asynchronous I/O

Concepts

The first model is blocking I/O. When the application issues a blocking read request, the thread is suspended until the kernel data is ready and has been copied from the kernel buffer into the application buffer. Only when that copy completes does the read call return, and the application can then process the data in its buffer.


The second model is non-blocking I/O. A non-blocking read request returns immediately if the data is not ready, so the application can keep polling the kernel until the data becomes available, at which point the kernel copies the data into the application buffer and the read call completes. Note that this last read call, the one that actually obtains the data, is a synchronous process: "synchronous" here refers to the copying of data from the kernel buffer into the application buffer.


Having the application poll the kernel for I/O readiness is not economical, because the process can do nothing else while it polls. This is where I/O multiplexing technologies such as select and poll take the stage: through I/O event distribution, the application is notified to act only when the kernel data is ready. This approach greatly improves the CPU utilization of the application process, which can spend its CPU on other work until it is notified.

Note that even here, the read call that actually obtains the data is still a synchronous process.


With the first model, blocking I/O, the application is suspended until the data is obtained. With the second model, non-blocking I/O, and the third, multiplexing built on non-blocking I/O, the operation of waiting for data does not block. Yet all three are synchronous calling techniques. Why? Because the terms "synchronous call" and "asynchronous call" describe the process of obtaining the data, and in all three models the final read that retrieves the data is synchronous. During that read call, the kernel copies the data from kernel space into application space, and this copy is performed synchronously inside the read function. If the kernel's copy implementation is inefficient, the read call spends a correspondingly long time in this synchronous step.

A truly asynchronous call has no such concern. In the fourth I/O technique, the one we are about to introduce, a call such as aio_read returns immediately, and the kernel automatically copies the data from kernel space into application space. This copy process is asynchronous and completed entirely by the kernel; unlike the earlier synchronous operations, the application never has to initiate the copy itself.


The following table summarizes the I/O models above:

  I/O model                 Waiting for readiness      Copying data to the application
  ------------------------  -------------------------  -------------------------------
  Blocking I/O              thread blocks              synchronous
  Non-blocking I/O          application polls          synchronous
  I/O multiplexing          blocks in select/poll      synchronous
  Asynchronous I/O (aio)    returns immediately        asynchronous, done by the kernel

Usage of aio_read and aio_write

First, an example program:

#include <aio.h>
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

const int BUF_SIZE = 512;

int main() {
    int err;
    int result_size;
    // Create a temporary file
    char tmpname[256];
    snprintf(tmpname, sizeof(tmpname), "/tmp/aio_test_%d", getpid());
    unlink(tmpname);
    int fd = open(tmpname, O_CREAT | O_RDWR | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        error(1, errno, "open file failed");
    }
    char buf[BUF_SIZE];
    struct aiocb aiocb;

    // Initialize the buffer; every byte written should be 0xfa
    memset(buf, 0xfa, BUF_SIZE);
    memset(&aiocb, 0, sizeof(struct aiocb));
    aiocb.aio_fildes = fd;
    aiocb.aio_buf = buf;
    aiocb.aio_nbytes = BUF_SIZE;
    // Start the asynchronous write
    if (aio_write(&aiocb) == -1) {
        printf(" Error at aio_write(): %s\n", strerror(errno));
        close(fd);
        exit(1);
    }
    // The write is asynchronous, so poll until it has finished
    while (aio_error(&aiocb) == EINPROGRESS) {
        printf("writing... \n");
    }
    // Check whether the write succeeded
    err = aio_error(&aiocb);
    result_size = aio_return(&aiocb);
    if (err != 0 || result_size != BUF_SIZE) {
        printf(" aio_write() failed: %s\n", strerror(err));
        close(fd);
        exit(1);
    }

    // Now prepare to read the data back
    char buffer[BUF_SIZE];
    struct aiocb cb;
    memset(&cb, 0, sizeof(struct aiocb));
    cb.aio_nbytes = BUF_SIZE;
    cb.aio_fildes = fd;
    cb.aio_offset = 0;
    cb.aio_buf = buffer;
    // Start the asynchronous read
    if (aio_read(&cb) == -1) {
        printf(" aio_read() failed: %s\n", strerror(errno));
        close(fd);
        exit(1);
    }
    // The read is asynchronous, so poll until it has finished
    while (aio_error(&cb) == EINPROGRESS) {
        printf("Reading... \n");
    }
    // Check whether the read succeeded
    int numBytes = aio_return(&cb);
    if (numBytes != -1) {
        printf("Success.\n");
    } else {
        printf("Error.\n");
    }

    // Clean up the file descriptor
    close(fd);
    return 0;
}

Here, the main functions used are:

  • aio_write: used to submit asynchronous write operations to the kernel;
  • aio_read: used to submit asynchronous read operations to the kernel;
  • aio_error: Get the status of the current asynchronous operation;
  • aio_return: Get the number of bytes read and written by asynchronous operations.

The program first submits an asynchronous file write to the kernel using aio_write. The structure aiocb is the asynchronous-request data structure that the application passes to the operating system kernel; here we fill in the file descriptor, the buffer pointer aio_buf, and the number of bytes to write, aio_nbytes.

struct aiocb {
   int       aio_fildes;       /* File descriptor */
   off_t     aio_offset;       /* File offset */
   volatile void  *aio_buf;     /* Location of buffer */
   size_t    aio_nbytes;       /* Length of transfer */
   int       aio_reqprio;      /* Request priority offset */
   struct sigevent    aio_sigevent;     /* Signal number and value */
   int       aio_lio_opcode;       /* Operation to be performed */
};

Next, we use aio_read to read the data back from the file. For this we prepare a new aiocb structure that tells the kernel which buffer the data should be copied into, just as with the asynchronous write. After the asynchronous read is initiated, we repeatedly query the status of the read operation.

When we run this program, we see a series of lines printed on the screen, showing that the actual I/O is performed for us by the kernel in the background.

./aio01
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
writing... 
Reading... 
Reading... 
Reading... 
Reading... 
Reading... 
Reading... 
Reading... 
Reading... 
Reading... 
Success.

Open the aio_test_xxxx file in the /tmp directory and you can see that the data we expect has been successfully written.


Asynchronous socket support under Linux

The aio family of functions is the asynchronous operation interface defined by POSIX. Unfortunately, aio under Linux is not backed by true operating-system-level support: it is implemented in user space by the GNU libc library using pthreads, it covers only disk I/O, and it does not support socket I/O.

Many Linux developers have tried to support aio directly in the operating system kernel. For example, Ben LaHaise successfully merged an aio implementation into kernel 2.5.32. This capability exists as a patch, but it still does not support sockets.

Solaris does have its own aio system, but it is unclear how it performs on sockets, especially compared with disk I/O.

Based on the above, asynchronous I/O support under Linux is very limited, which is the fundamental reason why event-distribution technologies such as epoll, combined with non-blocking I/O, are used to solve the problem of high-concurrency, high-performance network I/O under Linux.

Unlike Linux, Windows implements a complete set of asynchronous programming interfaces that do support sockets, generally called I/O Completion Ports (IOCP). From IOCP arose the so-called Proactor pattern. Both the Reactor pattern and the Proactor pattern are network programming patterns based on event distribution: Reactor is driven by I/O events that are ready to be performed, while Proactor is driven by I/O events that have already completed. At heart, both apply the idea of event distribution to build program frameworks that are compatible, extensible, and friendly to use.


In short, with asynchronous I/O the read and write actions are completed automatically by the kernel. However, Linux currently supports only simple aio operations on local files, which is why we prefer the Reactor pattern and I/O event-distribution technologies such as epoll when writing high-performance network programs. Under Windows, IOCP is a true asynchronous I/O technology, and from it arose the Proactor pattern, as famous as Reactor, with which high-performance network programming on Windows can be achieved.


Learn the new by reviewing the past!



Origin blog.csdn.net/qq_24436765/article/details/104829772