A brief analysis of NIO principles (2)

IO classification

Blocking and non-blocking

Blocking IO: a user-space program triggers a system call into kernel space, and the call does not return to user space until the kernel has fully completed the IO operation. "Blocking" describes the execution state of the user-space program: the user thread must wait until the IO operation has finished. In Java, sockets are blocking by default, as the sketch below illustrates.
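
As an illustration (a minimal sketch, not taken from the original article; the port number is an arbitrary assumption), the following Java snippet shows the default blocking behaviour: both accept() and read() suspend the calling thread until the kernel has a connection or data ready.

import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockingReadSketch {
    public static void main(String[] args) throws Exception {
        // ServerSocket / Socket are blocking by default
        try (ServerSocket server = new ServerSocket(8080)) {
            Socket client = server.accept();        // blocks until a client connects
            InputStream in = client.getInputStream();
            byte[] buf = new byte[1024];
            int n = in.read(buf);                   // blocks until data arrives (-1 on EOF)
            System.out.println("read " + n + " bytes");
            client.close();
        }
    }
}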

Non-blocking IO: a user-space program triggers a system call into kernel space but does not have to wait for the kernel IO operation to finish; the kernel returns a value to the user immediately, and the user-space program continues with its own work, i.e. it stays in a non-blocking state. In Java, non-blocking IO is configured per channel; taking a socket as an example, see the code below:

ServerSocketChannel serverChannel = ServerSocketChannel.open(); // open a server socket channel
serverChannel.configureBlocking(false);                         // switch it to non-blocking mode

If you are interested, you can also refer to the documentation of the Socket class in Java.

Synchronous and asynchronous

Synchronous IO: describes how calls are initiated between user space and kernel space. In synchronous IO, the user-space thread is the party that actively initiates the IO request, while kernel space is the passive receiver.

Asynchronous IO: also describes how calls are initiated between user space and kernel space, but with the roles reversed. In asynchronous IO, the user-space thread is the passive receiver, while the kernel is the party that actively drives the IO and notifies user space when it is done.
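
To make this role reversal concrete, here is a minimal sketch of asynchronous IO using Java's AsynchronousServerSocketChannel (the port and class name are illustrative assumptions, not from the original article): the user thread only registers a completion handler and is called back once the operating system has finished the accept and the read.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

public class AsyncIoSketch {
    public static void main(String[] args) throws Exception {
        AsynchronousServerSocketChannel server =
                AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(9000));

        // The user thread only registers a callback; it is notified when the OS completes the accept.
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override
            public void completed(AsynchronousSocketChannel client, Void attachment) {
                server.accept(null, this);            // keep accepting further connections
                ByteBuffer buf = ByteBuffer.allocate(1024);
                client.read(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override
                    public void completed(Integer bytesRead, ByteBuffer b) {
                        System.out.println("read " + bytesRead + " bytes asynchronously");
                    }
                    @Override
                    public void failed(Throwable exc, ByteBuffer b) {
                        exc.printStackTrace();
                    }
                });
            }
            @Override
            public void failed(Throwable exc, Void attachment) {
                exc.printStackTrace();
            }
        });

        Thread.sleep(Long.MAX_VALUE);                 // keep the demo process alive
    }
}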

Four common IO models

Server-side programming needs to build a high-performance IO model. There are four common IO models, discussed below.


Synchronous blocking IO (Blocking IO):

Based on the definitions above, synchronous blocking IO means that user space actively initiates the call and then has to wait for kernel space to complete the IO operation before control returns to user space. During this period the user-space thread is in a blocked state.

BIO advantages: the programming model is simple. While blocked waiting for data, the user thread is suspended and consumes essentially no CPU.

BIO disadvantages: each request typically needs its own thread. Under high concurrency, the memory footprint and thread-switching overhead become very high.

Application example: connecting to a database through a thread pool in Java uses the synchronous blocking IO model; the sketch below shows the same thread-per-connection pattern for a socket server.
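
As a sketch of that thread-per-connection pattern (an echo server is used here instead of a database pool purely for illustration; the pool size and port are assumptions): each accepted connection occupies one pool thread, which then blocks on that client's reads.

import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPerRequestSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(100); // one worker per concurrent client
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket client = server.accept();      // blocks until a connection arrives
                pool.submit(() -> handle(client));    // each connection occupies a thread
            }
        }
    }

    private static void handle(Socket client) {
        try (Socket c = client; InputStream in = c.getInputStream()) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {        // this worker thread blocks here per client
                c.getOutputStream().write(buf, 0, n); // echo the data back
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}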

Synchronous non-blocking IO (Non-blocking IO):

If the socket is set to non-blocking, a system call in the NIO model results in one of the following two situations (see the polling sketch after the list):

(1) When there is no data in the kernel buffer, a system call initiated from user space returns a failure ("not ready") result immediately.

(2) When there is data in the kernel buffer, a system call initiated from user space enters a blocking state while the data is copied from the kernel buffer to the user buffer; the blocking state is released only after the data has been returned successfully.

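A minimal sketch of the polling that cases (1) and (2) imply, using a Java SocketChannel in non-blocking mode (the target address localhost:8080 is an assumption, and a real program would not spin in a tight loop like this):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class NonBlockingPollSketch {
    public static void main(String[] args) throws Exception {
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);              // switch the socket to non-blocking mode
        channel.connect(new InetSocketAddress("localhost", 8080));
        while (!channel.finishConnect()) {
            // connect also returns immediately; keep polling until the connection is established
        }

        ByteBuffer buf = ByteBuffer.allocate(1024);
        while (true) {
            int n = channel.read(buf);                 // returns immediately in non-blocking mode
            if (n > 0) {
                // case (2): data was in the kernel buffer and has now been copied into buf
                System.out.println("read " + n + " bytes");
                buf.clear();
            } else if (n == 0) {
                // case (1): nothing ready yet; the call did not block, so we simply poll again
            } else {
                break;                                 // -1: the peer closed the connection
            }
        }
    }
}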

Advantages of NIO: each IO system call returns to the thread immediately, even while the kernel is still waiting for buffer data, so the thread is never blocked and responsiveness is better.

Disadvantages of NIO: the thread has to keep polling by repeatedly issuing system calls, which consumes a lot of CPU time and makes resource utilization very low.

IO Multiplexing

First, understand multiplexing literally:

  • Multi-way: Multiple socket network connections

  • Reuse: reuse one thread to check the readiness status of multiple sockets (also known as file handles)

IO multiplexing is a synchronous IO model that uses one thread to monitor multiple file handles. As soon as a file handle becomes ready, the application is notified so it can perform the corresponding read or write. When no file handle is ready, the application blocks and yields its CPU time slice.

By summarizing the previous two IO models, we can find:

For high-concurrency scenarios, the disadvantage of the synchronous blocking model is the heavy memory usage and frequent thread switching, which makes it very inefficient. The disadvantage of synchronous non-blocking is that system calls must be issued in a polling loop from user space, which causes frequent switching between kernel mode and user mode and wastes a lot of resources.

IO multiplexing avoids this frequent switching between kernel mode and user mode, because it moves the polling of sockets (file handles) into kernel mode, eliminating the repeated transitions between kernel mode and user mode.

An example

We take the basic socket model as an example to show the mechanism of IO multiplexing:

The following is the basic socket model pseudocode:

listenSocket = socket();               // socket() system call: create an active socket
bind(listenSocket);                    // bind the active socket to an address and port
listen(listenSocket);                  // turn the active socket into the server's passive (listening) socket
while (true) {
    // loop, waiting for client connection requests
    connSocket = accept(listenSocket); // accept a client connection and obtain the connected socket
    recv(connSocket);                  // read data from the client; only one client can be handled at a time
    send(connSocket);                  // send data back to the client; only one client can be handled at a time
}

The process of network communication is shown in the figure below:
(Figure: network communication flow of the basic socket model)

The socket communication shown in the figure above is a typical synchronous blocking model. When there are a large number of client connections, its processing performance is poor. This dilemma can be solved with IO multiplexing; a Java Selector sketch of the multiplexed version follows.
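
As a rough Java counterpart of the pseudocode above (a minimal sketch assuming a simple echo service on port 8080, not code from the original article), the multiplexed version registers the listening socket and every connected socket with a single Selector, and one thread serves whichever of them becomes ready:

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MultiplexedEchoSketch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel listenChannel = ServerSocketChannel.open();
        listenChannel.bind(new InetSocketAddress(8080));
        listenChannel.configureBlocking(false);
        listenChannel.register(selector, SelectionKey.OP_ACCEPT);    // monitor the listening socket

        while (true) {
            selector.select();                                       // blocks until at least one channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = listenChannel.accept();   // already ready, so this does not block
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ); // monitor the new connection as well
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);                        // data is already in the kernel buffer
                    if (n == -1) {
                        key.cancel();
                        client.close();
                    } else if (n > 0) {
                        buf.flip();
                        client.write(buf);                           // echo the data back to the client
                    }
                }
            }
        }
    }
}

On Linux, a Java Selector is typically backed by epoll (or poll on older releases), so the readiness checking happens in kernel mode, which is exactly the advantage described earlier; the select, poll and epoll mechanisms themselves are discussed next.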


In Linux, the operating system provides three multiplexing mechanisms: select, poll, and epoll.

select mechanism

Four questions

1. How many sockets, at most, can IO multiplexing monitor?

2. Which socket events can IO multiplexing monitor?

3. How does IO multiplexing sense which file descriptors (fd) are ready?

4. How does IO multiplexing implement network communication?

First, look at the definition of the select function on the Linux platform. For more depth, you can refer to two articles: "Linux kernel select source code analysis" and "Linux select source code analysis".

/**
 * Parameters:
 *   __nfds      number of file descriptors to monitor (highest-numbered fd + 1)
 *   __readfds, __writefds, __exceptfds  the three sets of descriptors being monitored
 *   __timeout   how long to block waiting for an event
 * Return value: the number of ready file descriptors (0 on timeout, -1 on error)
 */
int select(int __nfds, fd_set * __readfds, 
           fd_set * __writefds, fd_set * __exceptfds, 
           struct timeval * __timeout)

The file descriptors monitored by the select function are divided into three categories: __readfds, __writefds, and __exceptfds. Suppose the __readfds set is being monitored. When the user calls select, the kernel copies the __readfds set from user space into kernel space, then traverses the corresponding sockets' skb (SocketBuffer) structures in kernel space, invoking the poll logic of each one to determine whether a readable event has occurred on that socket. If no socket is readable, the calling process goes to sleep. When some socket becomes readable, the user-space program is woken up; it then traverses the monitored set in user space to find the ready descriptors and reads the data.


Defects of the select multiplexing method:

1. Calling select requires copying the socket list from user mode to kernel mode. In highly concurrent scenarios this consumes considerable resources.

2. The number of file descriptors that can be monitored is limited by FD_SETSIZE: 1024 sockets on 32-bit machines and 2048 on 64-bit machines.

3. When some socket in the monitored fd list has readable data, the application still has to traverse the whole fd list in user mode to find it, which has a time complexity of O(n).

poll

Compared with select, poll fixes the second flaw: it uses a dynamic array of pollfd structures instead of select's bitmap, breaking through the 1024 limit. However, poll does not solve flaws 1 and 3; the socket list still has to be copied from user mode to kernel mode on every call, which remains an excessive resource cost.

Origin blog.csdn.net/sinat_28199083/article/details/132586683