TCP/IP protocol -- the reason for the existence of the TIME_WAIT state

1. A first look at the actual problem
        While investigating why new TCP connections to external services could not be created, we found a large number of TCP connections in the TIME_WAIT state on the online servers (more than 100,000 on the worst single machine; the module that triggered the alarm accounted for about 20,000). As a result, that module could not establish new TCP connections to its downstream module.
        TIME_WAIT involves the state transitions that occur when a TCP connection is released, as well as the effect of specific socket API calls on the TCP state. The sections below introduce these concepts step by step.
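One way to observe this symptom on a Linux machine is to count sockets per TCP state. The sketch below is only an illustration and assumes the IPv4 text format of /proc/net/tcp, in which the fourth column holds a hexadecimal state code and 06 denotes TIME_WAIT. The function name count_tcp_states is hypothetical, not part of any library.

```python
from collections import Counter

# State codes found in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            print(count_tcp_states(f.read()))
    except OSError:
        print("/proc/net/tcp is not available on this system")
```

A steadily growing TIME_WAIT count on a machine that opens many short-lived outbound connections matches the situation described above.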

2. TCP state transition
       Because TCP is connection-oriented, the two endpoints must establish a TCP connection before they exchange data. A connection can be abstracted as a four-tuple (sometimes called a socket pair): (local_ip, local_port, remote_ip, remote_port). These four elements uniquely identify a TCP connection.
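The four-tuple can be read directly from a connected socket. The following is a minimal sketch over the loopback interface, assuming IPv4; the helper names connection_four_tuple and demo are illustrative.

```python
import socket


def connection_four_tuple(sock):
    """Return (local_ip, local_port, remote_ip, remote_port) for a connected IPv4 TCP socket."""
    local_ip, local_port = sock.getsockname()
    remote_ip, remote_port = sock.getpeername()
    return (local_ip, local_port, remote_ip, remote_port)


def demo():
    # Listen on an ephemeral loopback port so the demo needs no external network.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        return connection_four_tuple(client), connection_four_tuple(server_side)
    finally:
        client.close()
        server_side.close()
        listener.close()


if __name__ == "__main__":
    client_view, server_view = demo()
    print("client view:", client_view)
    print("server view:", server_view)
```

The two endpoints see the same four values with the local and remote halves swapped, which is why the tuple identifies one connection.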
       1) TCP Connection Establishment
       The process of establishing a TCP connection is usually called the "three-way handshake", which can be illustrated by the following figure:

       [Figure: TCP three-way handshake]

       The figure can be explained as follows:
        a. The client sends a SYN to the server, announcing its initial sequence number J;
        b. The server sends its own SYN, announcing its initial sequence number K, and in the same segment acknowledges the client's SYN with ACK J+1 (J+1 means that the server expects the next sequence number from the client to be J+1);
        c. After receiving the server's SYN+ACK, the client sends ACK K+1. At this point the TCP connection is established.
        In practice, the SYN segments exchanged during the three-way handshake also carry options such as MSS and timestamps. Those are protocol details that this article does not go into.
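The sequence and acknowledgment numbers in steps a to c can be written out as a small model. This is a sketch of the arithmetic only, not of a real TCP stack; the function name and the dictionary layout are assumptions made for illustration.

```python
def three_way_handshake(j, k):
    """Return the three handshake segments for client ISN j and server ISN k."""
    mod = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around
    return [
        # a. client -> server: SYN with initial sequence number J
        {"from": "client", "flags": "SYN", "seq": j % mod},
        # b. server -> client: its own SYN (K) plus ACK J+1
        {"from": "server", "flags": "SYN+ACK", "seq": k % mod, "ack": (j + 1) % mod},
        # c. client -> server: ACK K+1 (the SYN consumed one sequence number)
        {"from": "client", "flags": "ACK", "seq": (j + 1) % mod, "ack": (k + 1) % mod},
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(100, 300):
        print(segment)
```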

       2) TCP Connection Termination
       In contrast to the three-way handshake used to establish a connection, releasing a TCP connection takes four steps (often called the "four-way wave"), as shown in the following figure:

       [Figure: TCP four-step connection termination]

       The figure can be explained as follows:
       a. One side of the connection calls close() first, initiating an active close. This prompts its TCP layer to send a FIN to the remote peer, indicating that the application that initiated the active close will send no more data. (Note that "send no more data" is a promise made at the application layer. At the TCP layer, any data still waiting in the kernel's TCP send buffer for that connection is still sent to the peer.)
       After the remote peer receives the FIN, it performs the passive close, which takes two steps:
       b. First, its TCP layer sends an ACK for the FIN (the acknowledgment number is the FIN's sequence number plus 1);
       c. Then the application receives an EOF (end-of-file; the peer's FIN is delivered to the application as EOF) and learns that no more data will arrive on this connection, so it also calls close(), which prompts its TCP layer to send a FIN.
       d. When the peer that initiated the active close receives this FIN, it sends an ACK. At this point the four-step exchange is complete.
       Note 1: Either side of a TCP connection can call close() first to initiate an active close. The figure shows the client doing so, but this does not mean that only the client can initiate an active close.
       Note 2: The descriptions of connection establishment and release above ignore protocol details such as retransmission and congestion control. Interested readers can consult the TCP RFC documents, for example RFC 793.
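Steps a and c can be observed from the application layer: data written before close() is still delivered, and the peer's FIN shows up as an empty read (EOF). The sketch below runs both ends over loopback in one process; the function name active_close_demo and the 5-second timeout are illustrative choices.

```python
import socket


def active_close_demo():
    """Client performs the active close; the server side reads until EOF."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # avoid hanging forever if something goes wrong

    client.sendall(b"last words")
    client.close()  # active close: queued data is still sent, followed by FIN

    received = []
    while True:
        chunk = server_side.recv(4096)
        if chunk == b"":  # EOF: the peer's FIN has been delivered
            break
        received.append(chunk)

    server_side.close()  # passive close: this side now sends its own FIN
    listener.close()
    return b"".join(received)


if __name__ == "__main__":
    print(active_close_demo())
```

In this run the client is the active closer, so the client's end of the connection is the one that lingers in TIME_WAIT afterwards.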

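       3) TCP State Transition Diagram
       Having covered how TCP establishes and releases a connection, this section gives an overall picture of the TCP state machine. The state transition diagram described in RFC 793 is reproduced below (the figure is quoted from an external source):

       [Figure: TCP state transition diagram from RFC 793]

       The TCP state machine has 11 states, and transitions between them are driven by socket API calls and by arriving segments. The diagram looks intricate, but readers with some TCP network programming experience should find it fairly easy to follow. For reasons of space this article does not go through it in detail. Newcomers who want to understand the individual transitions are advised to read Section 2.6 of "Unix Network Programming, Volume 1".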
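The normal open and close paths of the state machine can be captured in a small lookup table. This is a simplified sketch: the event names are invented for illustration, and reset, timeout, and simultaneous-open transitions are left out.

```python
# (state, event) -> next state, for the normal open and close paths only.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    # Active close: the side that calls close() first.
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",
    ("FIN_WAIT_1", "recv_fin_ack"): "TIME_WAIT",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
    # Passive close: the side that receives the first FIN.
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(state, events):
    """Apply a sequence of events and return the resulting state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


if __name__ == "__main__":
    print(run("ESTABLISHED", ["close", "recv_ack", "recv_fin"]))  # active closer
    print(run("ESTABLISHED", ["recv_fin", "close", "recv_ack"]))  # passive closer
```

In this table every path that starts with close() from ESTABLISHED passes through TIME_WAIT, while the passive closer returns to CLOSED directly from LAST_ACK.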

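3. The TIME_WAIT state

With the groundwork laid, we finally reach the main topic of this article. ^_^
        As the TCP state transition diagram shows, only the side that calls close() first, initiating the active close, enters the TIME_WAIT state, and it must do so (the three transition lines in the lower-left corner of the diagram all pass through TIME_WAIT before returning to the initial CLOSED state).
        The diagram also shows that a connection in TIME_WAIT stays there for 2MSL before returning to the initial state. MSL stands for Maximum Segment Lifetime, the longest time a segment can survive in the network. Every TCP implementation must choose a suitable MSL value: RFC 1122 recommends 2 minutes, while Berkeley-derived TCP implementations typically use 30 seconds. The typical duration of TIME_WAIT is therefore between 1 and 4 minutes.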
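The 1 to 4 minute range follows directly from doubling the two MSL values quoted above. A tiny sketch of the arithmetic, with an illustrative function name:

```python
def time_wait_seconds(msl_seconds):
    """TIME_WAIT lasts twice the Maximum Segment Lifetime."""
    return 2 * msl_seconds


if __name__ == "__main__":
    for label, msl in [("Berkeley-derived MSL", 30), ("RFC 1122 recommended MSL", 120)]:
        print(f"{label}: {msl} s -> TIME_WAIT {time_wait_seconds(msl)} s")
```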
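       TIME_WAIT exists for two main reasons:

       1) To release TCP's full-duplex connection reliably
       Refer to the four-step termination figure earlier in this article. Suppose the ACK sent by the active closer (the client in the figure), which is the last of the four segments, is lost in the network. TCP's retransmission mechanism then makes the passive closer (the server in the figure) resend its FIN. Until that FIN reaches the client, the client must keep the state of this connection even though it has already called close(). Specifically, the (local_ip, local_port) resources of the connection cannot be released or reallocated immediately. Once the remote peer's retransmitted FIN arrives and the client resends its ACK, the connection can go on to return to the initial CLOSED state. If the active closer did not enter TIME_WAIT to keep this state, its TCP layer would answer the retransmitted FIN with an RST, which the peer would interpret as an error (when in fact this is a normal close, not an abnormal one).
       2) To let old segments expire in the network
       To illustrate, suppose TCP had no TIME_WAIT restriction, and suppose there is a TCP connection (local_ip, local_port, remote_ip, remote_port) that we close and then, for some reason, very quickly re-create with the same four-tuple. As described earlier, a TCP connection is uniquely identified by its four-tuple, so in this scenario the TCP stack cannot tell the two connections apart. To the stack they are the same connection, and it cannot "perceive" the release and re-establishment in between. The following could then happen: data sent by the local peer on the old connection arrives at the remote peer, whose TCP layer accepts it as normal data of the current connection and passes it up to the application. (In this scenario the old connection was already closed, and a new connection with the same four-tuple already established, before the old data arrived, so that data should not be delivered to the application.) The result is corrupted data and all kinds of unpredictable, strange behavior. As a reliable transport protocol, TCP must anticipate and prevent this at the protocol level, and that is the second reason TIME_WAIT exists.
       Specifically, after the local peer performs the active close, its connection ends up in TIME_WAIT, and while the connection is in that state no new connection can be established with the same four-tuple. In other words, the local port held by the active closer cannot be reallocated during TIME_WAIT. Because TIME_WAIT lasts 2MSL, old segments in both directions of the old connection are guaranteed to have expired (exceeded MSL) and disappeared. After that, a new connection with the same four-tuple can be established without data from the two connections being mixed up.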

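Another, more in-depth explanation

TIME_WAIT exists for two reasons.

(1) It makes the four-step close more reliable. The last ACK of the four steps is sent by the active closer. If that ACK is lost, the passive closer sends its FIN again. If the active closer stays in TIME_WAIT for 2MSL, there is a much better chance that the lost ACK gets sent again.

(2) It prevents lost duplicates from corrupting later, legitimate connections. Lost duplicates are very common in real networks, often because a router fails and routes do not converge, so a packet loops among routers A, B, and C. The TTL field in the IP header limits the maximum number of hops a packet can take, so such a packet has two possible fates: either its TTL reaches 0 and it disappears from the network, or the routes converge before the TTL reaches 0 and it finally reaches its destination on its remaining hops. Unfortunately for that packet, TCP's timeout retransmission already sent an identical copy earlier, and the copy arrived first, so the late packet is destined to be discarded by the TCP stack. A second concept is the incarnation: a new connection with exactly the same socket pair as an earlier one is called an incarnation of the previous connection. A lost duplicate combined with an incarnation can cause fatal errors in a transfer. TCP is stream-oriented, segments may arrive out of order, and the TCP stack reassembles them by sequence number. Suppose an incarnation is expecting seq=1000 and a lost duplicate with seq=1000, len=1000 arrives: TCP considers the lost duplicate valid and places it in the receive buffer, corrupting the transfer. A 2MSL TIME_WAIT ensures that all lost duplicates have disappeared, so they cannot cause errors on a new connection.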
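The seq=1000 example can be made concrete with a deliberately simplified version of the receiver's in-window check. This sketch only tests whether a segment starts inside the receive window; a real stack applies the full acceptability rules of RFC 793 and often further checks. The function name and the window size of 65535 are assumptions for illustration.

```python
def segment_acceptable(rcv_nxt, rcv_wnd, seq, length):
    """Simplified check: does a data segment start inside the receive window?"""
    mod = 2 ** 32  # sequence numbers wrap around at 32 bits
    if length <= 0 or rcv_wnd <= 0:
        return False
    return (seq - rcv_nxt) % mod < rcv_wnd


if __name__ == "__main__":
    # The incarnation expects seq=1000; an old duplicate from the previous
    # connection arrives with seq=1000, len=1000 and passes the check.
    print(segment_acceptable(rcv_nxt=1000, rcv_wnd=65535, seq=1000, length=1000))
    # A duplicate far outside the window would be rejected.
    print(segment_acceptable(rcv_nxt=1000, rcv_wnd=65535, seq=500000, length=1000))
```

Nothing in the sequence number itself tells the receiver which connection a segment belonged to, which is why the protocol relies on waiting 2MSL instead.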

 

Q: When writing a TCP/SOCK_STREAM server program, what exactly does SO_REUSEADDR mean?

 

A: This socket option tells the kernel that if the port is busy but the connection using it is in the TIME_WAIT state, the port may be reused. If the port is busy and the TCP state is anything else, reusing the port still fails with the error "Address already in use". The option is useful when your server program has stopped and you want to restart it immediately, with the new socket using the same port. Be aware that any unexpected data arriving at this point may confuse the server program. This is only a possibility, however, and in practice it is very unlikely.
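A minimal sketch of a listening socket that sets the option, assuming a Unix-like system (the option behaves differently on Windows). The helper name make_listener, the loopback address, and the backlog value are illustrative.

```python
import socket


def make_listener(port=0, host="127.0.0.1"):
    """Create a listening TCP socket with SO_REUSEADDR enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind() to affect this bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the kernel pick a free port
    sock.listen(128)
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```

A restarted server would pass its fixed service port instead of 0, so that bind() can succeed even while connections from the previous run are still in TIME_WAIT.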

 

http://www.mamicode.com/info-detail-190400.html

http://www.cnblogs.com/li-hao/archive/2011/12/08/2280678.html

 
