Notes on resolving a large number of CLOSE_WAIT connections

Problem:

The application cannot connect to mq.

Solution:

First, the analysis:

1. Symptom: checking the state of the connections on port 61616 with netstat showed more than 130 of them in CLOSE_WAIT.
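The check in step 1 can be scripted. The following is a minimal sketch, assuming output in the usual `netstat -ant` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name `count_states` and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_states(netstat_output, port):
    """Count TCP states for connections whose local address ends in :port."""
    counts = Counter()
    suffix = f":{port}"
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[3].endswith(suffix):
            counts[fields[5]] += 1
    return counts


# Hypothetical sample data, not taken from the original incident.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:61616           0.0.0.0:*               LISTEN
tcp        1      0 10.0.0.5:61616          10.0.0.9:51234          CLOSE_WAIT
tcp        1      0 10.0.0.5:61616          10.0.0.9:51240          CLOSE_WAIT
tcp        0      0 10.0.0.5:22             10.0.0.7:40022          ESTABLISHED
"""

if __name__ == "__main__":
    for state, n in count_states(SAMPLE, 61616).items():
        print(state, n)
```

On a real host, the text passed in would be the captured output of `netstat -ant` rather than the sample string.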

2. What causes so many CLOSE_WAIT connections?

2.1 The main cause is that in some cases the application closes its socket connection, but mq is busy reading or writing and does not close its own end of the connection.

2.2 The socket code needs to check the result of every read: if read returns 0, close the connection; if read returns a negative value, check errno, and if it is not EAGAIN, close the connection as well.
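The rule in 2.2 can be sketched in Python, where `recv` returning an empty bytes object corresponds to read returning 0, and `BlockingIOError` corresponds to EAGAIN on a non-blocking socket. This is an illustrative sketch only, not the mq code; the helper name `read_or_close` is hypothetical.

```python
import socket


def read_or_close(sock, bufsize=4096):
    """Read once from a non-blocking socket.

    Returns (data, closed). The socket is closed when the peer has
    sent FIN (empty read) or when a real error occurs.
    """
    try:
        data = sock.recv(bufsize)
    except BlockingIOError:
        # EAGAIN / EWOULDBLOCK: no data yet, the connection is still alive.
        return b"", False
    except OSError:
        # Any other error: give up on the connection and release the handle.
        sock.close()
        return b"", True
    if data == b"":
        # The peer closed its side. Closing here sends our FIN, so the
        # socket does not linger in CLOSE_WAIT.
        sock.close()
        return b"", True
    return data, False


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.setblocking(False)
    print(read_or_close(a))  # nothing to read yet
    b.sendall(b"hi")
    print(read_or_close(a))  # data available
    b.close()
    print(read_or_close(a))  # peer closed, so we close too
```

The point of the sketch is the third case: the passive side only leaves CLOSE_WAIT once it calls close itself.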

3. Why does a buildup of CLOSE_WAIT leave the application unable to connect to mq?

Linux limits the number of file handles available to a user. While a connection stays in the CLOSE_WAIT state, its handle remains occupied. Once the number of handles reaches the limit, new requests cannot be processed, and the application may report a large number of "Too many open files" errors.
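To see how close a process is to that limit, the open descriptors can be compared against the RLIMIT_NOFILE limit. The sketch below inspects only the current Python process and assumes a Linux system with /proc mounted; the function name `fd_usage` is made up for illustration. For a Java process such as mq, the same idea applies to /proc/<pid>/fd and /proc/<pid>/limits.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Each entry in /proc/self/fd is one open descriptor (Linux only).
    open_fds = len(os.listdir("/proc/self/fd"))
    return open_fds, soft, hard


if __name__ == "__main__":
    used, soft, hard = fd_usage()
    print(f"open descriptors: {used}, soft limit: {soft}, hard limit: {hard}")
```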

4. What is CLOSE_WAIT?

4.1 mq is the side being connected to, and the Java service is the active side. In a passive close, mq has received the FIN but has not yet sent its own FIN; during that time the connection is in the CLOSE_WAIT state.

Second, the solution:

1. Restart mq

2. Set the following three Linux parameters:

/proc/sys/net/ipv4/tcp_keepalive_time: only applies when keepalive is enabled; how long a connection may stay idle before TCP starts sending keepalive probes. The default is 7200 seconds (2 hours).

/proc/sys/net/ipv4/tcp_keepalive_intvl: the interval at which a probe is re-sent when no acknowledgment is received. The default is 75 seconds.

/proc/sys/net/ipv4/tcp_keepalive_probes: the number of keepalive probes TCP sends before deciding that the connection has failed. The default is 9. Multiplying this value by tcp_keepalive_intvl gives how long a connection can go without responding after keepalive probing has started.
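As a rough worked example of how the three parameters combine, the time before a dead peer is detected is approximately the idle time plus the probe interval multiplied by the probe count. The function name below is hypothetical, and the result is an estimate rather than an exact kernel guarantee.

```python
def keepalive_detection_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Approximate seconds before an idle, dead connection is dropped."""
    return keepalive_time + keepalive_intvl * keepalive_probes


if __name__ == "__main__":
    # Kernel defaults: 7200 s idle, 75 s interval, 9 probes.
    print(keepalive_detection_seconds(7200, 75, 9))  # 7875
    # Values used later in this article (interval left at its default).
    print(keepalive_detection_seconds(1200, 75, 5))  # 1575
```

With the defaults, a dead peer goes unnoticed for more than two hours, which is why lowering these values helps stale connections get cleaned up sooner. Keepalive probes are only sent on sockets that have SO_KEEPALIVE enabled.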

Third, further reading:

1. The client sends a FIN first and enters the FIN_WAIT1 state.

The server receives the FIN, sends an ACK, and enters the CLOSE_WAIT state; the client receives the ACK and enters the FIN_WAIT2 state.
The server sends its own FIN and enters the LAST_ACK state.
The client receives the FIN, sends an ACK, and enters the TIME_WAIT state; the server receives the ACK and enters the CLOSED state.
The client stays in TIME_WAIT for twice the MSL, which is about 60 s on Linux, and then moves to the CLOSED state.
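The close sequence above can be written down as a small transition table. This is only an illustrative model of the states named in the text, not kernel code; the event labels and the `walk` helper are made up for the sketch.

```python
# (current state, event) -> next state, for the normal four-way close.
TRANSITIONS = {
    # Active closer (the client in the description above)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT1",
    ("FIN_WAIT1", "recv ACK"): "FIN_WAIT2",
    ("FIN_WAIT2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2*MSL timeout"): "CLOSED",
    # Passive closer (the server in the description above)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    client = walk(["send FIN", "recv ACK", "recv FIN, send ACK", "2*MSL timeout"])
    server = walk(["recv FIN, send ACK", "send FIN", "recv ACK"])
    print(" -> ".join(client))
    print(" -> ".join(server))
```

In this model the only way out of CLOSE_WAIT is the "send FIN" event, which happens when the passive side closes its socket. If the application never does that, the connection stays in CLOSE_WAIT indefinitely.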

2. The service uses short-lived connections: after each client request, the server actively sends a FIN to close the connection and ends up in the TIME_WAIT state. A heavily loaded web server will therefore accumulate many connections in TIME_WAIT. To let the server recycle and reuse these TIME_WAIT resources quickly, you can modify kernel parameters.

Modify /etc/sysctl.conf as follows:
# For a new connection, the number of SYN connection requests the kernel sends before deciding to give up; should not be greater than 255. The default is 5, corresponding to about 180 seconds
net.ipv4.tcp_syn_retries = 2
# Only applies when keepalive is enabled: how long a connection stays idle before TCP sends keepalive messages. The default is 2 hours; here it is set to 1200 seconds (20 minutes)
net.ipv4.tcp_keepalive_time = 1200

net.ipv4.tcp_orphan_retries = 3
# If the socket is closed at the request of the local end, this parameter determines how long it stays in the FIN-WAIT-2 state
net.ipv4.tcp_fin_timeout = 30
# Length of the SYN queue; the default is 1024. Increasing it (here to 4096) lets the system hold more network connections that are waiting to be established
net.ipv4.tcp_max_syn_backlog = 4096
# Enables SYN cookies. When the SYN queue overflows, cookies are used to handle the requests, which can protect against small-scale SYN attacks; the default is 0 (disabled)
net.ipv4.tcp_syncookies = 1

# Enables reuse: allows TIME-WAIT sockets to be reused for new TCP connections; the default is 0 (disabled)
net.ipv4.tcp_tw_reuse = 1
# Enables fast recycling of TIME-WAIT sockets; the default is 0 (disabled)
# Note: tcp_tw_recycle was removed in Linux 4.12, so this key only exists on older kernels
net.ipv4.tcp_tw_recycle = 1

## Reduce the number of probes sent before the connection is considered timed out
net.ipv4.tcp_keepalive_probes = 5
## Tune the network device receive queue
net.core.netdev_max_backlog = 3000
After making the changes, run /sbin/sysctl -p for the parameters to take effect.
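After reloading, it can be useful to confirm that the running kernel actually holds the configured values. The sketch below parses text in sysctl.conf format and maps each key to its file under /proc/sys (dots become slashes). The function names are hypothetical, and keys that do not exist on the running kernel are simply reported as missing.

```python
import os


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key):
    """Map a sysctl key such as net.ipv4.tcp_fin_timeout to its /proc/sys path."""
    return "/proc/sys/" + key.replace(".", "/")


def compare_with_kernel(settings):
    """Yield (key, wanted, actual); actual is None if the key is missing."""
    for key, wanted in settings.items():
        path = proc_path(key)
        actual = None
        if os.path.exists(path):
            with open(path) as f:
                actual = f.read().strip()
        yield key, wanted, actual


if __name__ == "__main__":
    sample = "# example\nnet.ipv4.tcp_fin_timeout = 30\nnet.ipv4.tcp_syncookies = 1\n"
    for key, wanted, actual in compare_with_kernel(parse_sysctl_conf(sample)):
        print(f"{key}: wanted {wanted}, kernel has {actual}")
```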

Origin www.linuxidc.com/Linux/2019-08/160357.htm