(2003, "Can't connect to MySQL server on '127.0.0.1' (99)")

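  • Cause: under high concurrency, MySQL connections are opened and closed so quickly that sockets pile up in TIME_WAIT and the client runs out of local ports. The (99) in the message is the OS error EADDRNOTAVAIL, "Cannot assign requested address".
  • How to check
    • Use the following command to count current connections by state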
      • $ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
        TIME_WAIT 1788
        ESTABLISHED 3
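    • Possible states and their meaning
      • CLOSED: no connection is active or pending
      • LISTEN: the server is waiting for an incoming connection
      • SYN_RECV: a connection request has arrived; waiting for the ACK
      • SYN_SENT: the application has started to open a connection
      • ESTABLISHED: the normal data-transfer state
      • FIN_WAIT1: the application has said it is finished
      • FIN_WAIT2: the other side has agreed to release
      • TIME_WAIT: waiting for all packets to die off
      • CLOSING: both sides tried to close simultaneously
      • CLOSE_WAIT: the other side has initiated a release
      • LAST_ACK: waiting for the final ACK of our own FIN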
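
The awk one-liner above can also be sketched in Python, which is handy when the counts need to feed a monitoring script. This is only an illustration: `count_tcp_states` is a hypothetical helper name, and it assumes the usual `netstat -n` layout in which TCP lines start with `tcp` and the state is the last column.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from the text printed by `netstat -n`."""
    states = Counter()
    for line in netstat_output.splitlines():
        # Same filter as awk's /^tcp/: keep lines beginning with "tcp" (tcp, tcp6).
        if line.startswith("tcp"):
            fields = line.split()
            # The state is the last column, like awk's $NF.
            states[fields[-1]] += 1
    return states


# Hand-written sample in the usual netstat layout (not real captured output).
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:50000         127.0.0.1:3306          TIME_WAIT
tcp        0      0 127.0.0.1:50001         127.0.0.1:3306          ESTABLISHED
"""
for state, count in count_tcp_states(sample).items():
    print(state, count)
```

To run it against the live system, pass in the standard output of `netstat -n`, for example captured with `subprocess.run`.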
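  • Why TIME_WAIT occurs
    When a TCP endpoint closes a connection, it keeps a small control block in memory recording the IP addresses and port numbers of the recently closed connection. The record is kept only briefly, typically twice the estimated maximum segment lifetime (called 2MSL, often quoted as 2 minutes; Linux uses a fixed 60 seconds). During that time no new connection with the same pair of IP addresses and port numbers can be created.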
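
A rough back-of-the-envelope calculation shows how quickly this exhausts ports. The sketch below assumes the common Linux default local port range of 32768 to 60999 and a 60-second TIME_WAIT; both are assumptions about the machine, so check `net.ipv4.ip_local_port_range` on the actual host. `max_sustained_connect_rate` is a hypothetical helper name.

```python
def max_sustained_connect_rate(port_low: int, port_high: int, time_wait_s: float) -> float:
    """Upper bound on new connections per second from one client IP to one
    (server IP, server port) pair, if every closed connection holds its
    local port in TIME_WAIT for time_wait_s seconds."""
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_s


# Assumed values: default Linux ephemeral port range and a 60 s TIME_WAIT.
rate = max_sustained_connect_rate(32768, 60999, 60)
print(f"about {rate:.1f} new connections per second")
```

Under these assumptions the ceiling is a few hundred short-lived connections per second to a single MySQL endpoint, which a tight connect/close loop can exceed easily.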

  • Solutions

    • Reduce the time connections spend in TIME_WAIT (by changing kernel parameters)

      • The relevant parameters

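        tcp_tw_recycle - BOOLEAN
        Enable fast recycling TIME-WAIT sockets. Default value is 0.
        It should not be changed without advice/request of technical
        experts.
        net.ipv4.tcp_tw_recycle=1 turns on fast recycling of sockets in TIME-WAIT. The default is 0 (off). This option is known to break clients behind NAT and was removed in Linux 4.12, so it only exists on older kernels.

        tcp_tw_reuse - BOOLEAN
        Allow to reuse TIME-WAIT sockets for new connections when it is
        safe from protocol viewpoint. Default value is 0.
        It should not be changed without advice/request of technical
        experts.
        Allows a socket in TIME-WAIT to be reused for a new outgoing TCP connection. The default quoted above is 0 (off). It relies on TCP timestamps being enabled.

        net.ipv4.tcp_fin_timeout
        How long, in seconds, an orphaned connection may stay in FIN_WAIT_2 before it is dropped. The default is 60. It does not shorten the TIME_WAIT period itself, which is fixed at 60 seconds on Linux.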
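      • How to change them
        $ vi /etc/sysctl.conf
        Add or modify the following lines (omit the tcp_tw_recycle line on Linux 4.12 and later, where the option no longer exists), then run sysctl -p to load them:
        net.ipv4.tcp_fin_timeout = 2
        net.ipv4.tcp_tw_recycle = 1
        net.ipv4.tcp_tw_reuse = 1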
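
To confirm which values the running kernel actually uses, the settings can be read back from `/proc/sys`. The following is a minimal sketch, assuming a Linux host; `read_sysctl` is a hypothetical helper, and it returns `None` when the kernel does not expose a key (as happens for `tcp_tw_recycle` on kernels that have dropped it).

```python
from pathlib import Path
from typing import Optional


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the current value of a sysctl key such as 'net.ipv4.tcp_tw_reuse',
    or None if the key is missing or cannot be read."""
    # 'net.ipv4.tcp_tw_reuse' maps to <root>/net/ipv4/tcp_tw_reuse.
    path = Path(root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


for key in (
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_tw_recycle",
    "net.ipv4.tcp_tw_reuse",
):
    print(key, "=", read_sysctl(key))
```

The `root` parameter exists only so the helper can be pointed at a different directory for testing.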

    • Add a delay (100 ms) between connections

      I’m not sure this is something you’ll see in Real Life (tm). Try running
      an overnight test to see whether sleeping for 100 milliseconds between
      connections makes the problem go away. If it does, then you are just
      running out of available TCP ports.

      There’s a delay period after a TCP connection is closed and before the
      same port number can be re-used by another local process.

      If you run your test as it is currently written and after it fails run

      netstat -an

      you should see a large number of connections in the TIME_WAIT state. If
      so then you probably have nothing much to worry about.
      regards
      Steve
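
One way to apply the 100 ms idea in client code is to pause and retry when a connection attempt fails, rather than sleeping before every connection. This is a retry variant, not what the quoted message literally describes. In the sketch below `connect_with_delay` is a hypothetical helper, and `connect` stands for whatever zero-argument callable opens the MySQL connection in your driver.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_delay(
    connect: Callable[[], T],
    retries: int = 5,
    delay_s: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> T:
    """Call connect(); on a listed error, sleep delay_s seconds and try again.

    The last failure is re-raised once `retries` attempts have been made.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except retry_on:
            if attempt == retries:
                raise
            # Give the kernel time to release ports held in TIME_WAIT.
            time.sleep(delay_s)
    raise AssertionError("unreachable")
```

Database drivers usually raise their own exception class for error 2003 rather than `OSError`, so pass that class in `retry_on` when using a real driver.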

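    • Increase the number of client load-generating machines, or have the client and server cycle through several virtual IP addresses so that more connection combinations are available. In this scenario the traffic is mainly to the database, so a database connection pool that reuses MySQL connections also reduces the number of new connections created.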
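
The pooling idea can be sketched with the standard library alone. This is a deliberately minimal illustration, not a production pool: `ConnectionPool` and its `factory` argument are hypothetical names, there is no health check or reconnect logic, and `factory` stands for any zero-argument callable that opens a MySQL connection with your driver.

```python
import queue
from contextlib import contextmanager
from typing import Callable, Iterator


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, factory: Callable[[], object], size: int = 5) -> None:
        self._idle: "queue.Queue[object]" = queue.Queue(maxsize=size)
        # Open every connection up front, so no new sockets (and therefore
        # no new TIME_WAIT entries) are created during normal operation.
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout: float = 5.0) -> Iterator[object]:
        # Block until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)
```

Callers wrap each unit of work in `with pool.connection() as conn:`, so the number of TCP connections to MySQL stays at `size` no matter how many queries run.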


Reposted from blog.csdn.net/yuberhu/article/details/78972791