Pitfalls QUIC has stepped into

1. Unreliable MAX_DATA frame
  - When the bytes the client has sent reach the flow-control limit advertised by the server, the server sends a MAX_DATA (or MAX_STREAM_DATA) frame to let the client enlarge its window. However, QUIC stipulates that ack-only packets must not themselves be acknowledged. So if the packet carrying MAX_DATA is lost and every subsequent packet the client sends is ack-only, the client can tell that a packet went missing but cannot convey that loss back to the server through its acks, and it stays blocked (a sketch of one mitigation follows).
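
A minimal sketch of one way out of this stall, in the spirit of the "bundle a PING so packets aren't ack-only" advice mentioned under item 2 below. The frame names as strings and the probability are illustrative choices, not any real QUIC stack's API:

```python
# Sketch: when the only thing queued is an ACK, occasionally bundle a
# PING (or BLOCKED) frame so the packet becomes ack-eliciting and the
# loss of the peer's MAX_DATA can be noticed and repaired sooner.
import random

def build_packet(pending_frames, blocked: bool, ping_probability: float = 0.25):
    """Return the list of frames to put in the next packet."""
    frames = list(pending_frames)
    ack_only = all(f == "ACK" for f in frames)
    if ack_only and (blocked or random.random() < ping_probability):
        # Make the packet ack-eliciting: the peer must acknowledge it,
        # so loss detection keeps working even while we are blocked.
        frames.append("PING")
    return frames

# Example: a flow-control-blocked client that only has an ACK queued.
print(build_packet(["ACK"], blocked=True))   # ['ACK', 'PING']
```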

2. Unreliable ACK frame

  - Many implementations do not track acknowledgments of their own ACK frames (acks of acks). If many acks are lost, the peer performs large numbers of spurious retransmissions. Since draft-09, QUIC requires tracking acknowledgments for packets that contain ACK frames, and it is recommended to occasionally bundle a MAX_DATA or PING frame so that packets are not ack-only. Since draft-11, ACK frames should also re-include ranges of already-acknowledged old packets, so that the loss of an individual ack does not trigger spurious retransmissions.
  - The timing of ack delivery was only specified after draft-11. Before that there were just two strategies: delayed acks, or one ack per packet. Delaying too long makes recovery slow, while one ack per packet throws away the benefit of ack ranges. Following the pattern of RFC 5681, an ack should be sent for every 2 full-sized packets (the analogue of 2 TCP MSS) or after at most a short delay (25 ms), so acks are never held back excessively; out-of-order packets should be acked immediately to speed up recovery. A sketch of such a scheduler follows this list.
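
A minimal sketch of that ack-sending policy, assuming a 25 ms maximum delay and an ack for every 2 ack-eliciting packets; the class and method names are illustrative:

```python
# Ack scheduling: ack every 2 ack-eliciting packets, or after at most
# 25 ms, and ack immediately on reordering to speed up recovery.
import time

MAX_ACK_DELAY = 0.025   # seconds
ACK_EVERY_N = 2         # analogue of TCP's "every second segment"

class AckScheduler:
    def __init__(self):
        self.unacked_eliciting = 0      # ack-eliciting packets since last ACK
        self.first_unacked_time = None
        self.largest_received = -1

    def on_packet_received(self, pkt_num: int, ack_eliciting: bool) -> bool:
        """Return True if an ACK frame should be sent right now."""
        reordered = pkt_num < self.largest_received
        self.largest_received = max(self.largest_received, pkt_num)
        if not ack_eliciting:
            return False                # never ack ack-only packets
        self.unacked_eliciting += 1
        if self.first_unacked_time is None:
            self.first_unacked_time = time.monotonic()
        if reordered:                   # out of order: ack at once
            return True
        return self.unacked_eliciting >= ACK_EVERY_N

    def ack_timer_expired(self) -> bool:
        """Polled from the event loop: has the 25 ms delayed-ack timer fired?"""
        return (self.first_unacked_time is not None and
                time.monotonic() - self.first_unacked_time >= MAX_ACK_DELAY)

    def on_ack_sent(self):
        self.unacked_eliciting = 0
        self.first_unacked_time = None
```

In practice the transport would call on_packet_received for each arriving packet and also poll ack_timer_expired from its event loop, sending an ACK whenever either says so.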
  

3. Congestion window collapse

  - When multiple packets are lost in a single network fluctuation, a naive implementation halves the window once per lost packet, and subsequent packets then follow the fast-recovery algorithm. The window quickly collapses to a tiny value and recovers very slowly. Since draft-09 a recovery period has been defined: losses of packets sent before the current recovery period began do not reduce the window again, so one burst of losses halves the window only once (ngtcp2 previously had a bug here that led to a long investigation). See the sketch after this list.
  - When a packet is declared lost because of a received ack, the congestion window is halved (cwnd = bytes_in_flight / 2), which obviously leaves bytes_in_flight > cwnd. The sender is then not allowed to send any data until bytes_in_flight falls below cwnd, which can take quite a long time (not resolved yet).
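
A sketch of the recovery-period rule, loosely in the style of the QUIC recovery draft's pseudocode; the constants (1460-byte segments, 10-packet initial window) and names are illustrative:

```python
# One loss burst halves cwnd exactly once: losses of packets sent
# before the current recovery period started do not shrink it again.
import time

class CongestionController:
    def __init__(self, cwnd: int = 10 * 1460):
        self.cwnd = cwnd
        self.bytes_in_flight = 0
        self.recovery_start_time = 0.0   # start of current recovery period

    def in_recovery(self, sent_time: float) -> bool:
        return sent_time <= self.recovery_start_time

    def on_packet_lost(self, sent_time: float, lost_bytes: int):
        self.bytes_in_flight -= lost_bytes
        if not self.in_recovery(sent_time):
            # Start a new recovery period and halve the window once.
            self.recovery_start_time = time.monotonic()
            self.cwnd = max(self.cwnd // 2, 2 * 1460)

    def can_send(self, size: int) -> bool:
        return self.bytes_in_flight + size <= self.cwnd
```

The second bullet's stall shows up in can_send: if a loss leaves cwnd at half of bytes_in_flight, can_send stays False until roughly half the in-flight bytes have been acked.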


4. Restricted RTO retransmission

  - Some implementations subject the 2 packets retransmitted on RTO to congestion control. But when an RTO fires, bytes_in_flight is already at or above cwnd, so those two packets are blocked until bytes_in_flight < cwnd. Earlier QUIC drafts in fact stipulated that the 2 packets sent on RTO are exempt from cwnd, precisely so they can go out, elicit acks from the peer, and speed up recovery. A sketch follows.
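
A sketch of RTO probe sending that deliberately bypasses the congestion window, as the earlier drafts required. It builds on the CongestionController sketch above; the function and parameter names are illustrative:

```python
def on_rto_expired(cc, retransmit_queue, send_packet):
    """Send up to 2 probe packets, ignoring cwnd on purpose.

    cc:               the CongestionController sketched above
    retransmit_queue: list of packets (bytes) eligible for retransmission
    send_packet:      callable that actually writes a packet to the wire
    """
    for _ in range(2):
        if not retransmit_queue:
            break
        pkt = retransmit_queue.pop(0)
        # Deliberately skip cc.can_send(len(pkt)): probes must go out even
        # when bytes_in_flight >= cwnd, or recovery can stall for a long time.
        send_packet(pkt)
        cc.bytes_in_flight += len(pkt)   # probes still count as in flight
```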


5. Unreliable PMTU

  - TCP has its own path-MTU discovery algorithms (classic PMTUD in RFC 1191; packetization-layer probing in RFC 4821). Current implementations (basically all of them) implement nothing equivalent for QUIC. So if a sent packet is larger than the MTU of the current route, it is fragmented into several IP fragments, and losing any one fragment loses the whole QUIC packet, which increases its loss probability. A probing sketch follows.
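
A sketch of packetization-layer probing in the spirit of RFC 4821: binary-search the largest datagram size that survives the path. The send_probe callback is an assumption here, standing in for "send a padded probe packet and report whether it was acked":

```python
def probe_path_mtu(send_probe, lo: int = 1200, hi: int = 1500) -> int:
    """Find the largest working datagram size in [lo, hi].

    send_probe(size) -> bool sends a padded probe of `size` bytes and
    reports whether it was acknowledged. 1200 bytes is QUIC's minimum
    supported datagram size, so it serves as a safe floor.
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if send_probe(mid):      # probe got through: try larger
            best = mid
            lo = mid + 1
        else:                    # probe lost or too big: try smaller
            hi = mid - 1
    return best
```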


6. Dedicated hardware acceleration for TCP

  - Many NICs today support TSO and LRO/GRO, which make TCP segmentation and reassembly simpler and faster. QUIC, running over UDP, gets no such offload.


7. Throttling by operators

  - In actual measurements, the RTT distributions of QUIC and TCP in an intranet environment tend to be the same. On the public network, however, the RTT measured over TCP port 80 at peak hours is more than 10x lower than over UDP port 4433 (TCP 30-40 ms vs. QUIC 300-500 ms). This shows that operators treat UDP far worse than TCP on the public Internet. A rough measurement sketch follows.
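
A rough sketch of how such a comparison can be taken: time a TCP handshake to port 80 against one UDP round trip to port 4433. The UDP side assumes some server at `host` answers datagrams on 4433; the host and ports are illustrative:

```python
import socket
import time

def tcp_rtt(host: str, port: int = 80) -> float:
    """RTT approximated by TCP connect() time (one handshake round trip)."""
    t0 = time.monotonic()
    with socket.create_connection((host, port), timeout=2):
        pass
    return time.monotonic() - t0

def udp_rtt(host: str, port: int = 4433) -> float:
    """RTT approximated by one UDP request/response round trip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(2)
        t0 = time.monotonic()
        s.sendto(b"x" * 64, (host, port))
        s.recvfrom(2048)        # raises socket.timeout if nothing comes back
        return time.monotonic() - t0

if __name__ == "__main__":
    host = "example.com"        # replace with your own test server
    print("tcp:", tcp_rtt(host))
    print("udp:", udp_rtt(host))
```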
