HTTP Lecture 15 - HTTP Connection Management

Short connections

Early HTTP (0.9/1.0) was a very simple protocol, and its communication followed an equally simple "request-response" pattern.
Its underlying data transmission is based on TCP/IP: before sending a request, the client must establish a connection to the server, and the connection is closed immediately after the response message is received.
Because each connection between client and server is very brief and never stays open for long, this style is called "short-lived connections" (short connections). Early HTTP was accordingly also known as a "connectionless" protocol.
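In modern terms, each request pays for its own TCP handshake and teardown. A minimal sketch of this pattern with Python's http.client (the hostname and paths are made up for illustration):

```python
import http.client

# Short connections: a brand-new TCP connection for every single request,
# torn down right after the response, as early HTTP did.
for path in ["/", "/style.css", "/logo.png"]:        # illustrative paths
    conn = http.client.HTTPConnection("www.chrono.com")  # handshake per request
    conn.request("GET", path, headers={"Connection": "close"})
    resp = conn.getresponse()
    resp.read()
    conn.close()                                     # teardown per request
    print(path, resp.status)
```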

For example:

Suppose your company bought a time clock and placed it at the front desk. Because the machine is rather expensive, a special protective cover was made for it, and the company requires everyone to open the cover before punching in or out and to close it again afterwards.
But the cover is very sturdy, and opening and closing it takes real effort. Punching the card itself takes maybe one second, while opening and closing the cover takes four or five; most of the time is wasted on the meaningless open-and-close operation.
You can imagine the result: at rush hour there is a long queue in front of the time clock, and everyone has to repeat the three steps of "open the cover - punch the card - close the cover", grumbling all the while.
In this metaphor, the time clock is the server, opening and closing the cover is the establishment and teardown of a TCP connection, and each person punching a card is an HTTP request. Clearly, the weakness of short connections severely limits the server's capacity, leaving it unable to handle more requests.

Long connections

To address the weaknesses exposed by short connections, HTTP introduced "long connections", also known as "persistent connections", "keep-alive", or "connection reuse".
The solution is actually very simple and uses the idea of amortizing costs: since establishing and closing a TCP connection is so time-consuming, spread that time cost, originally borne by a single "request-response", across multiple "request-response" exchanges.
Although this does not make TCP connection setup itself any faster, thanks to this "larger denominator" effect the overhead attributed to each "request-response" shrinks considerably, and overall transmission efficiency improves.
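A minimal sketch of the reuse idea with Python's http.client, which keeps the underlying socket open across exchanges (again, hostname and paths are illustrative):

```python
import http.client

# A long connection: one TCP connection reused for several exchanges.
conn = http.client.HTTPConnection("www.chrono.com")

for path in ["/", "/style.css", "/logo.png"]:   # illustrative paths
    conn.request("GET", path)       # HTTP/1.1, so keep-alive by default
    resp = conn.getresponse()
    resp.read()                     # drain the body before reusing the socket
    print(path, resp.status)

conn.close()   # close the long connection when we are done with it
```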

[Figure: comparison of a short connection and a long connection]
Continuing the time-clock analogy: the company also felt that the repetitive "open - punch - close" routine was far too "user-hostile", so it issued a new rule: once the cover is opened in the morning it no longer needs to be closed, everyone can punch in freely, and the cover is closed only after work is over for the day.
In this way punch-in efficiency (that is, service capacity) improves dramatically: what used to take five or six seconds per punch now takes only one. The long queues at rush hour are gone for good, and everyone is happy.

Connection-related header fields

Because the performance gain from long connections is so significant, HTTP/1.1 enables them by default. No special header field is required: once the first request has been sent to the server, subsequent requests reuse the TCP connection opened the first time, that is, the long connection, and send and receive data over it.
Of course, the client can also explicitly ask for a long connection in the request header, using the Connection field with the value "keep-alive".
However, whether or not the client asks for it explicitly, a server that supports long connections will always put a "Connection: keep-alive" field in the response message, telling the client: "I support long connections; just keep using this TCP connection to send and receive data."
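As a concrete illustration (the host, path, and sizes are invented for this example), such an exchange might look like:

```http
GET /index.html HTTP/1.1
Host: www.chrono.com
Connection: keep-alive

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 1024

(body ... the TCP connection stays open for the next request)
```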
That said, long connections also have a small downside, and the problem lies precisely in the word "long".
Because the TCP connection is kept open for a long time, the server must hold its state in memory, which consumes server resources. If large numbers of long connections sit idle without continuously sending data, the server's resources are quickly exhausted, leaving it unable to serve the users who genuinely need it.
Therefore, a long connection also has to be closed at an appropriate time; the connection to the server cannot be held forever. This can be done on either the client side or the server side.
On the client side, you can add a "Connection: close" field to the request header, telling the server: "close the connection after this exchange". When the server sees this field, it knows the client intends to close the connection actively, so it adds the same field to its response message and, after sending it, closes the TCP connection via the Socket API.
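An illustrative exchange of this kind might look like (again with a made-up host and path):

```http
GET /last-page.html HTTP/1.1
Host: www.chrono.com
Connection: close

HTTP/1.1 200 OK
Connection: close
Content-Length: 512

(body ... after this, the server closes the TCP connection)
```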
The server side usually does not close connections on its own initiative, but it can apply certain policies. Taking Nginx as an example, there are two approaches, shown in the configuration sketch after this list:
1. Use the keepalive_timeout directive to set a timeout for long connections: if no data is sent or received on a connection within that period, Nginx actively closes it, preventing idle connections from tying up system resources.
2. Use the keepalive_requests directive to cap the number of requests that may be served over one long connection. For example, if it is set to 1000, then once Nginx has processed 1000 requests on a connection it will actively close it as well.
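A minimal Nginx configuration sketch using these two directives (the values are examples, not recommendations):

```nginx
http {
    keepalive_timeout  60s;     # close a long connection idle for 60 seconds
    keepalive_requests 1000;    # close a long connection after 1000 requests
}
```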

In addition, both client and server can add the general header field "Keep-Alive: timeout=value" to their messages to suggest a timeout for the long connection. However, this field is not binding; either side may ignore it, so it is not very common in practice.

Head-of-line blocking

With short and long connections covered, the next topic is the famous "head-of-line blocking" problem (also called "head-of-queue blocking").
"Head-of-line blocking" has nothing to do with short or long connections; it is caused by HTTP's fundamental "request-response" model.
Because HTTP stipulates that messages are handled strictly one request, one response, requests form a first-in, first-out "serial" queue. Requests in the queue have no priorities, only the order in which they arrived, and the request at the head of the queue is always processed first.
If the request at the head is delayed because it is slow to process, then every request queued behind it has to wait as well; the result is that the other requests pay a time cost they never should have borne.
Using the time-clock analogy again: at the start of the workday everyone queues to punch in, but the person at the front hits a machine fault and simply cannot punch in no matter what, sweating profusely. By the time the machine is repaired, everyone queuing behind is already late.
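The cost is easy to see with a toy calculation: on one serial connection, every request's finish time includes the service time of everything ahead of it. A small Python sketch with invented service times:

```python
# Toy model of head-of-line blocking on a single serial connection:
# responses come back strictly in request order, so one slow request
# delays everything queued behind it. Service times are invented.
service_times = [3.0, 0.1, 0.1, 0.1]   # the first request is the slow one

finish = 0.0
for i, t in enumerate(service_times):
    finish += t        # each request must wait for all requests before it
    print(f"request {i}: service {t}s, finishes at {finish:.1f}s")
```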

Performance optimization

Because the "request-response" model cannot be changed, the "head-of-line blocking" problem cannot be solved in HTTP/1.1; it can only be alleviated. So what can be done?
The company can buy a few more time clocks and place them at the front desk, so that people no longer have to squeeze into a single line; they can spread out, and if one line happens to get blocked they can simply switch to another that is moving.
This is HTTP's "concurrent connections": opening multiple long connections to one domain at the same time, using quantity to compensate for quality.
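A sketch of the idea in Python: several worker threads, each with its own long connection, fetching resources in parallel (host and paths are illustrative):

```python
import concurrent.futures
import http.client

HOST = "www.chrono.com"                             # illustrative host
PATHS = ["/a.jpg", "/b.jpg", "/c.jpg", "/d.jpg"]    # illustrative resources

def fetch(path):
    # Each task opens its own long connection, so a slow response on
    # one connection no longer blocks requests on the others.
    conn = http.client.HTTPConnection(HOST)
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return path, resp.status, len(body)

# Six workers, mirroring the 6-8 connection limit of modern browsers.
with concurrent.futures.ThreadPoolExecutor(max_workers=6) as pool:
    for path, status, size in pool.map(fetch, PATHS):
        print(path, status, size)
```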
But this approach has flaws too. If every client wants to be fast and opens many connections, the number of users times the number of concurrent connections becomes astronomical. The server's resources simply cannot sustain that, or the server will treat it as a malicious attack and deny service instead.
Therefore, the HTTP protocol recommends that clients use concurrency but not "abuse" it. RFC 2616 explicitly limited each client to at most 2 concurrent connections. Practice showed, however, that this number was far too small, and many browsers "ignored" the standard and raised the ceiling to 6-8. The later revision, RFC 7230, went along with reality and removed the hard limit of "2".
The company keeps growing, with more and more employees, and punching in and out has become a pressing problem once again. The front desk has limited space, with no room for more machines. What to do? Open a few more punch-in locations: put three or four machines on each floor and at the entrance of the office area, spreading people out further so they no longer all crowd toward the front desk.
This is the "domain sharding" technique, once again using quantity to solve a quality problem.
Don't the HTTP protocol and browsers limit the number of concurrent connections? Fine, then just register a few more domain names, such as shard1.chrono.com and shard2.chrono.com, all pointing at the same server www.chrono.com. The number of usable long connections goes up again. It really is "beautiful", though it does smack of "policies from above, countermeasures from below".
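In DNS terms the trick might look something like this (an illustrative zone-file fragment, not a real configuration):

```
; Sketch: the shard names are just aliases that all resolve
; to the same physical server behind www.chrono.com.
shard1.chrono.com.  IN  CNAME  www.chrono.com.
shard2.chrono.com.  IN  CNAME  www.chrono.com.
```

The browser counts its per-host connection limit separately for each hostname, so spreading a page's resources across the shard names multiplies the number of connections it will open.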

Summary

  1. Early HTTP used short connections, closing the connection as soon as a response was received, which was very inefficient;
  2. HTTP/1.1 enables long connections by default, sending and receiving multiple requests and responses over a single connection, which greatly improves transmission efficiency;
  3. The server sends the "Connection: keep-alive" field to indicate that the long connection is enabled;
  4. A "Connection: close" field in the header means the long connection is about to be closed;
  5. Too many long connections consume server resources, so servers use policies to selectively close them;
  6. The "head-of-line blocking" problem degrades performance; it can be alleviated with the "concurrent connections" and "domain sharding" techniques.

PS: This article is a set of notes taken after watching a Geek Time course.


Origin blog.csdn.net/Elon15/article/details/130730357