What happens between typing a URL and the web page being displayed


 HTTP

The first job the browser does is to parse the URL

The browser's first step is to parse the URL so that it can generate the request information to send to the web server.

So the long URL in the figure is actually a request for a file resource on the server.

If the URL element shown in blue in the figure is omitted, which file is requested?

When there is no path name, the request is for the preset default file in the root directory, such as /index.html or /default.html, so there is no ambiguity.
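As a rough illustration of the parsing step, the sketch below splits a URL with Python's standard `urllib.parse` module. The function name `parse_url` and the default file name are assumptions made for this example. In practice the browser sends `/` and the server chooses the default file, so folding both into one function is a simplification.

```python
from urllib.parse import urlsplit

def parse_url(url, default_file="/index.html"):
    """Split a URL into the pieces needed to build a request."""
    parts = urlsplit(url)
    # An empty path or a bare "/" is mapped to the assumed default file.
    path = parts.path if parts.path not in ("", "/") else default_file
    return {"scheme": parts.scheme, "host": parts.hostname, "path": path}

print(parse_url("http://www.server.com/dir1/file1.html"))
print(parse_url("http://www.server.com/"))
```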

Generate HTTP request information

After parsing the URL, the browser determines the Web server and file name, and the next step is to generate an HTTP request message based on this information.
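A minimal sketch of what such a request message can look like, assuming a plain GET request. The helper name `build_http_request` and the choice of headers are illustrative, and a real browser sends many more headers.

```python
def build_http_request(host, path, method="GET"):
    """Assemble a bare-bones HTTP/1.1 request message as bytes."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",              # the server name taken from the URL
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_http_request("www.server.com", "/index.html").decode("ascii"))
```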


DNS-real address lookup

After the browser has parsed the URL and generated the HTTP message, it must ask the operating system to send the message to the web server.

Before sending, one more task remains: looking up the IP address that corresponds to the server's domain name, because the operating system must be given the IP address of the communication peer when it is asked to send a message.

It is like making a phone call: we must know the other party's number, but because numbers are hard to remember, we usually save each number together with a name in the address book.

Likewise, there is a server dedicated to storing the mapping between web server domain names and IP addresses: the DNS server.

Domain name hierarchy

Domain names in DNS are separated by periods, as in www.server.com, where each period marks the boundary between levels.

Within a domain name, the further right a label is, the higher its level. (This is a convention inherited from overseas usage.)

In fact, there is one more dot at the end of the domain name, as in www.server.com. with a trailing dot, and this last dot represents the root domain. That is, the root domain is at the top level, the .com top-level domain is the next level down, and server.com is below that.

Therefore, the hierarchical relationship of the domain name is similar to a tree structure:

  • Root DNS server (.)
  • Top-level domain DNS server (.com)
  • Authoritative DNS server (server.com)

The DNS server information of the root domain is stored in all DNS servers on the Internet.

In this way, any DNS server can find and access the root domain DNS server.

Therefore, as long as the client can reach any one DNS server, it can find the root DNS server through it and then follow the chain downward to the target DNS server at a lower level.
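The tree can be made concrete by listing the zones a name belongs to, from the root downward. This is only a sketch. The helper name `zone_chain` is made up for the example.

```python
def zone_chain(name):
    """Return the zones from the root down to the full name."""
    labels = name.rstrip(".").split(".")
    chain = ["."]  # the root domain
    # Walk the labels from right (highest level) to left (lowest level).
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain

print(zone_chain("www.server.com"))
```

For www.server.com the chain is the root, then com., then server.com., then the full name.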

Workflow of domain name resolution

  1. The client first sends a DNS request asking for the IP address of www.server.com to the local DNS server (that is, the DNS server address configured in the client's TCP/IP settings).
  2. When the local DNS server receives the request, it returns the IP address directly if www.server.com is in its cache. If not, the local DNS server asks a root name server. The root name server is the highest level. It does not resolve the domain name itself, but it can point the way.
  3. When the root DNS server receives the request from the local DNS server, it sees that the suffix is .com and says: "The domain name www.server.com is managed by the .com zone. Here is the address of the .com top-level domain server. Go and ask it."
  4. After the local DNS server receives the address of the top-level domain server, it sends a request to that server: "Can you tell me the IP address of www.server.com?"
  5. The top-level domain server says: "Here is the address of the authoritative DNS server responsible for the www.server.com zone. Go and ask it."
  6. The local DNS server then asks the authoritative DNS server: "What is the IP address of www.server.com?" The authoritative DNS server for server.com is the original source of the resolution result.
  7. The authoritative DNS server looks up the record and tells the local DNS server the corresponding IP address, X.X.X.X.
  8. The local DNS server returns the IP address to the client, and the client establishes a connection with the target.

The whole DNS resolution process resembles asking people for directions in daily life: each server points the way but does not lead you there.
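The referral chain in the steps above can be sketched as a small simulation with no real network traffic. Everything in the `SERVERS` table is hypothetical: the server names are labels invented for the example, and 192.0.2.10 is a placeholder address from a documentation range.

```python
# Hypothetical data: each server either answers or refers to a lower server.
SERVERS = {
    "root": {"referral": {"com.": "tld-com"}},
    "tld-com": {"referral": {"server.com.": "auth-server-com"}},
    "auth-server-com": {"answer": {"www.server.com.": "192.0.2.10"}},
}

def next_server(referrals, fqdn):
    """Pick the server responsible for a zone that the name falls under."""
    for zone, server in referrals.items():
        if fqdn.endswith("." + zone):
            return server
    raise LookupError(f"no referral for {fqdn}")

def resolve(name, cache):
    """Play the role of the local DNS server: cache first, then iterate."""
    fqdn = name.rstrip(".") + "."
    if fqdn in cache:
        return cache[fqdn], ["cache"]
    server, path = "root", []
    while True:
        path.append(server)
        record = SERVERS[server]
        if fqdn in record.get("answer", {}):
            cache[fqdn] = record["answer"][fqdn]
            return cache[fqdn], path
        server = next_server(record.get("referral", {}), fqdn)

cache = {}
print(resolve("www.server.com", cache))  # walks root, TLD, authoritative
print(resolve("www.server.com", cache))  # answered from the cache
```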

Does that mean there are so many steps to go through every time a domain name is resolved?

Of course not, there is also caching.

The browser first checks whether it has cached this domain name and, if so, returns the result directly. If not, it asks the operating system, which checks its own cache and returns the result if found. If not, the hosts file is consulted, and only if nothing is found there is the "local DNS server" queried.
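The lookup order described above can be written as a simple chain of fallbacks. This is a sketch under stated assumptions: the three tables are plain dictionaries, `ask_local_dns` is a hypothetical callback standing in for the real query, and writing the result back into both caches is an assumption of the example.

```python
def lookup(name, browser_cache, os_cache, hosts, ask_local_dns):
    """Try each local source in order before asking the local DNS server."""
    for source, table in (("browser cache", browser_cache),
                          ("OS cache", os_cache),
                          ("hosts file", hosts)):
        if name in table:
            return table[name], source
    ip = ask_local_dns(name)
    # Remember the answer so the next lookup stops at the first step.
    os_cache[name] = ip
    browser_cache[name] = ip
    return ip, "local DNS server"

browser, system, hosts = {}, {}, {"localhost": "127.0.0.1"}
print(lookup("www.server.com", browser, system, hosts, lambda n: "192.0.2.10"))
print(lookup("www.server.com", browser, system, hosts, lambda n: "192.0.2.10"))
```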


Protocol Stack - Guide Helper

After the IP is obtained through DNS, the HTTP transmission work can be handed over to the protocol stack in the operating system.

The protocol stack is divided internally into several parts, each with its own task. The parts are layered: an upper part delegates work to the part below it, and the lower part carries out the work it is given.

The application (the browser) delegates work to the protocol stack by calling the socket library. The upper half of the protocol stack contains the TCP and UDP protocols, which send and receive data on behalf of the application layer.

The lower half of the protocol stack uses the IP protocol to control the sending and receiving of network packets. When data is transmitted over the Internet, it is cut into network packets, and IP is in charge of delivering those packets to the other party.

In addition, IP also includes the ICMP and ARP protocols:

  • ICMP is used to notify errors and various control information generated during network packet transmission
  • ARP is used to query the corresponding Ethernet MAC address according to the IP address.

The network card driver below IP controls the network card hardware, and the network card at the very bottom performs the actual sending and receiving, that is, it sends and receives the signals on the network cable.


TCP - reliable transport

TCP packet header format:

First come the source port number and the destination port number, both indispensable: without them, there would be no way to know which application the data should be delivered to.

Next is the sequence number of the segment, which solves the problem of out-of-order delivery.

The acknowledgment number confirms whether the other party has received the data. Data that is not acknowledged is retransmitted until it is delivered, which solves the problem of packet loss.

Then there are some status bits. For example, SYN initiates a connection, ACK acknowledges, RST resets the connection, and FIN ends the connection. TCP is connection-oriented, so both parties maintain connection state, and sending segments with these flags changes the state on both sides.

Another important field is the window size. TCP performs flow control: each party advertises a window (buffer size) that reflects its current processing capacity, so that the sender transmits neither too fast nor too slow.

In addition to flow control, TCP also performs congestion control.
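To make the header fields concrete, here is a sketch that packs a 20-byte TCP header without options using Python's `struct` module. The function name `build_tcp_header` and the sample values are assumptions. The checksum is left at zero, whereas a real protocol stack computes it.

```python
import struct

FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "ACK": 0x10}

def build_tcp_header(src_port, dst_port, seq, ack, flags, window):
    """Pack the fixed 20-byte TCP header (no options)."""
    flag_bits = 0
    for name in flags:
        flag_bits |= FLAGS[name]
    # Upper 4 bits: header length in 32-bit words (5 words = 20 bytes).
    offset_and_flags = (5 << 12) | flag_bits
    checksum = 0  # assumption: not computed in this sketch
    urgent_pointer = 0
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, checksum, urgent_pointer)

syn = build_tcp_header(50000, 80, seq=1000, ack=0, flags=["SYN"], window=65535)
print(len(syn), syn.hex())
```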

Before TCP transmits data, a three-way handshake is required to establish a connection

Before HTTP transmits data, TCP first needs to establish a connection, and the establishment of a TCP connection is usually called a three-way handshake .

This so-called connection is just a state machine maintained in the computers of both parties. During the process of establishing a connection, the sequence diagram of the state changes of both parties is as follows:

  • At the beginning, both the client and the server are in the CLOSED state. The server first listens on a port and enters the LISTEN state.
  • The client then actively initiates the connection by sending a SYN and enters the SYN-SENT state.
  • The server receives the connection request, returns its own SYN together with an ACK of the client's SYN, and enters the SYN-RCVD state.
  • When the client receives the server's SYN and ACK, it sends an ACK for the server's SYN and enters the ESTABLISHED state, because it has successfully sent and received.
  • When the server receives the ACK for its SYN, it also enters the ESTABLISHED state, because it too has successfully sent and received.

The purpose of the three-way handshake is to ensure that both parties are able to send and receive.
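Since the connection is only a state machine on each side, the handshake can be sketched as a table of transitions. The event strings below are labels invented for the example, and only the states named above are modeled.

```python
# (current state, event) -> next state, covering only the handshake.
TRANSITIONS = {
    ("CLOSED", "listen"): "LISTEN",
    ("CLOSED", "send SYN"): "SYN-SENT",
    ("LISTEN", "recv SYN, send SYN+ACK"): "SYN-RCVD",
    ("SYN-SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("SYN-RCVD", "recv ACK"): "ESTABLISHED",
}

def step(state, event):
    """Apply one event; an unexpected event raises KeyError."""
    return TRANSITIONS[(state, event)]

def handshake():
    client = server = "CLOSED"
    server = step(server, "listen")
    client = step(client, "send SYN")                # first packet
    server = step(server, "recv SYN, send SYN+ACK")  # second packet
    client = step(client, "recv SYN+ACK, send ACK")  # third packet
    server = step(server, "recv ACK")
    return client, server

print(handshake())
```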

TCP split data

If the HTTP request message is long and exceeds the MSS, TCP must split the HTTP data into chunks and send them separately instead of sending all the data at once.

  • MTU: the maximum length of a network packet, generally 1500 bytes on Ethernet.
  • MSS: the maximum length of TCP data that fits in one network packet after the IP and TCP headers are removed.

The data is split into MSS-sized pieces, and each piece goes into its own network packet: a TCP header is added to each piece, which is then handed to the IP module to be sent.
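A short worked example of the arithmetic, assuming an Ethernet MTU of 1500 bytes and 20-byte IP and TCP headers without options. The function name `segment` is illustrative.

```python
MTU = 1500         # typical Ethernet MTU
IP_HEADER = 20     # assumption: IPv4 header without options
TCP_HEADER = 20    # assumption: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER

def segment(data, mss=MSS):
    """Cut application data into pieces of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

chunks = segment(b"x" * 3000)
print(MSS, [len(c) for c in chunks])
```

With these assumptions the MSS is 1460 bytes, so 3000 bytes of HTTP data become three pieces of 1460, 1460 and 80 bytes.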

 TCP packet generation

A TCP segment carries two ports: the port used by the browser (usually randomly assigned) and the port the web server listens on (the default port is 80 for HTTP and 443 for HTTPS).

After the two parties establish a connection, the data part of the TCP segment holds the HTTP header and data. Once the TCP segment is assembled, it is handed to the network layer below for processing.


IP-remote location

When the TCP module performs various stages of operations such as connection, sending and receiving, and disconnection, it needs to entrust the IP module to encapsulate data into network packets and send them to the communication object.

IP header format

In the IP header, a source IP address and a destination IP address are required:

  • Source IP address: the client's own IP address.
  • Destination IP address: the IP address of the web server obtained through DNS resolution.

Because HTTP is carried over TCP, the protocol number in the IP header is set to 06 (hexadecimal), indicating TCP.
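The following sketch packs a 20-byte IPv4 header without options to show where the two addresses and the protocol number sit. The function names, the TTL value and the sample addresses are assumptions for illustration.

```python
import socket
import struct

def checksum(data):
    """16-bit ones' complement sum used for the IPv4 header checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ip_header(src, dst, payload_len, protocol=0x06, ttl=64):
    """Pack an IPv4 header; protocol 0x06 means the payload is TCP."""
    fields = [
        (4 << 4) | 5,       # version 4, header length 5 words
        0,                  # type of service
        20 + payload_len,   # total length
        0, 0,               # identification, flags and fragment offset
        ttl, protocol,
        0,                  # checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    ]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = checksum(header)
    return struct.pack("!BBHHHBBH4s4s", *fields)

header = build_ip_header("10.10.1.101", "192.168.1.100", payload_len=20)
print(len(header), header.hex())
```

A receiver can verify the header by running the same checksum over all 20 bytes, which gives zero when nothing was altered.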

 

IP packet generation

So far, the message of the network packet is as follows:


 

MAC-Two point transmission

After the IP header is generated, the network packet next needs a MAC header added in front of the IP header.

MAC header format

The MAC header is the header used by Ethernet, and it contains information such as the MAC addresses of the receiver and the sender.

The MAC header needs the sender's MAC address and the receiver's target MAC address, which are used for transmission between two points.

Generally, in TCP/IP communication, the protocol type of the MAC header only uses:

  • 0800 : IP protocol
  • 0806 : ARP protocol
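A sketch of the 14-byte MAC header, packing the receiver address, the sender address and the protocol type in that order. The helper names and the sender address are made up for the example.

```python
import struct

ETHERTYPE_IP = 0x0800
ETHERTYPE_ARP = 0x0806

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' or 'aa-bb-cc-dd-ee-ff' to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))

def build_mac_header(dst_mac, src_mac, ethertype=ETHERTYPE_IP):
    """Pack receiver MAC, sender MAC and protocol type (14 bytes)."""
    return struct.pack("!6s6sH", mac_to_bytes(dst_mac),
                       mac_to_bytes(src_mac), ethertype)

header = build_mac_header("00-02-B3-1C-9C-F9", "00:11:22:33:44:55")
print(len(header), header.hex())
```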

How are the sender and receiver MAC addresses determined?

The sender's MAC address is simple to obtain. It is written into ROM when the network card is manufactured, so it only needs to be read out and written into the MAC header.

The receiver's MAC address is a little more involved. Ethernet delivers a packet to whichever MAC address it is given, so the MAC address of the next recipient must be filled in here.

So we first have to work out who the packet should be sent to, and the routing table answers that: find the matching entry in the routing table, then send the packet to the IP address in its Gateway column.

Now that you know who to send it to, how do you get the MAC address of the other party?

This is where the ARP protocol comes in, to find the MAC address of the router.

ARP broadcasts to every device on the Ethernet: "Who has this IP address? Please tell me your MAC address."

A device on the subnet that recognizes the IP address as its own replies with its MAC address. That MAC address is then written into the MAC header, which completes it.

The operating system then stores the result of the query in a memory area called the ARP cache for later use, although entries are kept for only a few minutes.

That is, when sending a packet:

  • The ARP cache is queried first. If the other party's MAC address is already stored there, no ARP query is sent and the cached address is used directly.
  • If the other party's MAC address is not in the ARP cache, an ARP broadcast query is sent.
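The two cases above can be sketched as a small cache with an expiry time. The class name `ArpCache`, the 120-second lifetime and the `broadcast_query` callback are all assumptions. The callback stands in for the real broadcast on the Ethernet.

```python
import time

class ArpCache:
    """IP-to-MAC cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # ip -> (mac, time the entry was stored)

    def resolve(self, ip, broadcast_query):
        now = self.clock()
        entry = self.entries.get(ip)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]          # cache hit: no ARP query is sent
        mac = broadcast_query(ip)    # cache miss or expired: ask the network
        self.entries[ip] = (mac, now)
        return mac

cache = ArpCache()
print(cache.resolve("192.168.1.1", lambda ip: "00:11:22:33:44:55"))
```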

View ARP cache content

In the Linux system, we can use the arp -a command to view the contents of the ARP cache.

MAC message generation


NIC-Export

A network packet is just a string of binary digits stored in memory and cannot be sent to the other party as it is. The digital information must first be converted into electrical signals before it can travel over the network cable, and this is where the data is actually sent.

The network card performs this operation, and a network card driver is needed to control the network card.

After the network card driver obtains the network packet, it copies it to a buffer in the network card, then adds a preamble and a start frame delimiter at the beginning and a frame check sequence for error detection at the end.

  • The start frame delimiter is a marker that indicates the beginning of the packet.
  • The FCS (frame check sequence) at the end is used to check whether the packet was damaged in transit.

Finally, the network card converts the packet into electrical signals and sends it out over the network cable.
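The idea behind the FCS can be sketched with a 32-bit CRC. This example uses `zlib.crc32` from the Python standard library as a stand-in for the hardware calculation, and the byte order used to append the value is an assumption of the sketch. The function names are illustrative.

```python
import struct
import zlib

def add_fcs(frame):
    """Append a 4-byte CRC-32 check value to the frame."""
    return frame + struct.pack("<I", zlib.crc32(frame))

def fcs_ok(frame_with_fcs):
    """Recompute the CRC and compare it with the trailing 4 bytes."""
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(frame)) == fcs

sent = add_fcs(b"example frame contents")
damaged = bytes([sent[0] ^ 0x01]) + sent[1:]  # flip one bit in transit
print(fcs_ok(sent), fcs_ok(damaged))
```

The receiver repeats the calculation and discards the frame when the values differ.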


Switch - Farewell

Switches are designed to forward network packets to their destinations as-is. The switch works at the MAC layer and is also known as a Layer 2 network device.

Packet receiving operation of the switch

First, the electrical signal arrives at the network cable port, where a module in the switch receives it and converts it into a digital signal.

The switch then checks for errors using the FCS at the end of the packet and, if there is no problem, places the packet in a buffer. This part is essentially the same as in a computer's network card, but from here on the switch works differently.

A computer's network card has its own MAC address and checks the receiver MAC address of each incoming packet to decide whether the packet is meant for it, discarding the packet if not. A switch port, in contrast, does not check the receiver MAC address. It accepts every packet and stores it in the buffer. Therefore, unlike network cards, switch ports do not have MAC addresses.

After the packet is stored in the buffer, it is necessary to check whether the MAC address of the recipient of the packet has been recorded in the MAC address table.

The MAC address table of the switch mainly contains two pieces of information:

  • One is the MAC address of the device
  • The other is which port on the switch the device is connected to

If the receiver MAC address of the received packet is 00-02-B3-1C-9C-F9, it matches the third row of the table in the figure. The port column shows that this address is on port 3, so the packet is sent to that port through the switching circuit.

In short, the switch looks up the receiver MAC address in the MAC address table and sends the signal to the corresponding port.

What happens when the MAC address table cannot find the specified MAC address?

The specified MAC address was not found in the address table. This may be because the device with the address has not sent a packet to the switch, or the device has not been working for a while and the address has been deleted from the address table.

In this case, the switch cannot tell which port the packet should go to, so it forwards the packet to every port except the source port. Whichever port the device is connected to, it will receive the packet.

This causes no problem, because Ethernet was designed to send packets across the whole network, with only the intended recipient accepting a packet while other devices ignore it.

Does sending these redundant packets cause network congestion?

There is no need to worry. The target device responds after receiving the packet, and as soon as the response comes back, the switch can write the device's address into the MAC address table, so the next packet no longer has to be sent to every port.

In addition, if the receiver MAC address is a broadcast address, the switch sends the packet to all ports except the source port. The following two are broadcast addresses:

  • FF:FF:FF:FF:FF:FF in MAC address
  • 255.255.255.255 in the IP address
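The switch behavior described in this section (table lookup, flooding on a miss, learning from returning traffic, and broadcast) can be sketched as follows. The class name `Switch` and the short MAC strings in the usage lines are illustrative. A real switch also ages out entries, which is omitted here.

```python
BROADCAST = "FF:FF:FF:FF:FF:FF"

class Switch:
    """Toy switch that learns which port each sender MAC address is on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the packet is forwarded to."""
        self.mac_table[src_mac] = in_port  # learn the sender's port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown or broadcast receiver: every port except the source.
            return [p for p in self.ports if p != in_port]
        return [out_port]

sw = Switch([1, 2, 3])
print(sw.receive(1, "MAC-A", "MAC-B"))  # MAC-B unknown: flooded
print(sw.receive(3, "MAC-B", "MAC-A"))  # reply: MAC-A already learned
print(sw.receive(1, "MAC-A", "MAC-B"))  # MAC-B now known: single port
```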

Router - Exit Gate

The difference between a router and a switch

After the network packet passes through the switch, it reaches the router, where it is forwarded to the next router or destination device.

The working principle of forwarding in this step is similar to that of a switch, and it also judges the destination of packet forwarding by looking up the table.

However, in the specific operation process, there is a difference between a switch and a router.

  • The router is designed around IP and is commonly called a Layer 3 network device. Each port of a router has both a MAC address and an IP address.
  • The switch is designed around Ethernet and is commonly called a Layer 2 network device. The ports of a switch do not have MAC addresses.

Fundamentals of routers

A router port has a MAC address, so it can act as an Ethernet sender and receiver, and it also has an IP address. In this sense, it is the same as a computer's network card.

When forwarding a packet, the router port first receives the Ethernet packet addressed to it, the router then looks up the forwarding target in the routing table, and finally the corresponding port acts as the sender and sends the Ethernet packet out.

The packet receiving operation of the router

First, the electrical signal arrives at the network cable port, and a module in the router converts it into a digital signal and then checks for errors using the FCS at the end of the packet.

If there is no problem, check the receiver's MAC address in the MAC header to see if it is a packet sent to itself. If it is, put it in the receiving buffer, otherwise discard the packet.

Generally speaking, the ports of routers all have MAC addresses, and only receive packets that match their own addresses, and discard packets that do not match.

Query the routing table to determine the output port

After completing the packet receiving operation, the router will remove the MAC header at the beginning of the packet.

The job of the MAC header is to deliver the packet to the router, and its receiver MAC address is the MAC address of the router port. Once the packet has reached the router, the MAC header has done its job and is discarded.

Next, the router forwards the packet according to the contents of the IP header that follows the MAC header.

The forwarding operation is divided into several stages. The first is to query the routing table to determine the forwarding target.

The workflow is illustrated using the figure above as an example.

Suppose the computer with the address 10.10.1.101 wants to send a packet to the server with the address 192.168.1.100. The packet first arrives at the router in the figure.

The first step in judging the forwarding target is to query the target address column in the routing table according to the IP address of the receiver of the packet to find a matching record.

Route matching works as described above: the IP address 192.168.1.100 is ANDed (&) with the subnet mask of each entry, and the result is compared with that entry's target address. If they match, the entry becomes a candidate forwarding target. If not, matching continues with the next entry.

For example, ANDing the subnet mask 255.255.255.0 of the second entry with the IP address 192.168.1.100 gives 192.168.1.0, which matches the target address 192.168.1.0 of the second entry, so the second entry is used as the forwarding target.

When no matching route can be found, the default route is selected. The entry in the routing table whose subnet mask is 0.0.0.0 is the "default route".
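The AND-and-compare matching can be sketched with the standard `ipaddress` module. The `ROUTES` table below is hypothetical: the first two targets follow the addresses in the example, while the default gateway 192.168.2.1 and the interface names are invented. Preferring the entry with the longest mask when several match is common practice rather than something stated in this article.

```python
import ipaddress

# Hypothetical routing table: (target, subnet mask, gateway, interface).
ROUTES = [
    ("10.10.1.0", "255.255.255.0", None, "eth0"),
    ("192.168.1.0", "255.255.255.0", None, "eth1"),
    ("0.0.0.0", "0.0.0.0", "192.168.2.1", "eth2"),  # default route
]

def to_int(addr):
    return int(ipaddress.IPv4Address(addr))

def lookup_route(dst_ip, routes=ROUTES):
    """AND the destination with each mask and compare with the target."""
    best = None
    for target, mask, gateway, iface in routes:
        if (to_int(dst_ip) & to_int(mask)) == to_int(target):
            # Assumption: among matching entries, the longest mask wins.
            if best is None or to_int(mask) > to_int(best[1]):
                best = (target, mask, gateway, iface)
    return best

print(lookup_route("192.168.1.100"))  # matches the second entry
print(lookup_route("8.8.8.8"))        # falls through to the default route
```

The mask 0.0.0.0 matches every address, which is why the default route is only chosen when nothing more specific applies.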

Router's send operation

Next comes the packet sending operation.

First, the address of the next recipient is determined from the Gateway column of the routing table.

  • If the gateway is an IP address, that IP address is the target to forward to. The packet has not yet reached its final destination and must be forwarded on by another router.
  • If the gateway is empty, the receiver's IP address in the IP header is the target to forward to. The destination address in the IP header has been found, meaning the final destination has been reached.

Once the IP address of the next recipient is known, the ARP protocol is used to look up the MAC address for that IP address, and the result is used as the receiver's MAC address.

The router also has an ARP cache, so it will first search in the ARP cache, and if it cannot find it, it will send an ARP query request.

Next is the sender MAC address field, which is filled with the MAC address of the output port. There is also an EtherType field, which is filled with 0800 (hexadecimal) to indicate the IP protocol.

Once the network packet is complete, it is converted into an electrical signal and sent out through the port. The working process of this step is also the same as that of the computer.

The network packet that is sent out reaches the next router through the switch. Because the receiver MAC address is the address of the next router, the switch delivers the packet to the next router according to that address.

That router in turn forwards the packet to the next one, and after hop-by-hop forwarding the packet reaches its final destination.

PS: While a network packet is in transit, the source IP and destination IP never change, but the MAC addresses change at every hop, because MAC addresses are needed to carry the packet between two devices on an Ethernet.


Server and client - peeling each other's packets

After the data packet arrives at the server, the server first peels off the MAC header and checks whether it matches the server's own MAC address. If it matches, the server accepts the packet.

It then peels off the IP header and finds that the IP address matches. The protocol field in the IP header shows that the upper layer is TCP.

So it peels off the TCP header, which contains a sequence number. The server checks whether this is the segment it is expecting: if so, it places the data in its buffer and returns an ACK, and if not, it discards the segment. The TCP header also contains a port number, and the HTTP server is listening on that port.

The server therefore knows that the packet is meant for the HTTP process and hands it over.

The server's HTTP process sees that the request is for a page, so it wraps the web page in an HTTP response message.

The HTTP response message must also be dressed in TCP, IP, and MAC headers, but this time the source address is the server's IP address and the destination address is the client's IP address.

Once dressed in its headers, the response leaves through the network card, is forwarded by the switch to the router that leads out of the city, and that router sends the response packet to the next router, hopping along like this from one to the next.

Finally, it reaches the router that guards the client's city gate. That router examines the IP header, finds that the packet is looking for someone inside the city, and sends it to the switch in the city, which forwards it to the client.


Origin blog.csdn.net/qq_48626761/article/details/132054992