Linux C System Programming (14): Network Programming Fundamentals

1 Socket concept

Linux uses sockets for communication between processes. With sockets, the location of the other process is transparent to the application. A socket represents a communication endpoint, so each of the two endpoints must have its own socket. The communication process of a socket is as follows:
 

Sockets provide a layer of abstraction that lets users work with a connection as if it were a file. The abstraction is as follows:


2 Preparation

2.1 Endianness

In a network environment, inter-process communication crosses hosts, and different hosts may use different byte orders. To solve this problem, the network protocol defines a single byte order. When two processes on different hosts communicate, the sender first converts the data to network byte order, and the receiver converts the received data to the byte order of its own machine. The byte order conversion process is as follows:

Linux provides four functions for converting byte order. The function prototypes are as follows:

#include <arpa/inet.h>
uint32_t htonl(uint32_t hostint32);
uint16_t htons(uint16_t hostint16);
uint32_t ntohl(uint32_t netint32);
uint16_t ntohs(uint16_t netint16);

See the Linux function reference manual for details. Network byte order is big-endian. Because host byte order varies from machine to machine, portable code should always perform the conversion, whatever the byte order of the host.
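The conversion can be observed directly. The following is a minimal sketch; the helper names host_is_little_endian and show_port_conversion are made up for illustration and are not part of any library.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if this host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    return *(unsigned char *)&probe == 1;
}

/* Prints a 16-bit port in host order, as bytes in network order,
 * and after converting it back to host order. */
void show_port_conversion(uint16_t host_port)
{
    uint16_t net_port = htons(host_port);
    const unsigned char *bytes = (const unsigned char *)&net_port;

    printf("host value: 0x%04x\n", (unsigned)host_port);
    printf("bytes in network order: %02x %02x\n", bytes[0], bytes[1]);
    printf("converted back: 0x%04x\n", (unsigned)ntohs(net_port));
}
```

Calling show_port_conversion(8888) from main prints the same byte sequence on every host, because htons always places the most significant byte first.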

2.2 Address format

Each computer in a network environment has an IP address. (For the IPv4 protocol it is a 32-bit unsigned integer; for the IPv6 protocol it is a 128-bit value.) Linux uses the in_addr structure to represent an IPv4 address. The structure is defined as follows:

#include <netinet/in.h>
struct in_addr{
     in_addr_t s_addr;     /* in_addr_t is defined as an unsigned 32-bit integer type */
};

Once the target machine is known, the port number determines which process on that host to communicate with (a communicating process is identified by a 16-bit port number). An IP address and a port number together therefore identify one process on one host. Once both endpoints have been identified, communication can begin. The address structures in Linux are defined as follows:

#include <netinet/in.h>

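struct sockaddr_in{
     sa_family_t sin_family;     /* 16-bit address family; depends on the kind of socket. For IPv4 network communication it is AF_INET */
     in_port_t sin_port;          /* 16-bit port number */
     struct in_addr sin_addr;     /* 32-bit IP address */
     unsigned char sin_zero[8];/* padding: 8 bytes set to 0 so that sockaddr_in and sockaddr can be converted to each other */
};

struct sockaddr{
     sa_family_t sa_family;     /* 16-bit address family; depends on the kind of socket. For IPv4 network communication it is AF_INET */
     char sa_data[14];          /* 14 bytes of address data; can be viewed as the sin_port, sin_addr and sin_zero members together */
};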

The sockaddr_in structure has the same length as sockaddr, so a pointer to one can be cast to a pointer to the other.

2.3 Address format conversion

The IP address is stored in the address structure in binary form, which is hard to read directly. Dotted decimal notation (xxx.xxx.xxx.xxx) is more intuitive. Linux provides the following IP address conversion functions:

#include <arpa/inet.h>
const char *inet_ntop(int af, const void *src,char *dst, socklen_t size);/* convert a binary address to dotted decimal text */
int inet_pton(int af, const char *src, void *dst);/* convert dotted decimal text to a binary address */

See the Linux function reference manual for details.
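As a minimal sketch of the two conversion functions, the helpers below fill an IPv4 address structure from text and turn it back into text. The names make_ipv4_addr and ipv4_to_text are hypothetical and exist only for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fills *out with an IPv4 address structure for ip:port.
 * Returns 0 on success, -1 if ip is not valid dotted decimal text. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));          /* also zeroes sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);           /* port in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}

/* Writes the dotted decimal form of addr into buf.
 * Returns buf on success or NULL on error (as inet_ntop does). */
const char *ipv4_to_text(const struct sockaddr_in *addr, char *buf,
                         socklen_t size)
{
    return inet_ntop(AF_INET, &addr->sin_addr, buf, size);
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address in text form.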

2.4 Obtaining host information

Host and network information is generally stored in files on the system (such as /etc/hosts), and the user can read it through system functions. In Linux, the gethostent function reads host-related information:

#include <netdb.h>
struct hostent *gethostent(void);     /* read the next entry from the file that holds host information */
void endhostent(void);               /* close the file that holds host information */

See the Linux function reference manual for details. The hostent structure is defined as follows:

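struct hostent {
    char *h_name;           /* official host name; each host has exactly one */
    char **h_aliases;     /* list of host aliases; there may be several, stored as a NULL-terminated array of strings */
    int h_addrtype;           /* IP address type: IPv4 or IPv6 */
    int h_length;           /* IP address length; 4 bytes for IPv4 */
    char **h_addr_list;      /* list of IP addresses; h_addr_list[0] is the host's IP address */
};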

Note: If you call gethostent twice, the second call overwrites the buffer that the pointer returned by the first call points to.
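The following sketch walks the host database entry by entry. The function name print_host_entries is made up for illustration; sethostent is the standard companion call that rewinds the database. What it prints depends entirely on the machine it runs on.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Prints every entry of the host database.
 * Returns the number of entries that were read. */
int print_host_entries(void)
{
    struct hostent *h;
    int count = 0;

    sethostent(0);                     /* rewind to the first entry */
    while ((h = gethostent()) != NULL) {
        char text[INET6_ADDRSTRLEN];

        printf("name: %s\n", h->h_name);
        for (char **alias = h->h_aliases; *alias != NULL; alias++)
            printf("  alias: %s\n", *alias);
        for (char **addr = h->h_addr_list; *addr != NULL; addr++) {
            if (inet_ntop(h->h_addrtype, *addr, text, sizeof(text)) != NULL)
                printf("  address: %s\n", text);
        }
        count++;   /* h must be fully used here: the next call overwrites it */
    }
    endhostent();                      /* close the database */
    return count;
}
```

Because each call reuses the same buffer, anything that must outlive the loop iteration has to be copied out of the hostent structure first.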

2.5 Address mapping

The user does not need to fill in the socket address structure by hand. It is enough to hand the system an address structure and let the system fill in the content. A server in a network environment needs a unique IP address and a host name (domain name). For most servers, the client does not know the IP address but does know the domain name, and DNS converts a domain name to an IP address. The conversion process is as follows:

The converted IP address and port number are stored in an addrinfo structure. Linux provides the getaddrinfo function, which obtains a server's IP address and port number from its domain name and service name and returns them in a sockaddr_in address structure. Internally the function queries the DNS server for the IP address of the host (the port number comes from the service name). The function prototype is as follows:

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
int getaddrinfo(const char *node, const char *service,const struct addrinfo *hints,struct addrinfo **res);

See the Linux function reference manual for details.
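A minimal sketch of a lookup with getaddrinfo follows. The wrapper name resolve_ipv4 is hypothetical, and restricting the hints to IPv4 and TCP is an assumption made to keep the example short.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolves node/service to the first IPv4 TCP address found.
 * Returns 0 on success or the getaddrinfo() error code. */
int resolve_ipv4(const char *node, const char *service,
                 struct sockaddr_in *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;         /* IPv4 results only */
    hints.ai_socktype = SOCK_STREAM;   /* TCP results only */

    rc = getaddrinfo(node, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return rc;
    }
    memcpy(out, res->ai_addr, sizeof(*out));   /* copy the first result */
    freeaddrinfo(res);                         /* the list is heap-allocated */
    return 0;
}
```

The node may be a domain name or a numeric address string, and the service may be a name such as "http" or a numeric port string such as "8888". The result list returned by getaddrinfo must always be released with freeaddrinfo.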


3 Socket basic programming

Socket technology hides most of the communication details, making operations similar to files. Because of this, many file manipulation functions can also be used on sockets. (Linux's strategy of abstracting devices into files makes programming much easier)

3.1 Create and destroy socket descriptors

The prototypes of the functions that create and close a socket in Linux are as follows:

#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
#include <unistd.h>
int close(int fd);               /* closing a socket is the same operation as closing a file */

See the Linux function reference manual for details.

3.2 Address binding

After creating a socket, you need to bind it to an address before it can be used to communicate. Linux uses the bind function to bind a socket to an address. The function prototype is as follows:

#include <sys/types.h>          /* See NOTES */
#include <sys/socket.h>
int bind(int sockfd, const struct sockaddr *addr,socklen_t addrlen);

See the Linux function reference manual for details. Note: The sockaddr_in structure describes IPv4 addresses only, so its address family cannot be set to AF_INET6. In practice the second parameter must be initialized before the call, as follows:

struct sockaddr_in *addr;
addr = (struct sockaddr_in *)malloc(sizeof(struct sockaddr_in));
memset(addr, 0, sizeof(struct sockaddr_in));   /* zero the structure, including sin_zero */
addr->sin_family = AF_INET;               /* use the IPv4 address family */
addr->sin_port = htons(8888);             /* port number in network byte order; usually above 1024, because only root can use ports below 1024. The port is often assigned by the system, since a given port may already be taken by another process */
addr->sin_addr.s_addr = 0x60ba8c0;        /* taken here as already being in network byte order; usually obtained through getaddrinfo. To accept packets arriving on any local address, set this to htonl(INADDR_ANY) */
bind(fd, (struct sockaddr *)addr, sizeof(struct sockaddr_in));
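The fragment above leaves out error handling. A fuller sketch might look like the following; the function names bind_tcp_socket and bound_port are hypothetical. Passing port 0 asks the kernel to pick a free port, which getsockname then reports.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket bound to ip:port (port 0 lets the kernel choose).
 * Returns the descriptor, or -1 on error. */
int bind_tcp_socket(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                     /* ip was not dotted decimal text */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return -1;
    }
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns the local port the socket is bound to, or -1 on error. */
int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Keeping the address structure on the stack avoids the malloc call and the matching free.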

3.3 Establish a connection

After binding a socket, the client can establish a connection. For connection-oriented socket types a connection must be established; for connectionless services this step is not necessary. In Linux, the connect function actively establishes a connection. The function prototype is as follows:

#include <sys/types.h>          /* See NOTES */
#include <sys/socket.h>
int connect(int sockfd, const struct sockaddr *addr,socklen_t addrlen);

See the Linux function reference manual for details. Note: A network application must be able to handle errors that occur while connecting. A connection can fail for many reasons, and when it does the application should consider retrying, generally after a delay so that the network has time to recover.

The mechanism of using the connect function is shown in the figure:

When a client establishes a connection, the server must listen for the request, accept it, and then process it. In Linux, the listen function makes a socket listen for client connection requests, and the accept function accepts one request. The function prototypes are as follows:

#include <sys/types.h>
#include <sys/socket.h>
int listen(int sockfd, int backlog);
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

See the Linux function reference manual for details. Note: Socket descriptors do not support repositioning, so the lseek function cannot be used on them.

3.4 Use file read and write functions to read and write sockets

Using the read/write functions on a network is prone to problems for the following reasons:

  1. Delay: For local files, the delay of transferring a byte stream can be ignored, but transmission over a network may take a long time and cause the I/O call to block. The only solutions are non-blocking I/O and I/O multiplexing.
  2. Network applications must be able to handle read and write calls that return abnormally because of interruption or network connection problems, which makes the program more complicated and harder to control.

Note: In a network environment, errors reported by the close function are not caused by the file itself but by "slow output". The write function only places the data in a buffer, and actually delivering it takes time. For local files this almost never fails, but in a network environment the probability of errors is high. Therefore, in a network environment, a successful call to write does not guarantee that the data has reached the peer.

3.5 Connection-oriented data transmission

Network communication with the read/write functions is error-prone, but Linux provides two functions specifically for connection-oriented sockets: send and recv. Their function prototypes are as follows:

#include <sys/types.h>
#include <sys/socket.h>
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
ssize_t recv(int sockfd, void *buf, size_t len, int flags);

See the Linux function reference manual for details.

3.6 The simplest connection-oriented server and client process

(1) The server-side execution process (pseudo code) is as follows:

// initialize the address structure
fd=socket();
bind(fd,...);
listen(fd,...);
while(1){
     accept_fd=accept(fd,...);
     // interact with the client and handle its requests (recv/send)
     close(accept_fd);
}
close(fd);
// handle failure of the close function

(2) The client execution process (pseudo code) is as follows:

// initialize the address structure
fd=socket();
connect(fd,...);
// interact with the server: send messages to it / receive messages from it (send/recv)
close(fd);
// handle failure of the close function

Note:

  1. In practice, if you do not know the server's IP address, you can use the getaddrinfo function to convert the server's domain name to its IP address through the DNS server. If you do not know the domain name either, you cannot communicate.
  2. On a typical LAN, the server and the clients mostly belong to the same user group and their IP addresses are visible to each other. On the Internet, however, the server's IP address is often hidden from the client.
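The server and client flows above can be exercised together in a single process over the loopback interface. This is a sketch, not a real server: the name tcp_loopback_demo is made up, both roles run in one function, and it relies on the kernel queuing the client's connection and data until accept is called.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends msg from a client socket to a server socket on 127.0.0.1 and
 * stores what the server received in reply (NUL-terminated).
 * Returns the number of bytes the server received, or -1 on error. */
ssize_t tcp_loopback_demo(const char *msg, char *reply, size_t reply_size)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listen_fd = -1, client_fd = -1, accept_fd = -1;
    ssize_t n = -1;

    /* Server: socket, bind, listen. Port 0 lets the kernel pick a port. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd < 0 ||
        bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listen_fd, 5) < 0 ||
        getsockname(listen_fd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* Client: socket, connect, send. */
    client_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (client_fd < 0 ||
        connect(client_fd, (struct sockaddr *)&addr, len) < 0 ||
        send(client_fd, msg, strlen(msg), 0) < 0)
        goto done;

    /* Server: accept the queued connection and receive the message. */
    accept_fd = accept(listen_fd, NULL, NULL);
    if (accept_fd < 0)
        goto done;
    n = recv(accept_fd, reply, reply_size - 1, 0);
    if (n >= 0)
        reply[n] = '\0';

done:
    if (n < 0)
        perror("tcp_loopback_demo");
    if (accept_fd >= 0)
        close(accept_fd);
    if (client_fd >= 0)
        close(client_fd);
    if (listen_fd >= 0)
        close(listen_fd);
    return n;
}
```

In a real application the two halves run in separate processes, usually on separate hosts, and the server wraps accept in a loop as in the pseudo code. A single recv call may also return only part of a longer message, so real code normally loops until the whole message has arrived.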

3.7 Connectionless data transmission

The read and write functions for connectionless sockets are a bit more complicated. Because no connection is established, the destination address must be given explicitly every time a data packet is sent, and when a packet is received the receiving process can obtain the address of the sender. Linux provides functions specifically for reading and writing connectionless sockets: sendto and recvfrom. The function prototypes are as follows:

#include <sys/types.h>
#include <sys/socket.h>
ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,const struct sockaddr *dest_addr, socklen_t addrlen);
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,struct sockaddr *src_addr, socklen_t *addrlen);

See the Linux function reference manual for details.

3.8 The simplest connectionless server and client process

(1) The server-side execution process (pseudo code) is as follows:

// initialize the address structure
fd=socket();
bind(fd,...);
while(1){
     // interact with the client and handle its requests (recvfrom/sendto)
}
close(fd);
// handle failure of the close function

(2) The client execution process (pseudo code) is as follows:

// initialize the address structure
fd=socket();
// interact with the server: send messages to it / receive messages from it (sendto/recvfrom)
close(fd);
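The connectionless flow can be tried in one process as well. This is a sketch with a hypothetical function name, udp_loopback_demo. It sends one datagram over the loopback interface and reports the sender address that recvfrom fills in.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends msg as one datagram to a server socket on 127.0.0.1.
 * Stores the payload in reply (NUL-terminated) and the sender's address
 * in *sender. Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback_demo(const char *msg, char *reply, size_t reply_size,
                          struct sockaddr_in *sender)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    socklen_t sender_len = sizeof(*sender);
    int server_fd = -1, client_fd = -1;
    ssize_t n = -1;

    /* Server: socket and bind only; there is no listen or accept. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);          /* kernel picks a free port */
    server_fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (server_fd < 0 ||
        bind(server_fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(server_fd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* Client: no connect; the destination is passed with every sendto. */
    client_fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (client_fd < 0 ||
        sendto(client_fd, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, len) < 0)
        goto done;

    /* Server: receive the datagram and learn who sent it. */
    n = recvfrom(server_fd, reply, reply_size - 1, 0,
                 (struct sockaddr *)sender, &sender_len);
    if (n >= 0)
        reply[n] = '\0';

done:
    if (n < 0)
        perror("udp_loopback_demo");
    if (client_fd >= 0)
        close(client_fd);
    if (server_fd >= 0)
        close(server_fd);
    return n;
}
```

A server that wants to answer can pass the address stored in sender straight back to sendto.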

4 Non-blocking sockets

When a process reads from or writes to a socket whose data is not ready, the call blocks: the process goes to sleep and waits, and subsequent operations cannot proceed. Non-blocking I/O solves this problem. Because a socket is a special kind of file, you can change its blocking behavior the same way you would for a file. The execution flow (pseudo code) on the server side is as follows:

// initialize the address structure
fd=socket();
bind(fd,...);
     // three control statements added to the previous server flow
     flag=fcntl(fd,F_GETFL);
     flag|=O_NONBLOCK;
     fcntl(fd,F_SETFL,flag);
listen(fd,...);
while(1){
     accept_fd=accept(fd,...);
     if(accept_fd<0) continue;     // no client is waiting (EAGAIN/EWOULDBLOCK): do other work, then retry
     // interact with the client and handle its requests (recv/send)
     close(accept_fd);
}
close(fd);
// handle failure of the close function

The client of a non-blocking network application can be the same as before, or it can be made to block on input so that the behavior of the server can be verified.
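The effect of O_NONBLOCK on accept can be checked with a small sketch. The function names and the NO_CLIENT_YET value are made up for this example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NO_CLIENT_YET (-2)    /* hypothetical return code for this sketch */

/* Adds O_NONBLOCK to the descriptor's status flags. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Creates a non-blocking listening socket on 127.0.0.1 with a
 * kernel-chosen port. Returns the descriptor, or -1 on error. */
int make_nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        set_nonblocking(fd) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Polls the listener once. Returns a connected descriptor,
 * NO_CLIENT_YET if no client is waiting, or -1 on any other error. */
int try_accept(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);

    if (fd >= 0)
        return fd;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return NO_CLIENT_YET;          /* would have blocked */
    return -1;
}
```

A server loop built on try_accept can do other work whenever NO_CLIENT_YET comes back. Calling it in a tight loop wastes CPU time, which is why non-blocking sockets are usually combined with I/O multiplexing.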

Origin blog.csdn.net/vviccc/article/details/105174823