The basic principles of CGI and its low-level implementation

Historical origin:
Early Web servers could only answer HTTP requests for static resources, returning files stored on the server to the browser. As Web technology developed, dynamic technologies gradually appeared, but Web servers could not run dynamic scripts directly. CGI (Common Gateway Interface) was introduced to let Web servers exchange data with external applications (CGI programs). Put simply, you can think of CGI as a convention for "communication" between a Web server and the applications running on it.

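Simply put, a CGI program is an external extension program that computes dynamic pages on behalf of the web server. (In classic CGI the web server passes data to the program through environment variables and the standard streams; in the socket-based setups described later, such as FastCGI, the exchange runs over a TCP connection.)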

Basic principle:
The basic idea of CGI is to redirect standard input (STDIN), standard output (STDOUT), and standard error (STDERR) to the connection between the web server and the external CGI program. The program then reads the data supplied by the web server directly from standard input and from its process environment variables, and writes its results to stdout and stderr. (CGI is a protocol between a web server and an independent process: the headers of the HTTP request are placed in the environment variables of the process, the body of the HTTP request becomes its standard input, and its standard output becomes the HTTP response, including header and body.)

Implementation idea:
On Linux, the two APIs dup and dup2 copy file descriptors, and that is how the redirection is done. Only stdout and environment variables are covered here; stdin and stderr work on the same principle, so they are not repeated:

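#include <unistd.h>
int dup(int oldfd);
int dup2(int oldfd, int newfd);
When dup is called, the kernel creates a new file descriptor in the process. It is the lowest-numbered descriptor currently available, and it refers to the same file table entry as oldfd.
dup2 differs from dup in that the newfd argument chooses the number of the new descriptor. If newfd is already open, it is closed first. If newfd equals oldfd, dup2 returns newfd without closing it. The descriptor returned by dup2 also shares its file table entry with oldfd.
APUE describes the same behavior another way:
Calling dup(oldfd) is equivalent to fcntl(oldfd, F_DUPFD, 0).
Calling dup2(oldfd, newfd) is equivalent to close(newfd) followed by fcntl(oldfd, F_DUPFD, newfd), except that dup2 performs the two steps as one atomic operation.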
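Key server code:
 int connfd = accept(sock, (struct sockaddr*)&client, &len);
 if (connfd < 0)
    printf("errno is:%d\n", errno);
 else
 {
    close(STDOUT_FILENO); // with fd 1 closed, dup returns the lowest free descriptor, i.e. STDOUT_FILENO (stdin is still open)
    dup(connfd);          // stdout now refers to the client connection
    printf("test cgi ret!\n");
    fflush(stdout);       // stdout is no longer a terminal, so push the buffered text to the socket
 }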

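Client code:

 if (connect(sock, (struct sockaddr*)&client, sizeof(client)) != -1)
 {
   char buf[30];
   printf("connect successful!\n");
   ssize_t n = recv(sock, buf, sizeof(buf) - 1, 0); // leave room for the terminator
   if (n > 0)
   {
     buf[n] = '\0'; // recv does not NUL-terminate the data
     cout << buf << endl;
   }
 }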

Output:
connect successful!
test cgi ret!
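
The same redirection can be tried without a network connection. The following is only a minimal sketch, assuming a pipe stands in for the server's socket; the helper name capture_stdout is made up for this illustration and is not part of the original demo. It uses dup to save the real stdout, dup2 to point fd 1 at the pipe, and restores stdout afterwards.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: print msg with stdout temporarily redirected into a
 * pipe, then read the text back into out (NUL-terminated).
 * Returns the number of bytes captured, or -1 on error.
 * msg is assumed to be short enough to fit in the pipe buffer. */
ssize_t capture_stdout(const char *msg, char *out, size_t outsz)
{
    int fds[2];
    assert(msg != NULL && out != NULL && outsz > 0);

    if (pipe(fds) == -1)
        return -1;

    fflush(stdout);                     /* drain text buffered for the old target */
    int saved = dup(STDOUT_FILENO);     /* keep a copy of the real stdout */
    if (saved == -1 || dup2(fds[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                      /* fd 1 is now the only write end */

    printf("%s", msg);                  /* lands in the pipe, not on the terminal */
    fflush(stdout);                     /* stdio buffers when fd 1 is not a tty */

    dup2(saved, STDOUT_FILENO);         /* restore stdout; closes the pipe's write end */
    close(saved);

    ssize_t n = read(fds[0], out, outsz - 1);
    close(fds[0]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```

Calling capture_stdout("test cgi ret!\n", buf, sizeof buf) should leave the message in buf instead of on the terminal, which is the same effect the server code gets by making the socket its standard output.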

Another way for a CGI program to obtain input is through the process environment variables that the web server sets for the CGI process. Inside the CGI program, the value of each variable is read with the getenv function:

while (FCGI_Accept() >= 0)
{
    // To receive data, read from stdin; the data actually comes from Nginx.
    // To send data, write to stdout; the data actually goes to Nginx.
    printf("Content-type: text/html\r\n");
    printf("\r\n");
    printf("<title>Fast CGI Hello!</title>");
    printf("<h1>Fast CGI Hello!</h1>\n");
    // Number of requests handled so far (count must be declared outside the loop)
    printf("Request number %d <br>\n", ++count);
    // SERVER_NAME: host name of the server
    printf("SERVER_NAME:%s <br>\n", getenv("SERVER_NAME"));
    // SERVER_PORT: port of the server
    printf("SERVER_PORT:%s <br>\n", getenv("SERVER_PORT"));
    // QUERY_STRING: parameters sent by the client
    printf("QUERY_STRING:%s <br>\n", getenv("QUERY_STRING"));
}
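
One detail worth guarding against: getenv returns NULL when a variable is not set, and passing NULL to printf's %s is undefined behavior. The sketch below shows one possible way to handle that and to pull a single parameter out of QUERY_STRING. The helper names env_or and query_param are invented for this example, and the parser does no percent-decoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: like getenv, but never returns NULL. */
const char *env_or(const char *name, const char *fallback)
{
    const char *v = getenv(name);
    return v != NULL ? v : fallback;
}

/* Hypothetical helper: copy the value of key from a query string such as
 * "app=123&x=1" into out (truncated to fit, always NUL-terminated).
 * Returns 1 if the key was found, 0 otherwise. */
int query_param(const char *qs, const char *key, char *out, size_t outsz)
{
    assert(qs != NULL && key != NULL && out != NULL && outsz > 0);
    size_t klen = strlen(key);
    const char *p = qs;

    while (*p != '\0') {
        const char *end = strchr(p, '&');
        size_t seglen = end ? (size_t)(end - p) : strlen(p);

        /* A match needs "key=" at the start of this segment. */
        if (seglen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = seglen - klen - 1;
            if (vlen >= outsz)
                vlen = outsz - 1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += seglen;
        if (*p == '&')
            p++;
    }
    return 0;
}
```

Inside the accept loop this could be used as query_param(env_or("QUERY_STRING", ""), "app", buf, sizeof buf), so a request without a query string simply yields "not found" instead of a crash.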

Advantages and necessity:

Necessity: Early Web servers could only answer requests for static resources and could not compute pages dynamically. CGI solved the problem of dynamic HTTP pages by handing them to external extension programs.

Stability:
FastCGI runs the CGI programs in a separate process pool. If a single process dies, the system can simply discard it and assign a new process to run the logic.

Security:
FastCGI processes are completely independent of the host server, so however a FastCGI process crashes, it does not bring the server down.

Performance:
FastCGI separates the dynamic logic from the server and leaves the heavy I/O to the host server, which can then concentrate on I/O. For an ordinary dynamic page, logic processing may be only a small part of the work; static I/O, such as serving a large number of images, needs no logic program at all.

Extensibility:
FastCGI is a language-neutral technical standard, so handlers can be written in any language (PHP, Java, Python...).

Language independence:
CGI is not tied to any language. A CGI program can be written in any scripting language or any fully independent programming language, as long as that language runs on the system. Unix shell, Python, Ruby, PHP, Perl, Tcl, C/C++, and Visual Basic can all be used to write CGI programs.

The overall basic process of the browser requesting dynamic page data:

  1. The user's request is sent to the Web server over the Internet.
  2. The Web server receives the request and hands it to the CGI program.
  3. The CGI program returns its result to the Web server.
  4. The Web server sends the result back to the user.

Disadvantages of CGI: Each request is handled in its own process, so processes are created and destroyed constantly. This makes the web server inefficient and consumes a large amount of system resources.

FastCGI improvement:
FastCGI
FastCGI (Fast Common Gateway Interface) is an improvement on the Common Gateway Interface (CGI). It describes a standard for transferring data between the Web server and the application programs behind it. FastCGI aims to reduce the cost of the interaction between the Web server and CGI programs so that the server can handle more Web requests at the same time. Instead of creating a new process for each request, FastCGI uses persistent processes, each of which handles a series of requests. These processes are managed by the FastCGI process manager, not by the web server.

What is FastCGI?
FastCGI is a language-independent, scalable, open extension of CGI. Its main idea is to keep the CGI interpreter processes resident in memory, where they are managed and scheduled, which is why it achieves higher performance.

FastCGI workflow

  1. The FastCGI process manager is loaded when the Web server starts.
  2. The FastCGI process manager initializes itself, starts several CGI interpreter processes, and waits for connections from the Web server.
  3. When a client request reaches the Web server, the FastCGI process manager selects a CGI interpreter and connects to it.
  4. When the FastCGI child process finishes, it returns standard output and error messages to the Web server over the same connection.

FastCGI Protocol
In FastCGI, each HTTP request (or response) message is split into several records for transmission, and each record consists of a header and data (the body).
The benefits of using records for message passing:
• Data from several requests can be multiplexed over the same connection, so the application can use an event-driven or multi-threaded programming model to improve efficiency;
• Within one request, the data of several streams can be wrapped in different records and sent over the same connection. For example, the two output streams STDOUT and STDERR can be returned to the Web server over one connection instead of two.

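Record structure
typedef struct {
    /* Header */
    unsigned char version;          // FastCGI protocol version
    unsigned char type;             // type of this record
    unsigned char requestIdB1;      // id of the request this record belongs to, high byte
    unsigned char requestIdB0;      // request id, low byte (requests are identified by requestId)
    unsigned char contentLengthB1;  // length of the body data in this record, high byte
    unsigned char contentLengthB0;  // body data length, low byte
    unsigned char paddingLength;    // length of the padding that follows the body data
    unsigned char reserved;
    /* Body (the variable-length arrays are descriptive notation, not compilable C) */
    unsigned char contentData[contentLength];   // body data
    unsigned char paddingData[paddingLength];   // padding bytes
} FCGI_Record;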

FastCGI record types
FastCGI is a binary stream protocol. It defines a uniform message header that tells the receiver how to read each message body, which makes it easy to split the stream into messages. In general, an FCGI_BEGIN_REQUEST message is sent first, followed by FCGI_PARAMS and FCGI_STDIN messages. Once the FastCGI program has processed the request, it sends FCGI_STDOUT and FCGI_STDERR messages, and finally FCGI_END_REQUEST to mark the end of the request. FCGI_BEGIN_REQUEST and FCGI_END_REQUEST mark the beginning and end of a request respectively and frame the whole exchange.

#define FCGI_BEGIN_REQUEST 1      // (web->fastcgi) start of a request
#define FCGI_ABORT_REQUEST 2      // (web->fastcgi) abort a request
#define FCGI_END_REQUEST 3        // (fastcgi->web) end of a request
#define FCGI_PARAMS 4             // (web->fastcgi) parameters
#define FCGI_STDIN 5              // (web->fastcgi) stream data
#define FCGI_STDOUT 6             // (fastcgi->web) stream data
#define FCGI_STDERR 7             // (fastcgi->web) stream data
#define FCGI_DATA 8               // (web->fastcgi) stream data
#define FCGI_GET_VALUES 9         // (web->fastcgi) query the capability parameters of the FastCGI server
#define FCGI_GET_VALUES_RESULT 10 // (fastcgi->web) reply to the capability query
#define FCGI_UNKNOWN_TYPE 11
#define FCGI_MAXTYPE (FCGI_UNKNOWN_TYPE)

When the Web server sends a FastCGI request: it sends three types of records in order: BEGIN_REQUEST, PARAMS, and STDIN.

When the FastCGI process returns a FastCGI response: it returns three types of records in order: STDOUT, STDERR, and END_REQUEST.

FastCGI request delivery process:

The web server sends the following message types to the FastCGI program:
FCGI_BEGIN_REQUEST indicates the start of a request.
FCGI_ABORT_REQUEST indicates that the server wants to terminate a request.
FCGI_PARAMS corresponds to the environment variables of the CGI program; most of the data in PHP's $_SERVER array comes from this message.
FCGI_STDIN corresponds to the standard input of the CGI program; the FastCGI program reads the POST data of the HTTP request from this message.
FCGI_DATA and FCGI_GET_VALUES are not covered here.
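
To make the FCGI_PARAMS message a little more concrete: the text above does not describe its payload, but in the FastCGI specification each environment variable travels as a name-value pair made of the name length, the value length, the name, and the value. A length below 128 takes one byte; a longer one takes four bytes with the high bit of the first byte set. The following is a rough sketch of that encoding, with the function names fcgi_put_len and fcgi_encode_pair invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write a FastCGI name-value length: 1 byte if len < 128, otherwise
 * 4 bytes big-endian with the top bit set. Returns bytes written. */
static size_t fcgi_put_len(unsigned char *dst, size_t len)
{
    if (len < 128) {
        dst[0] = (unsigned char)len;
        return 1;
    }
    dst[0] = (unsigned char)(((len >> 24) & 0x7F) | 0x80);
    dst[1] = (unsigned char)((len >> 16) & 0xFF);
    dst[2] = (unsigned char)((len >> 8) & 0xFF);
    dst[3] = (unsigned char)(len & 0xFF);
    return 4;
}

/* Hypothetical helper: encode one name-value pair into dst.
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t fcgi_encode_pair(unsigned char *dst, size_t cap,
                        const char *name, const char *value)
{
    assert(dst != NULL && name != NULL && value != NULL);
    size_t nlen = strlen(name);
    size_t vlen = strlen(value);
    assert(nlen <= 0x7FFFFFFF && vlen <= 0x7FFFFFFF); /* 31-bit length limit */

    size_t need = (nlen < 128 ? 1 : 4) + (vlen < 128 ? 1 : 4) + nlen + vlen;
    if (need > cap)
        return 0;

    size_t off = fcgi_put_len(dst, nlen);
    off += fcgi_put_len(dst + off, vlen);
    memcpy(dst + off, name, nlen);      /* names and values are not NUL-terminated */
    off += nlen;
    memcpy(dst + off, value, vlen);
    off += vlen;
    return off;
}
```

A web server would concatenate pairs like these (QUERY_STRING, REQUEST_METHOD, and so on) into the body of one or more FCGI_PARAMS records.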

The FastCGI program returns the following message types to the web server:
FCGI_STDOUT corresponds to the standard output of the CGI program; the web server returns this message to the browser as HTML.
FCGI_STDERR corresponds to the standard error output of the CGI program; the web server records this message in the error log.
FCGI_END_REQUEST indicates that the request has been processed.
FCGI_UNKNOWN_TYPE indicates that the FastCGI program could not parse the message type.
There is also FCGI_GET_VALUES_RESULT, which is not covered here.

FastCGI data packet format
A FastCGI data packet has two parts, the header and the body. Every packet must contain a header; the body may be absent. The header is 8 bytes long. The body is normally padded to a multiple of 8 bytes (the specification recommends this alignment), and paddingLength records how many filler bytes were added.

typedef struct {
    unsigned char version;          // protocol version
    unsigned char type;             // packet (record) type
    unsigned char requestIdB1;      // request id, high 8 bits
    unsigned char requestIdB0;      // request id, low 8 bits
    unsigned char contentLengthB1;  // content (body) length, high 8 bits
    unsigned char contentLengthB0;  // content (body) length, low 8 bits
    unsigned char paddingLength;    // padding length (bytes added after the body)
    unsigned char reserved;         // reserved byte
} FCGI_Header;
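
As an illustration of how the two-byte fields and the padding fit together, here is a small sketch that fills in a header. The helper names fcgi_padding, fcgi_make_header, and fcgi_content_length are made up for this example; FCGI_VERSION_1 is the version constant from the FastCGI specification, and the struct is repeated so the snippet stands alone.

```c
#include <assert.h>

#define FCGI_VERSION_1 1   /* protocol version defined by the FastCGI spec */
#define FCGI_STDOUT    6

typedef struct {
    unsigned char version;
    unsigned char type;
    unsigned char requestIdB1;
    unsigned char requestIdB0;
    unsigned char contentLengthB1;
    unsigned char contentLengthB0;
    unsigned char paddingLength;
    unsigned char reserved;
} FCGI_Header;

/* Filler bytes needed to bring content_length up to a multiple of 8. */
unsigned char fcgi_padding(unsigned content_length)
{
    return (unsigned char)((8 - (content_length % 8)) % 8);
}

/* Build a header; the 16-bit fields are stored high byte first. */
FCGI_Header fcgi_make_header(unsigned char type, unsigned request_id,
                             unsigned content_length)
{
    assert(request_id <= 0xFFFF && content_length <= 0xFFFF);
    FCGI_Header h;
    h.version         = FCGI_VERSION_1;
    h.type            = type;
    h.requestIdB1     = (unsigned char)(request_id >> 8);
    h.requestIdB0     = (unsigned char)(request_id & 0xFF);
    h.contentLengthB1 = (unsigned char)(content_length >> 8);
    h.contentLengthB0 = (unsigned char)(content_length & 0xFF);
    h.paddingLength   = fcgi_padding(content_length);
    h.reserved        = 0;
    return h;
}

/* Reassemble the body length from its two bytes. */
unsigned fcgi_content_length(const FCGI_Header *h)
{
    return ((unsigned)h->contentLengthB1 << 8) | h->contentLengthB0;
}
```

For example, a 300-byte STDOUT body for request 1 gets contentLengthB1 = 1, contentLengthB0 = 44, and 4 bytes of padding, so the body plus padding occupies 304 bytes.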


The structure of the contentData part (the packet body) of an FCGI_BEGIN_REQUEST record:

typedef struct {
    unsigned char roleB1;
    unsigned char roleB0;
    unsigned char flags;
    unsigned char reserved[5];
} FCGI_BeginRequestBody;

typedef struct {
    FCGI_Header header;
    FCGI_BeginRequestBody body;
} FCGI_BeginRequestRecord;
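
The role and flags fields are not explained above. In the FastCGI specification the usual role is FCGI_RESPONDER (value 1), and the FCGI_KEEP_CONN flag (value 1) asks the application to keep the connection open after the request. A minimal sketch of filling in the body, with the helper name fcgi_begin_body invented for illustration:

```c
#include <assert.h>
#include <string.h>

#define FCGI_RESPONDER 1   /* role value from the FastCGI spec */
#define FCGI_KEEP_CONN 1   /* flag bit from the FastCGI spec */

typedef struct {
    unsigned char roleB1;
    unsigned char roleB0;
    unsigned char flags;
    unsigned char reserved[5];
} FCGI_BeginRequestBody;

/* Hypothetical helper: build the body of an FCGI_BEGIN_REQUEST record. */
FCGI_BeginRequestBody fcgi_begin_body(unsigned role, int keep_conn)
{
    assert(role <= 0xFFFF);
    FCGI_BeginRequestBody b;
    memset(&b, 0, sizeof b);                 /* reserved bytes stay zero */
    b.roleB1 = (unsigned char)(role >> 8);   /* high byte first */
    b.roleB0 = (unsigned char)(role & 0xFF);
    b.flags  = keep_conn ? FCGI_KEEP_CONN : 0;
    return b;
}
```

A web server that wants to reuse the connection would send fcgi_begin_body(FCGI_RESPONDER, 1) after an 8-byte header whose contentLength is 8.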

The structure of the contentData part of an FCGI_END_REQUEST record:

typedef struct {
    unsigned char appStatusB3;
    unsigned char appStatusB2;
    unsigned char appStatusB1;
    unsigned char appStatusB0;
    unsigned char protocolStatus;
    unsigned char reserved[3];
} FCGI_EndRequestBody;

typedef struct {
    FCGI_Header header;
    FCGI_EndRequestBody body;
} FCGI_EndRequestRecord;

The structure of the contentData part of an FCGI_UNKNOWN_TYPE record:

typedef struct {
    unsigned char type;
    unsigned char reserved[7];
} FCGI_UnknownTypeBody;

typedef struct {
    FCGI_Header header;
    FCGI_UnknownTypeBody body;
} FCGI_UnknownTypeRecord;

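FastCGI configuration parameters and environment variables
FastCGI configuration parameters

Nginx uses the ngx_http_fastcgi_module module to forward selected client requests over the FastCGI protocol to the processes started by spawn-fcgi.

# Forward the request to the back-end server; address is the address of the back-end FastCGI server. Contexts: location, if in location.
fastcgi_pass address;
# Default index resource for FastCGI, for example: fastcgi_index index.html; it plays the same role as index index.html.
fastcgi_index name;
# Set a parameter passed to the FastCGI server; the value can be text, variables, or a combination, so Nginx built-in variables can be assigned to custom keys.
fastcgi_param parameter value [if_not_empty];
# Use the named cache zone to cache data. Contexts: http, server, location. zone is a cache name defined with keys_zone.
fastcgi_cache zone|off;
# String used as the key of a cache entry, for example: fastcgi_cache_key $request_uri; caches by the URI the user requested.
fastcgi_cache_key string;
# Request methods for which the cache is used.
fastcgi_cache_methods GET|HEAD|POST...;
# Number of requests for the same key after which the response is cached.
fastcgi_cache_min_uses number;
# Whether the connection to the FastCGI server is kept open after the response arrives; enabling it is recommended. Without it, once a request has been handled and the connection is closed, nginx must complete another three-way handshake with the back end before the next request; with it on, the connection stays open and the setup is skipped.
fastcgi_keep_conn on|off;
# Cache time per response code; cache 200 responses for longer and 404 responses for a shorter time.
fastcgi_cache_valid 200 10m;
# Hide the named response header field sent by the back-end server.
fastcgi_hide_header field;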

FastCGI environment variables

SCRIPT_FILENAME $document_root$fastcgi_script_name; # path of the requested script file
QUERY_STRING $query_string; # request arguments, e.g. ?app=123
REQUEST_METHOD $request_method; # request method (GET, POST)
CONTENT_TYPE $content_type; # the Content-Type field of the request header
CONTENT_LENGTH $content_length; # the Content-Length field of the request header
SCRIPT_NAME $fastcgi_script_name; # script name
REQUEST_URI $request_uri; # the original request URI, including arguments
DOCUMENT_URI $document_uri; # same as $uri
DOCUMENT_ROOT $document_root; # root directory of the site, the value of the root directive in the server block
SERVER_PROTOCOL $server_protocol; # protocol of the request, usually HTTP/1.0 or HTTP/1.1
GATEWAY_INTERFACE CGI/1.1; # CGI version
SERVER_SOFTWARE nginx/$nginx_version; # nginx version; can be changed or hidden
REMOTE_ADDR $remote_addr; # client IP
REMOTE_PORT $remote_port; # client port
SERVER_ADDR $server_addr; # server IP address
SERVER_PORT $server_port; # server port
SERVER_NAME $server_name; # server name, the domain given by server_name in the server block
PATH_INFO $path_info; # a variable that can be user-defined

5. FastCGI process manager
5.1 FastCGI process manager
FastCGI processes can be managed by spawn-fcgi or php-fpm (there are many kinds of FastCGI process manager). Under nginx, the FastCGI processes and the server are separate.
5.2 spawn-fcgi
5.2.1 What does spawn-fcgi do?

• It acts much like a proxy tool
• Its role is to set up the inter-process communication between nginx and the FastCGI program
5.2.2 Environment configuration
• Requests that nginx cannot handle itself are handed to FastCGI
• The data has to be forwarded
• The data has to be sent to a designated port
• Test with one request
• url: http://192.168.52.139/test

location /test {
    # Configure the fastcgi module
    fastcgi_pass 127.0.0.1:9001;
    # IP:
    #   127.0.0.1 / localhost / 192.168.52.139
    # Port:
    #   the data to be processed is sent to port 9001
    # A process listens on port 9001 and receives the data that nginx sends
    include fastcgi.conf;
}

5.2.3 Use of spawn-fcgi
○ Write a FastCGI program and compile it into an executable named test
○ spawn-fcgi -a IP -p port -f fastcgi-program
Parameter description:
-a IP: server IP address
-p port: port that the server sends data to
-f fastcgi-program: the executable FastCGI program started by spawn-fcgi

5.2.4 The relationship between the FastCGI protocol, spawn-fcgi, and Nginx
Nginx is a web server and only speaks the HTTP protocol on its input and output side.
A program started by spawn-fcgi only speaks the FastCGI protocol.
Between the two, Nginx converts the HTTP request into the FastCGI protocol and passes it to the FastCGI process for handling.

[Figure: relationship between FastCGI, spawn-fcgi, and Nginx]
[Figure: user request process]
6. http_fastcgi_module
nginx supports FastCGI mode, which handles dynamic requests quickly and efficiently. The corresponding nginx module is ngx_http_fastcgi_module.

The ngx_http_fastcgi_module module passes requests to a FastCGI server, converting the HTTP protocol to the FastCGI protocol.

6.1 ngx_http_fastcgi_module module configuration

1. fastcgi_pass address;
   Names the server that requests are proxied to. address is the IP address and port that the FastCGI server listens on.
   Example: fastcgi_pass 127.0.0.1:9000;
2. fastcgi_index name;
   Defines the default index page of the FastCGI application.
   Example: fastcgi_index index.php;
3. fastcgi_param parameter value [if_not_empty];
   Sets a parameter and its value to pass to the back-end FastCGI server.
   Example: fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
   With this setting /index.php maps to /scripts/index.php.
4. fastcgi_cache_path path [levels=levels] [use_temp_path=on|off] keys_zone=name:size [inactive=time] [max_size=size];
   Defines a cache (its storage space and so on); used in the http block.
   path                  location on disk where the data is cached
   levels=#[:#[:#]]      directory levels; levels=2:1 means directories named with two hexadecimal characters, each containing a further level of directories
   keys_zone=name:size   metadata is cached in memory; name identifies the cache and size is the size of the metadata cache
   inactive=time         inactivity time of a cache entry
   max_size              upper limit of the cache space
5. fastcgi_cache zone | off;
   Uses a cache that has been defined; zone is the name given in the keys_zone parameter of fastcgi_cache_path.
6. fastcgi_cache_key string;
   Defines the cache key.
   Example: fastcgi_cache_key $request_uri;
7. fastcgi_cache_methods GET | HEAD | POST ...;
   Request methods whose requests are cached; the default is GET and HEAD.
8. fastcgi_cache_min_uses number;
   Minimum number of uses before an entry is cached.
9. fastcgi_cache_use_stale error | timeout | invalid_header | updating | http_500 | http_503 | http_403 | http_404 | off ...;
   Whether a stale cache entry may be used to answer a request.
10. fastcgi_cache_valid [code ...] time;
   Sets the cache time for responses with the given response codes.

6.2 ngx_http_fastcgi_module: more information
More information: http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html
6.3 nginx configuration
○ Add the location /group1/M00

location /group1/M00 {
	# /home/milo/fastdfs/storage/fastdfs0/data is the directory where the FastDFS storage node actually keeps its data
	root /home/milo/fastdfs/storage/fastdfs0/data;
	# Name of the fastdfs module
	ngx_fastdfs_module;
}

○ Reload the configuration file of nginx

6.4 mod_fastdfs.conf configuration file
Edit this file with reference to the storage configuration file of the current storage node.

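○ Log directory § base_path=/home/milo/fastDFS/storage
○ Tracker address § tracker_server=192.168.52.139:22122
○ Port of the current storage node § storage_server_port=23000
○ Group the current storage node belongs to § group_name=group1
○ Whether the URL used by the browser contains the group name § url_have_group_name = true
○ Number of storage paths on the current storage node § store_path_count=1
○ Storage path of the current storage node
	§ store_path0=/home/milo/fastDFS/storage/fastdfs0
		□ If there are several, all of them must be written in the configuration file
		® store_path1
		® store_path2
○ Total number of groups in the whole FastDFS file system § group_count = 1
○ Settings for each group
[group1]
group_name=group1
storage_server_port=23000
store_path_count=1
store_path0=/home/milo/fastDFS/storage/fastdfs0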


Origin blog.csdn.net/wangrenhaioylj/article/details/108973076