A Guide to WSGI and gunicorn (1)

As a Python web development engineer, pyg0 happily writes business code on top of various web frameworks every day.

Then one day the tech lead came over and said: "Hey, we're launching a new service and I need you to deploy it. Don't overcomplicate it: run Flask under gunicorn with 8 processes in gevent mode. That's a good enough setup. You have an hour."

At that moment pyg0's head filled with question marks. gunicorn? gevent? What on earth are those?

He hurriedly consulted almighty Google and found configuration tutorials everywhere. pyg0 picked one that looked trustworthy, copied it step by step, and finally got everything working just before the end of the workday. (So much for one hour.)

But was it really done? pyg0 actually found himself even more confused. While working through the configuration, more terms had come into view: WSGI? uWSGI? master? worker?

To stop being confused, pyg0 decided to set out on a journey to learn about Python web servers.

 

1. WSGI

WSGI stands for Web Server Gateway Interface. It is neither a web server nor a web application, but a protocol: a specification for how a web server and a web application talk to each other.

The purpose of the WSGI specification is to decouple the web server from the web application. The protocol has two sides, server and application. The server accepts a request from the client, forwards it to the application, and sends the response returned by the application back to the client. The application receives the request from the server, processes it, and returns the response to the server. As a result, many web servers can implement the server side and many web frameworks can implement the application side.
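To make the two sides concrete, here is a minimal sketch in plain Python. The names hello_app and run_once are made up for illustration, and run_once is only a toy stand-in for the server side: a real server also parses HTTP, builds a complete environ, and writes the response to a socket.

```python
def hello_app(environ, start_response):
    """Application side: receives the request data and a callback."""
    body = b"Hello from the application side\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]  # the response body is an iterable of bytes


def run_once(app, environ):
    """Server side (hypothetical helper): call the app once, collect the response."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # The application reports status and headers through this callback.
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        # The protocol asks the server to call close() if the iterable has one.
        if hasattr(chunks, "close"):
            chunks.close()
    return captured["status"], captured["headers"], body


# A deliberately tiny environ; a real server provides many more keys.
status, headers, body = run_once(hello_app, {"REQUEST_METHOD": "GET", "PATH_INFO": "/"})
print(status)
print(body.decode())
```

Because run_once only relies on the calling convention, any other callable with the same signature could be passed in without changing the server side.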

The point is that gunicorn, mentioned above, and uWSGI are web servers that implement the server side of WSGI, while commonly used frameworks such as Django and Flask implement the application side. This lets us freely combine web servers and web frameworks like building blocks.
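As an illustration of that combination, the boss's request from the opening could be written down as a gunicorn configuration file, which is itself a Python module. This is only a sketch under assumptions: the file name, the bind address, and the module name myservice (holding a Flask object called app) are all hypothetical, and the gevent worker needs the gevent package installed.

```python
# gunicorn.conf.py (hypothetical file; gunicorn reads settings from module-level names)

bind = "127.0.0.1:8000"   # assumed address and port, adjust for your deployment
workers = 8               # "start 8 processes"
worker_class = "gevent"   # "run it in gevent mode"; requires gevent to be installed

# Assumed launch command, where myservice.py defines the Flask object `app`:
#   gunicorn -c gunicorn.conf.py myservice:app
```

gunicorn never needs to know that the application is a Flask one; it only needs a WSGI callable to hand requests to.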

 

In fact, Python itself ships with a server and an application that implement the WSGI protocol, and almost every web framework bundles its own server as well, but these servers are meant for debugging, not for production. Let's first look at Python's own WSGI implementation: wsgiref. The name says it plainly: I am only a reference, friend. Study me, but don't rely on me in production; performance is not guaranteed.

#coding:utf-8
from wsgiref.simple_server import make_server, demo_app

app = demo_app
server = make_server("127.0.0.1", 9000, app)
server.serve_forever()

demo_app is a simple web application. Let's see what it does:

def demo_app(environ,start_response):
    from io import StringIO
    stdout = StringIO()
    print("Hello world!", file=stdout)
    print(file=stdout)
    h = sorted(environ.items())
    for k,v in h:
        print(k,'=',repr(v), file=stdout)
    start_response("200 OK", [('Content-Type','text/plain; charset=utf-8')])
    return [stdout.getvalue().encode("utf-8")]

It conforms to the WSGI standard: it accepts two parameters, environ and start_response. environ is a dict containing the client's request information plus related server information, and start_response is a callback used to send the response status and response headers. The application then returns the response body as an iterable of bytes. In this case, the body lists every entry in environ.
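An application of our own can use environ in the same way. The sketch below is not part of wsgiref: greet_app and fake_start_response are illustrative names, and the app is called by hand rather than through a server. wsgiref.util.setup_testing_defaults fills in the environ keys we did not set ourselves.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def greet_app(environ, start_response):
    """Hypothetical app: greet the name given in the query string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    path = environ.get("PATH_INFO", "/")
    body = "Hello, {}! You asked for {}\n".format(name, path).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Build a fake request by hand; existing keys are kept, missing ones get defaults.
environ = {"PATH_INFO": "/greet", "QUERY_STRING": "name=pyg0"}
setup_testing_defaults(environ)

seen = []

def fake_start_response(status, headers, exc_info=None):
    # Record what the application reports instead of writing it to a socket.
    seen.append((status, headers))

result = greet_app(environ, fake_start_response)
print(seen[0][0])
print(b"".join(result).decode("utf-8"))
```

Calling the application directly like this is also a convenient way to unit-test it without starting any server.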

Now comes the question: how do we call this web application? The answer is that we don't call it ourselves; the web server calls it for us. Let's see what the web server in wsgiref looks like:

class WSGIServer(HTTPServer):

    """BaseHTTPServer that implements the Python WSGI protocol"""

    application = None

    def server_bind(self):
        """Override server_bind to store the server name."""
        HTTPServer.server_bind(self)
        self.setup_environ()

    def setup_environ(self):
        # Set up base environment
        env = self.base_environ = {}
        env['SERVER_NAME'] = self.server_name
        env['GATEWAY_INTERFACE'] = 'CGI/1.1'
        env['SERVER_PORT'] = str(self.server_port)
        env['REMOTE_HOST']=''
        env['CONTENT_LENGTH']=''
        env['SCRIPT_NAME'] = ''

    def get_app(self):
        return self.application

    def set_app(self,application):
        self.application = application

The web server is also very simple. It inherits from HTTPServer, overrides the server_bind method so that the base environment variables are set up when the socket is bound, and provides methods for setting and getting the web application.
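These pieces can be inspected without serving any request. The following sketch assumes that binding to port 0 (letting the operating system pick a free port) is acceptable for a quick experiment; the exact values printed depend on the machine.

```python
from wsgiref.simple_server import make_server, demo_app

# Port 0 asks the OS for any free port, so this does not clash with a running service.
server = make_server("127.0.0.1", 0, demo_app)
try:
    # make_server has already called set_app(), so get_app() returns our application.
    print(server.get_app() is demo_app)

    # base_environ was filled in by setup_environ() during server_bind().
    for key in sorted(server.base_environ):
        print(key, "=", repr(server.base_environ[key]))
finally:
    server.server_close()  # release the socket
```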

 

Next, let's look at WSGIRequestHandler:

class WSGIRequestHandler(BaseHTTPRequestHandler):

    server_version = "WSGIServer/" + __version__

    def get_environ(self):
        env = self.server.base_environ.copy()
        env['SERVER_PROTOCOL'] = self.request_version
        env['SERVER_SOFTWARE'] = self.server_version
        env['REQUEST_METHOD'] = self.command
        if '?' in self.path:
            path,query = self.path.split('?',1)
        else:
            path,query = self.path,''

        env['PATH_INFO'] = urllib.parse.unquote(path, 'iso-8859-1')
        env['QUERY_STRING'] = query

        host = self.address_string()
        if host != self.client_address[0]:
            env['REMOTE_HOST'] = host
        env['REMOTE_ADDR'] = self.client_address[0]

        if self.headers.get('content-type') is None:
            env['CONTENT_TYPE'] = self.headers.get_content_type()
        else:
            env['CONTENT_TYPE'] = self.headers['content-type']

        length = self.headers.get('content-length')
        if length:
            env['CONTENT_LENGTH'] = length

        for k, v in self.headers.items():
            k=k.replace('-','_').upper(); v=v.strip()
            if k in env:
                continue                    # skip content length, type,etc.
            if 'HTTP_'+k in env:
                env['HTTP_'+k] += ','+v     # comma-separate multiple headers
            else:
                env['HTTP_'+k] = v
        return env

    def get_stderr(self):
        return sys.stderr

    def handle(self):
        """Handle a single HTTP request"""

        self.raw_requestline = self.rfile.readline(65537)
        if len(self.raw_requestline) > 65536:
            self.requestline = ''
            self.request_version = ''
            self.command = ''
            self.send_error(414)
            return

        if not self.parse_request(): # An error code has been sent, just exit
            return

        handler = ServerHandler(
            self.rfile, self.wfile, self.get_stderr(), self.get_environ()
        )
        handler.request_handler = self      # backpointer for logging
        handler.run(self.server.get_app())

WSGIRequestHandler adds more request-related information to environ and overrides the handle method. Here we see the familiar get_app(): this is where our web application comes in. The handler's run method calls the application with environ and the start_response callback. The application passes the HTTP status and headers back to the handler through start_response, and hands over the response body as its return value.

With this simple example, pyg0 finally understands what WSGI is all about. But he also agrees that wsgiref is too basic for real use, so it is time to study gunicorn, as recommended by the boss.
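The header loop at the end of get_environ is worth trying in isolation. The sketch below reproduces the same logic as a standalone function; headers_to_environ is a made-up name, and the headers are passed as a plain list of pairs instead of the message object the real handler uses.

```python
def headers_to_environ(headers, env):
    """Mimic the header loop in get_environ (hypothetical standalone helper)."""
    for name, value in headers:
        key = name.replace("-", "_").upper()
        value = value.strip()
        if key in env:
            # CONTENT_TYPE and CONTENT_LENGTH are already set without a prefix.
            continue
        http_key = "HTTP_" + key
        if http_key in env:
            env[http_key] += "," + value   # repeated headers are comma-joined
        else:
            env[http_key] = value
    return env


env = {"CONTENT_TYPE": "text/plain", "CONTENT_LENGTH": "5"}
headers = [
    ("Content-Type", "text/plain"),
    ("Content-Length", "5"),
    ("Accept", "text/html"),
    ("Accept", "application/json"),
    ("User-Agent", " curl/8.0 "),
]
result = headers_to_environ(headers, env)
for key in sorted(result):
    print(key, "=", repr(result[key]))
```

This is why an application reads a request header such as User-Agent from environ["HTTP_USER_AGENT"], while the content type and length live under their own unprefixed keys.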

See you next time!

 
