How to understand Web Server Gateway Interface

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains in detail how to understand the Web Server Gateway Interface (WSGI). We hope it is a useful reference.

The Python community needs a common API that lets Web servers and Web applications work together. That API is WSGI (Python Web Server Gateway Interface), which is described in detail in PEP 3333. Put simply, WSGI is the bridge between the Web server and the Web application. In one direction, it takes the raw HTTP data from the Web server, converts it into a unified format, and hands it to the Web application. In the other direction, it takes the response that the application / framework produces from its business logic and hands it back to the server.

The detailed process of coupling the Web server and framework through WSGI is shown in the following figure:

WSGI Server adaptation

The specific explanation is as follows:

The application (Web framework) provides a callable object named application (the WSGI protocol does not specify how this object is implemented).

Each time the server receives a request from an HTTP client, it invokes the callable object application, passing it two arguments: a dictionary called environ and a callable object named start_response.

The framework / application generates the HTTP status code and the HTTP response headers and passes them to start_response, so that the server can store them. In addition, the framework / application returns the body of the response.

The server combines the status code, response headers, and response body into an HTTP response and returns it to the client (this step is not part of the WSGI protocol).
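The four steps above can be sketched without any real networking. This is only an illustration under stated assumptions: `fake_server_call` is a hypothetical stand-in for a server, and the environ it builds is heavily trimmed compared with what PEP 3333 requires.

```python
def application(environ, start_response):
    # Step 3: the application produces the status and headers, hands them
    # to the server's callback, and returns the body as an iterable of bytes.
    body = ("You requested " + environ["PATH_INFO"] + "\n").encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def fake_server_call(app, path):
    """Hypothetical stand-in for a server handling one request."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        # The server stores what the application passes in.
        captured["status"] = status
        captured["headers"] = response_headers

    # Step 2: call the application with environ and start_response.
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}  # trimmed for brevity
    body = b"".join(app(environ, start_response))

    # Step 4: combine status line, headers, and body into one HTTP response.
    head = "HTTP/1.1 " + captured["status"] + "\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in captured["headers"])
    return head.encode("latin-1") + b"\r\n" + body


raw = fake_server_call(application, "/hello")
print(raw.decode("latin-1"))
```

The application never touches a socket; everything it knows about the request comes from environ, and everything it says about the response goes through start_response and its return value.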

Let's take a look at how WSGI adapts from the server side and the application side, respectively.

Server side

Each HTTP request issued by the client (usually the browser) consists of three parts: the request line, the message headers, and the request body. Together they carry the details of the request. For example:

Method: indicates the method to be performed on the resource identified by the Request-URI, such as GET or POST.

User-Agent: allows the client to tell the server its operating system, browser, and other properties

After the server receives the HTTP request from the client, the WSGI interface must put these request fields into a unified form so that they can be passed easily to the application-side interface (in practice, to the framework). The specific data that the Web server passes to the application was already specified by CGI (Common Gateway Interface), where it is called the CGI environment variables. WSGI adopts the CGI environment variables and requires the Web server to create a dictionary to hold them (commonly named environ). In addition to the variables defined by CGI, environ must also contain some variables defined by WSGI, and it may also contain operating-system environment variables. See the "environ Variables" section of PEP 3333 for the full list.
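To get a feel for what environ contains, the standard library helper `wsgiref.util.setup_testing_defaults` can fill an empty dictionary with plausible values. This is a small sketch for inspection only; a real server derives these values from the actual request, and the split into two groups by the `wsgi.` prefix is just a convenient way to display them.

```python
from wsgiref.util import setup_testing_defaults

environ = {}
setup_testing_defaults(environ)  # fills in typical CGI and WSGI keys

# CGI-style variables (REQUEST_METHOD, PATH_INFO, SERVER_NAME, HTTP_* ...)
cgi_keys = sorted(key for key in environ if not key.startswith("wsgi."))
# Variables defined by WSGI itself (wsgi.input, wsgi.url_scheme ...)
wsgi_keys = sorted(key for key in environ if key.startswith("wsgi."))

print("CGI-style:", cgi_keys)
print("WSGI-defined:", wsgi_keys)
```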

Then the WSGI interface must hand environ over to the application. WSGI specifies that the application provides a callable object, application; the server calls it and uses its return value as the HTTP response body. When the server calls application, it supplies two arguments: the environ dictionary described above and the callable object start_response, through which the application passes the status code and response headers to the server. Together these make up a complete HTTP response. The Web server returns the response to the client, and one full HTTP request-response cycle is complete.

Wsgiref analysis

Python ships with a Web server that implements the WSGI interface. It lives in the wsgiref module and is a reference implementation of a WSGI server written in pure Python. Let's briefly analyze its implementation. First, suppose we start a Web server with the following code (application is assumed to be defined elsewhere):

    from wsgiref.simple_server import make_server

    # Instantiate the server
    httpd = make_server(
        'localhost',  # The host name
        8051,         # A port number where to wait for the request
        application   # The application object name, in this case a function
    )

    # Wait for a single request, serve it and quit
    httpd.handle_request()

We then follow the source code along the main path: the Web server receives a request, generates environ, and calls application to process the request. The simplified call process is shown in the following figure:

WSGI Server call process

There are three main classes here: WSGIServer, WSGIRequestHandler, and ServerHandler. WSGIServer is the Web server class; it is initialized with a server_address (IP:Port) and the WSGIRequestHandler class to produce a server object. The object listens on the given port, and after receiving an HTTP request it creates an instance of the request handler class through finish_request. While that instance is being initialized, a ServerHandler instance is created and its run(application) method is called. This method invokes the application object supplied by the application side to generate the response.
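The call path described above can be exercised end to end on the local machine. The following is a minimal sketch, assuming the loopback interface is available: `application` and `QuietHandler` are illustrative names, port 0 asks the operating system for any free port, and the client side uses `http.client` from the standard library.

```python
import http.client
import threading
from wsgiref.simple_server import make_server, WSGIRequestHandler


def application(environ, start_response):
    body = b"served by wsgiref\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class QuietHandler(WSGIRequestHandler):
    """Illustrative subclass that silences the per-request log line."""

    def log_message(self, format, *args):
        pass


# WSGIServer is created, bound, and listening once make_server returns.
httpd = make_server("127.0.0.1", 0, application, handler_class=QuietHandler)
port = httpd.server_address[1]

# Serve exactly one request in a background thread.
worker = threading.Thread(target=httpd.handle_request)
worker.start()

conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
response = conn.getresponse()
status = response.status
text = response.read().decode("utf-8")
conn.close()

worker.join()
httpd.server_close()
print(status, text)
```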

The inheritance relationships of these three classes are shown in the following figure:

WSGI class inheritance diagram

Among them, TCPServer uses sockets to handle TCP communication, and HTTPServer handles the HTTP level. Likewise, StreamRequestHandler deals with the stream socket, and BaseHTTPRequestHandler deals with HTTP-level content. This part has little to do with the WSGI interface and is mostly about the concrete implementation of the Web server, so it can be ignored.
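The inheritance chains can also be read directly from the classes themselves. This short sketch prints each class's method resolution order; the exact list of base classes may differ slightly between Python versions, so no output is shown.

```python
import socketserver
from http.server import BaseHTTPRequestHandler, HTTPServer
from wsgiref.simple_server import ServerHandler, WSGIRequestHandler, WSGIServer

for cls in (WSGIServer, WSGIRequestHandler, ServerHandler):
    # __mro__ lists the class followed by all of its ancestors in lookup order.
    print(" -> ".join(base.__name__ for base in cls.__mro__))

# The relationships named in the text can be checked explicitly.
server_is_http = issubclass(WSGIServer, HTTPServer)
server_is_tcp = issubclass(WSGIServer, socketserver.TCPServer)
handler_is_http = issubclass(WSGIRequestHandler, BaseHTTPRequestHandler)
handler_is_stream = issubclass(WSGIRequestHandler, socketserver.StreamRequestHandler)
```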

Micro server instance

If wsgiref is too complex, we can implement a tiny Web server instead to understand how the WSGI interface works on the Web server side. The code is extracted from "Do-it-yourself web server (2)" and placed on gist. The main structure is as follows (method bodies that are not shown are marked with ...):

    import socket


    class WSGIServer(object):
        # Socket parameters
        address_family, socket_type = socket.AF_INET, socket.SOCK_STREAM
        request_queue_size = 1

        def __init__(self, server_address):
            # TCP server initialization: create the socket, bind the address,
            # listen on the port; get the server address and port
            ...

        def set_app(self, application):
            # Get the application provided by the framework
            self.application = application

        def serve_forever(self):
            # Handle TCP connections: get the request content,
            # call the processing function
            ...

        def handle_request(self):
            # Parse the HTTP request, get environ, process the request
            # content, return the HTTP response
            env = self.get_environ()
            result = self.application(env, self.start_response)
            self.finish_response(result)

        def parse_request(self, text):
            # Parse the HTTP request
            ...

        def get_environ(self):
            # Build the environ parameters. This is just an example;
            # a real server sets many more.
            env = {}
            env['wsgi.url_scheme'] = 'http'
            ...
            env['REQUEST_METHOD'] = self.request_method  # GET
            ...
            return env

        def start_response(self, status, response_headers, exc_info=None):
            # Add the response headers and status code
            # (server_headers, the server's own headers, is not shown in this outline)
            self.headers_set = [status, response_headers + server_headers]

        def finish_response(self, result):
            # Return the HTTP response
            ...


    SERVER_ADDRESS = (HOST, PORT) = '', 8888


    # Create a server instance
    def make_server(server_address, application):
        server = WSGIServer(server_address)
        server.set_app(application)
        return server
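The parse_request and get_environ steps in the outline can be illustrated with a standalone function. This is a rough sketch, not the code from the gist: `build_environ` is a hypothetical helper, it assumes a well-formed request, and it skips edge cases such as repeated headers.

```python
import io
import sys


def build_environ(request_text, server_name="localhost", server_port=8888):
    """Hypothetical helper: turn raw HTTP request text into a WSGI environ."""
    head, _, body = request_text.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Request line, e.g. "GET /hello?name=wsgi HTTP/1.1"
    method, target, version = lines[0].split()
    path_info, _, query_string = target.partition("?")

    environ = {
        # Variables defined by WSGI
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(body.encode("latin-1")),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
        # Variables inherited from CGI
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": "",
        "PATH_INFO": path_info,
        "QUERY_STRING": query_string,
        "SERVER_NAME": server_name,
        "SERVER_PORT": str(server_port),
        "SERVER_PROTOCOL": version,
    }

    # Request headers become HTTP_* keys, except the two CGI special cases.
    for line in lines[1:]:
        name, _, value = line.partition(":")
        key = name.strip().upper().replace("-", "_")
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            environ[key] = value.strip()
        else:
            environ["HTTP_" + key] = value.strip()
    return environ


request = ("GET /hello?name=wsgi HTTP/1.1\r\n"
           "Host: localhost:8888\r\n"
           "User-Agent: demo\r\n"
           "\r\n")
env = build_environ(request)
print(env["REQUEST_METHOD"], env["PATH_INFO"], env["QUERY_STRING"])
```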

Today there are many mature Web servers that support WSGI, and Gunicorn is a good one. It originated as a port of the Ruby community's Unicorn server to Python and became a WSGI HTTP server. It has the following advantages:

Easy to configure

Multiple worker processes can be automatically managed

A choice of worker backends (sync, gevent, tornado, etc.)

Application side (framework)

Compared with the server side, the application side (or framework) has much less to do. It only needs to provide a callable object (commonly named application) that accepts the two arguments passed by the server, environ and start_response. The callable can be a function, a class (the second example below), or an instance of a class that defines the __call__ method, as long as it accepts those two arguments and its return value can be iterated over by the server.

The application does its business processing based on the HTTP request information in environ and returns an iterable object. The server iterates over this object to obtain the body of the HTTP response. If there is no response body, the application can return an empty iterable such as [].

The application also calls the start_response callable provided by the server to set the status code and response headers of the HTTP response. Its prototype is as follows:

    def start_response(self, status, response_headers, exc_info=None):

The application must provide status, a string such as '200 OK' that gives the HTTP response status, and response_headers, a list of tuples of the form (header_name, header_value) that gives the headers of the HTTP response. exc_info is optional. It carries exception information and is passed only when the application calls start_response from an error handler, so that the server can report the error to the browser.
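Seen from the server side, start_response mostly records what it is given. The sketch below is modeled on the example server in PEP 3333 but is simplified: `ResponseStarter` is a hypothetical class name, and the `write` callable that PEP 3333 says start_response should return is left out.

```python
import sys


class ResponseStarter:
    """Hypothetical server-side holder for the start_response callable."""

    def __init__(self):
        self.headers_set = []      # [status, response_headers] once set
        self.headers_sent = False  # a real server sets this when it starts writing

    def start_response(self, status, response_headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the original error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a reference cycle through the traceback
        elif self.headers_set:
            raise AssertionError("Headers already set!")
        self.headers_set[:] = [status, list(response_headers)]


starter = ResponseStarter()
starter.start_response("200 OK", [("Content-Type", "text/plain")])

# An error handler may replace the headers as long as nothing was sent yet.
try:
    raise ValueError("boom")
except ValueError:
    starter.start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")],
                           sys.exc_info())

print(starter.headers_set[0])
```

Calling start_response a second time without exc_info is treated as an application bug, which is why that branch raises an AssertionError.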

At this point, we can implement a simple application, as follows:

    def simple_app(environ, start_response):
        """Simplest possible application function"""
        HELLO_WORLD = b"Hello world!\n"  # PEP 3333 requires the body to be bytes
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        start_response(status, response_headers)
        return [HELLO_WORLD]

Or implement it with a class, as follows.

    class AppClass:
        """Produce the same output, but using a class"""

        def __init__(self, environ, start_response):
            self.environ = environ
            self.start = start_response

        def __iter__(self):
            ...
            HELLO_WORLD = b"Hello world!\n"
            yield HELLO_WORLD

Note that here the AppClass class itself is the application. Calling it (that is, instantiating it) with environ and start_response returns an instance that is itself iterable, which satisfies the WSGI requirements for an application.

If you want to use an instance of AppClass as the application instead, you must add a __call__ method to the class that accepts environ and start_response as parameters and returns an iterable object, as shown below:

    class AppClass:
        """Produce the same output, but using an object"""

        def __call__(self, environ, start_response):
            ...
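A complete callable-instance application might look like the following. This is a sketch under assumptions rather than the original author's code: `GreetingApp` and its `greeting` parameter are invented for illustration, and the application is driven by a stub instead of a real server.

```python
from wsgiref.util import setup_testing_defaults


class GreetingApp:
    """Hypothetical application whose instances are WSGI callables."""

    def __init__(self, greeting):
        # Configuration lives on the instance, not in environ.
        self.greeting = greeting

    def __call__(self, environ, start_response):
        body = (self.greeting + "\n").encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                                  ("Content-Length", str(len(body)))])
        return [body]


application = GreetingApp("Hello world!")  # the instance is the application

# Drive it by hand with a stub start_response.
environ = {}
setup_testing_defaults(environ)
seen = []
chunks = application(
    environ,
    lambda status, headers, exc_info=None: seen.append((status, headers)),
)
print(seen[0][0], b"".join(chunks))
```

Compared with the class-as-application form, this variant lets one object be configured once and reused for every request.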

This section uses some more advanced Python features, such as yield and magic methods. They are covered in my summary of Python language points.

WSGI in Flask

Flask is a lightweight Python Web framework that conforms to the WSGI specification. Its initial version is only a little over 600 lines long, which makes it relatively easy to understand. Let's look at the part of that version that deals with the WSGI interface.

    def wsgi_app(self, environ, start_response):
        """The actual WSGI application.  This is not implemented in
        `__call__` so that middlewares can be applied:

            app.wsgi_app = MyMiddleware(app.wsgi_app)
        """
        with self.request_context(environ):
            rv = self.preprocess_request()
            if rv is None:
                rv = self.dispatch_request()
            response = self.make_response(rv)
            response = self.process_response(response)
            return response(environ, start_response)

    def __call__(self, environ, start_response):
        """Shortcut for :attr:`wsgi_app`"""
        return self.wsgi_app(environ, start_response)

The wsgi_app method here implements what we have been calling the application callable. rv is the return value of request processing (from preprocess_request or, failing that, from dispatch_request), and response is the response object built from it. That object is itself called with environ and start_response to produce the WSGI return value. The Flask source code is not explained further here; interested readers can download it from GitHub and check out the initial version.
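The reason for splitting wsgi_app from __call__ can be shown with a toy class. This is a sketch, not Flask code: `TinyApp`, its `routes` dictionary, and `CountingMiddleware` are invented for illustration. Because __call__ looks up self.wsgi_app on every call, replacing that attribute wraps the application while `app` stays the same object.

```python
class TinyApp:
    """Hypothetical framework object mirroring the wsgi_app / __call__ split."""

    def __init__(self):
        self.routes = {}  # maps a path to a view function returning str

    def wsgi_app(self, environ, start_response):
        view = self.routes.get(environ.get("PATH_INFO", "/"))
        if view is None:
            status, body = "404 Not Found", b"not found\n"
        else:
            status, body = "200 OK", view().encode("utf-8")
        start_response(status, [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(body)))])
        return [body]

    def __call__(self, environ, start_response):
        # Looked up at call time, so a replaced wsgi_app takes effect.
        return self.wsgi_app(environ, start_response)


class CountingMiddleware:
    """Hypothetical middleware that counts requests before delegating."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.calls = 0

    def __call__(self, environ, start_response):
        self.calls += 1
        return self.wrapped(environ, start_response)


app = TinyApp()
app.routes["/"] = lambda: "index\n"
app.wsgi_app = CountingMiddleware(app.wsgi_app)  # app itself is unchanged

statuses = []
body = b"".join(app({"PATH_INFO": "/"},
                    lambda status, headers, exc_info=None: statuses.append(status)))
print(statuses, body, app.wsgi_app.calls)
```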

Middleware

The comment on the wsgi_app function in the Flask code above says that the application logic is not implemented directly in __call__ so that middleware can be applied. So what is middleware, and why use it?

Looking back at the application/server interface described so far, for each HTTP request the server always calls a single application and returns the result of that application's processing. This is enough for ordinary scenarios, but it does not cover everything. Consider the following scenarios:

For different requests (for example, different URLs), the server needs to call different applications. How does it choose which one to call?

For load balancing or remote processing, the request needs to be handled by an application running on another host on the network.

The content returned by the application needs further processing before it can be used as the HTTP response.

What these scenarios have in common is that there are some necessary operations that are not appropriate either on the server side or on the application (framework) side. For the application side, these operations should be done by the server side, and for the server side, these operations should be done by the application side. In order to deal with this situation, middleware is introduced.

Middleware is like a bridge between the application side and the server side to communicate between the two sides. For the server side, the middleware behaves like the application side, and for the application side, it behaves like the server side. As shown in the following figure:

Middleware
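The two faces of middleware can be seen in a few lines. In this sketch, `HeaderAndUppercaseMiddleware`, `inner_app`, and the `X-Processed-By` header are all invented for illustration. Towards the server the middleware is a callable taking environ and start_response; towards the wrapped application it plays the server by supplying its own start_response.

```python
class HeaderAndUppercaseMiddleware:
    """Hypothetical middleware: adds a header and upper-cases the body."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Acting as the server: hand the inner app our own start_response.
        def tagged_start_response(status, headers, exc_info=None):
            headers = list(headers) + [("X-Processed-By", "demo-middleware")]
            return start_response(status, headers, exc_info)

        # Acting as the application: return an iterable of bytes to the server.
        # Upper-casing ASCII bytes keeps the length, so Content-Length stays valid.
        return [chunk.upper() for chunk in self.app(environ, tagged_start_response)]


def inner_app(environ, start_response):
    body = b"hello from the inner app\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


wrapped = HeaderAndUppercaseMiddleware(inner_app)

recorded = {}


def record(status, headers, exc_info=None):
    recorded["status"] = status
    recorded["headers"] = headers


body = b"".join(wrapped({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, record))
print(recorded["status"], recorded["headers"], body)
```

Neither `inner_app` nor the caller needs to know the middleware exists, which is what allows middleware layers to be stacked.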

Implementation of Middleware

The Flask framework uses middleware in the initialization code of the Flask class:

    self.wsgi_app = SharedDataMiddleware(self.wsgi_app, {self.static_path: target})

The effect is the same as a decorator in Python: the logic in SharedDataMiddleware runs before and after self.wsgi_app. SharedDataMiddleware is provided by the werkzeug library and serves a site's static content. werkzeug also provides DispatcherMiddleware, which calls different applications for different requests and so addresses scenarios 1 and 2 above.

Let's look at the implementation of DispatcherMiddleware:

    class DispatcherMiddleware(object):
        """Allows one to mount middlewares or applications in a WSGI application.
        This is useful if you want to combine multiple WSGI applications::

            app = DispatcherMiddleware(app, {
                '/app2': app2,
                '/app3': app3
            })
        """

        def __init__(self, app, mounts=None):
            self.app = app
            self.mounts = mounts or {}

        def __call__(self, environ, start_response):
            script = environ.get('PATH_INFO', '')
            path_info = ''
            while '/' in script:
                if script in self.mounts:
                    app = self.mounts[script]
                    break
                script, last_item = script.rsplit('/', 1)
                path_info = '/%s%s' % (last_item, path_info)
            else:
                app = self.mounts.get(script, self.app)
            original_script_name = environ.get('SCRIPT_NAME', '')
            environ['SCRIPT_NAME'] = original_script_name + script
            environ['PATH_INFO'] = path_info
            return app(environ, start_response)

When the middleware is initialized, it is given a mounts dictionary that maps URL path prefixes to applications. For each request, the middleware checks the request path and selects the matching application to handle it.
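To see how the path is split between SCRIPT_NAME and PATH_INFO, the dispatch logic can be run without installing werkzeug. The sketch below uses `PrefixDispatcher`, a trimmed copy of the listing above under a different name, and `make_echo_app` and `call`, which are hypothetical helpers for the demonstration.

```python
class PrefixDispatcher:
    """Trimmed copy of the dispatch logic shown above, for experimentation."""

    def __init__(self, app, mounts=None):
        self.app = app
        self.mounts = mounts or {}

    def __call__(self, environ, start_response):
        script = environ.get("PATH_INFO", "")
        path_info = ""
        while "/" in script:
            if script in self.mounts:
                app = self.mounts[script]
                break
            # Move the last path segment from script over to path_info.
            script, last_item = script.rsplit("/", 1)
            path_info = "/%s%s" % (last_item, path_info)
        else:
            app = self.mounts.get(script, self.app)
        environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + script
        environ["PATH_INFO"] = path_info
        return app(environ, start_response)


def make_echo_app(name):
    """Hypothetical app factory: reports which app ran and what paths it saw."""
    def app(environ, start_response):
        body = "{}: SCRIPT_NAME={!r} PATH_INFO={!r}\n".format(
            name, environ["SCRIPT_NAME"], environ["PATH_INFO"]).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    return app


def call(app, path):
    """Hypothetical driver: call a WSGI app for one path and return the body."""
    environ = {"REQUEST_METHOD": "GET", "SCRIPT_NAME": "", "PATH_INFO": path}
    return b"".join(app(environ, lambda status, headers, exc_info=None: None))


dispatcher = PrefixDispatcher(make_echo_app("main"), {
    "/app2": make_echo_app("app2"),
    "/app3": make_echo_app("app3"),
})

mounted = call(dispatcher, "/app2/users/7")
fallback = call(dispatcher, "/other")
print(mounted.decode(), fallback.decode())
```

The mounted application sees only the part of the path below its prefix, so it can be developed as if it were served from the root.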

That covers the basic principles of WSGI. In the next article I will introduce my understanding of the Flask framework.
