What is the basic application of HTTP?

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces the basic application of HTTP in detail. Interested readers can use it for reference; I hope it is helpful.

Introduction to the HTTP protocol

HTTP (HyperText Transfer Protocol) is one of the most widely used network protocols on the Internet and is mainly used for Web services. Hypertext is text processed by a computer, and its format is HTML (HyperText Markup Language).

Versions of the HTTP protocol

HTTP 0.9: transfers only HTML documents to users

HTTP 1.0

The MIME (Multipurpose Internet Mail Extensions) mechanism is introduced. With it, HTTP can send multimedia content (such as video and audio), so HTTP is no longer limited to HTML and can carry other formats as well.

The keep-alive mechanism is introduced to support persistent connections (in HTTP 1.0 this works by adding a header field to the message; it is not native to the protocol).

Caching support is introduced

HTTP 1.1

Support for more request methods, finer-grained cache control, and native support for persistent connections.

HTTP 2.0

Provides an optimized transport for HTTP semantics

SPDY: a technology introduced by Google to speed up HTTP data exchange, notably by using SSL for acceleration; SPDY is not widely used.

At present, the commonly used versions are HTTP 1.0 and HTTP 1.1.

Introduction to HTML text

HTML document structure

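A minimal HTML document has a title (TITLE in this example), a level-one heading (H1), and a hyperlink (shown as ToGoogle).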

How HTML documents are generated

Static

Edited and defined in advance

Dynamic

A program written in a programming language is executed, and its result is output in HTML format

Dynamic languages include PHP, JSP, ASP, and .NET

Note: these scripts need a corresponding interpreter; for example, PHP needs a PHP interpreter.

How static and dynamic requests are handled

Static

1. The Web server registers a listening socket with the kernel

2. The client sends a request to the Web server through the browser

3. The Web server receives the request from the client

4. If the requested resource is local to the server, the HTTP service makes a system call to the kernel

5. The kernel reads the data from the local disk and passes it to the HTTP service

6. The HTTP service puts the requested resource into a response message and sends it to the client
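The six static steps can be sketched with plain sockets. This is a minimal illustration under stated assumptions, not how a production server is written: the function names `open_listener`, `serve_one_request` and `fetch`, the loopback address, and the document root passed in by the caller are all hypothetical, and the sketch handles one small GET request with no path sanitizing.

```python
import os
import socket


def open_listener(port: int = 0) -> socket.socket:
    """Step 1: register a listening socket with the kernel (port 0 = any free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", port))
    listener.listen(5)
    return listener


def serve_one_request(listener: socket.socket, docroot: str) -> None:
    """Steps 3-6 for a single request. No path sanitizing: illustration only."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(4096).decode("ascii", errors="replace")
        # The request line looks like: GET /index.html HTTP/1.1
        _method, target, _version = request.split("\r\n")[0].split(" ", 2)
        path = os.path.join(docroot, target.lstrip("/"))
        if os.path.isfile(path):
            with open(path, "rb") as f:   # steps 4-5: system calls read the disk
                body = f.read()
            status = "200 OK"
        else:
            body = b"not found"
            status = "404 Not Found"
        head = f"HTTP/1.1 {status}\r\nContent-Length: {len(body)}\r\n\r\n"
        conn.sendall(head.encode("ascii") + body)   # step 6: send the response


def fetch(port: int, target: str) -> bytes:
    """Step 2, playing the browser: send one GET and read until the server closes."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as s:
        s.sendall(f"GET {target} HTTP/1.1\r\nHost: localhost\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            chunk = s.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

To try it, run `serve_one_request` in a background thread against a listener from `open_listener()`, then call `fetch` with the listener's port.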

Dynamic

Dynamic requests differ from static ones: when the user requests dynamic content, the HTTP service calls the back-end interpreter, and the dynamic-language program processes the request. If the program needs data, it asks the kernel to read the specified data from disk. The interpreter runs the program, and the result is usually a document in HTML format. The response message is then built and sent back to the client.

HTTP protocol

HTTP protocol messages

An HTTP message consists of multiple lines, generally ASCII strings, and its fields have no fixed length. HTTP messages are of two types: request messages and response messages.

Request message

Client → server

The client sends a request to the server; different URLs are used to request different resources (such as HTML documents)

Response message

Server → client

The server's response to the client's request

Format of the request message

Request line + request headers + blank line + request entity

The request method

The requested resource, identified by a URL. It can be a relative path, such as /images/log.jpg, or an absolute URL, such as http://www.magedu.com/images.banner.jpg

The protocol version of the request, in the format HTTP/major.minor, for example HTTP/1.0 or HTTP/1.1

Request headers: there may be more than one, carrying various kinds of header information

Request entity: the actual content of the request

Request line

It consists of the request method, the request URL, and the HTTP protocol version, separated by spaces. It identifies the request method, the requested resource, and the protocol version.

Request header

Each header consists of a name and a value separated by ":", in the format Name:Value. Request headers let the client give the server additional information about the request, and there can be more than one.

Blank line

The request headers are followed by a blank line, that is, a carriage return and a newline, which tells the server that no more request headers follow.

Request entity

The data carried by the request, if any.

For example:

# this must be a blank line
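The four parts of a request message (request line, headers, blank line, entity) can be assembled and taken apart again in a few lines of Python. This is a minimal sketch: `build_request` and `parse_request` are hypothetical helper names, the host `www.example.com` is an assumption used only for illustration, and the parser ignores corner cases such as repeated header names.

```python
def build_request(method, target, headers, body=b""):
    """Assemble request line + headers + blank line + entity."""
    lines = [f"{method} {target} HTTP/1.1"]                        # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                         # blank line ends the headers
    return head.encode("ascii") + body


def parse_request(raw):
    """Split a raw request back into its parts."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers, body


raw = build_request(
    "GET",
    "/images/log.jpg",
    {"Host": "www.example.com", "Connection": "keep-alive"},  # illustrative values
)
```

Printing `raw.decode()` shows the message as it would travel on the wire, with the empty line marking the end of the headers.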

Format of the response message

Start line + response headers + blank line + response entity

The protocol version: the server responds using the version the client requested

The status code of the response, for example 202 or 403

The reason phrase: human-readable text that explains the meaning of the status code

Response headers, of which there may be many

Response body

Start line

Also known as the status line. It consists of the version number, status code, and reason phrase, for example "HTTP/1.1 200 OK"

Response header

As in a request message, the start line is usually followed by several header fields. Each header field has a name and a value separated by a colon, in the format Name:Value.

For example:

Content-Type: text/html; charset=utf-8

Content-Length: 78

Blank line

The response headers are followed by a blank line, that is, a carriage return and a newline, which tells the client that no more headers follow.

Response entity

The response entity is loaded with data to be returned to the client. The data can be text or binary (for example, pictures, videos)

For example:

# this must be a blank line
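A response message can be split into the same kinds of parts. The sketch below is only an illustration: `parse_response` is a hypothetical helper, the sample message is invented for the example, and header names are lowercased because HTTP header names are case-insensitive.

```python
def parse_response(raw: bytes):
    """Split a raw response into status line fields, headers, and entity."""
    head, _, body = raw.partition(b"\r\n\r\n")          # the blank line separates head and body
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)   # e.g. HTTP/1.1 200 OK
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# Invented sample message; the body is 13 bytes long.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<h1>Home</h1>"
)
version, code, reason, headers, body = parse_response(sample)
```

Note that Content-Length counts the bytes of the entity only, not the status line or the headers.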

HTTP request methods

Every HTTP request message contains an HTTP request method, which tells the server what operation the client wants performed. Here are several commonly used HTTP request methods.

Description of HTTP request methods

GET: the client requests the specified resource, and the server returns the resource entity

HEAD: similar to GET, but the server returns only the response headers and not the resource itself (the client only wants to know about the resource, not receive it)

POST: submits data to the server, typically from an HTML form; the server usually stores this data, often in a relational database such as MySQL

PUT: the opposite of GET; it sends a resource to the server, which usually stores it (typically in the file system)

DELETE: asks the server to delete the resource specified by the URL

MOVE: asks the server to move the specified page to another network address

OPTIONS: queries the request methods the server supports for the requested URL

TRACE: traces the proxy servers, firewalls, gateways, and so on that a request passes through

The most commonly used HTTP request methods are GET, POST, and HEAD
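How a server might treat the different methods can be sketched against an in-memory dictionary instead of a real file system. Everything here is an assumption made for illustration: the `handle` function, the `store` dictionary that maps paths to bytes, and the choice of returning 204 for a successful DELETE.

```python
ALLOWED = "GET, HEAD, PUT, DELETE, OPTIONS"


def handle(store, method, path, body=b""):
    """Return (status_code, headers, body) for one request against `store`."""
    if method == "OPTIONS":
        return 200, {"Allow": ALLOWED}, b""          # report the supported methods
    if method == "PUT":
        created = path not in store
        store[path] = body                           # store the uploaded resource
        return (201 if created else 200), {}, b""
    if method not in ("GET", "HEAD", "DELETE"):
        return 405, {"Allow": ALLOWED}, b""          # method not supported here
    if path not in store:
        return 404, {}, b""
    if method == "DELETE":
        del store[path]
        return 204, {}, b""
    data = store[path]
    headers = {"Content-Length": str(len(data))}
    # HEAD returns the same headers as GET but no entity.
    return 200, headers, (data if method == "GET" else b"")
```

The only difference between the GET and HEAD branches is the last element of the returned tuple, which mirrors the description above.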

HTTP status codes

Status code classes

1XX: informational status codes, which indicate certain actions expected of the client

2XX: success status codes; the requested resource exists and the request succeeded

3XX: redirection status codes; the server returns a new address rather than the result

4XX: client errors; the requested resource does not exist, or access to it is denied because the client lacks permission

5XX: server errors. For example, the server needs to run a script and call a parsing library to handle the request, and something goes wrong during the call, or the script contains a syntax error.

Common status codes

Status code: description

200 OK: the server successfully returned the page; the standard status code for a successful HTTP request

201 Created: returned after a file is uploaded successfully

301 Moved Permanently: permanent redirect; the response carries the new address and tells the client that the requested address has moved there

302 Found: temporary redirect; the resource is temporarily located elsewhere, given by "Location: new location" in the response message

304 Not Modified: the resource has not been modified

403 Forbidden: the request is denied

404 Not Found: the requested resource does not exist

405 Method Not Allowed: the request method is not allowed or not supported

500 Internal Server Error: internal server error

502 Bad Gateway: the proxy server received an invalid response from the upstream server, so the proxy reports an error

503 Service Unavailable: the service is temporarily unavailable
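The class of a status code follows from its first digit, and Python's standard library already knows the standard reason phrases. The sketch below uses `http.HTTPStatus` from the standard library; the `describe` helper and the wording of the class names are assumptions for illustration.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its class."""
    status_class = CLASSES[code // 100]       # the first digit selects the class
    try:
        phrase = HTTPStatus(code).phrase      # e.g. "Not Found"
    except ValueError:                        # code not defined in HTTPStatus
        phrase = "unknown"
    return f"{code} {phrase} ({status_class})"
```

For example, `describe(404)` gives "404 Not Found (client error)".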

Introduction to HTTP headers

General headers

Request headers

Response headers

Entity headers: describe the type, length, encoding, and so on of the resource in the entity

Extension headers: non-standard headers, which programmers can create

General headers

Connection: defines options for the connection between client and server (C/S)

In HTTP 1.0, a persistent connection is requested with

Connection:keep-alive

Cache-Control: enables finer-grained cache control; it is more common in HTTP 1.1

Request headers

Client-IP: the client's IP address

Host: the requested host, which is needed for name-based virtual hosts

Referer: the URL of the page from which the current resource was requested; it can be used to implement hotlink protection

User-Agent: the user agent, generally a browser

Accept headers: what the client is able to accept

Accept: the media types the server may send

Accept-Charset: the acceptable character sets

Accept-Encoding: the acceptable content encodings

Accept-Language: the acceptable languages

Conditional request headers (only used in HTTP 1.1):

The request carries a condition for the server to check, and the request is fulfilled only if the condition is met.

Security-related request headers:

Authorization

Response headers

Age: the estimated time, in seconds, since the response was generated at the origin server (that is, how long it has been held in a cache)

Server: tells the client the name and version of the server software

Negotiation headers:

Vary: a list of request headers that the server uses to select the most suitable version of the resource to send to the client

Security-related:

WWW-Authenticate

Set-Cookie

Entity headers

Location: the new location of the resource, usually used with a 302 response

Allow: the request methods allowed for this resource

Content headers

Content-Encoding

Content-Language

Content-Length

Content-Location: where the content is located

Content-Type

Cache-related:

ETag: entity tag

Expires: expiration time

Last-Modified: last modification time
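The conditional request headers and the ETag cache header work together: the client sends back the tag it already holds, and the server answers 304 Not Modified when the tag still matches. The sketch below is a simplification under stated assumptions: `make_etag` and `respond` are hypothetical helpers, the tag is derived from a SHA-256 hash purely for illustration, and only a single exact tag in If-None-Match is handled (real requests may list several tags or use weak validators).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the content (illustrative scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, entity) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no entity.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


first = respond({}, b"page v1")
second = respond({"If-None-Match": first[1]["ETag"]}, b"page v1")
third = respond({"If-None-Match": first[1]["ETag"]}, b"page v2")
```

Here `second` is a 304 with an empty entity, while `third` is a full 200 because the content, and therefore the tag, has changed.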

HTTP transactions

An HTTP request together with its response is called an HTTP transaction; in other words, a transaction is one complete request/response exchange.

By default, HTTP opens a new connection for each transaction and closes it afterwards, which costs time and bandwidth. Because of TCP slow start, every new connection begins at reduced throughput, and the number of parallel connections that can be opened is limited. Persistent connections therefore perform better than the default: they save the time spent setting up and tearing down TCP connections.
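Connection reuse can be observed with the standard library alone. In this sketch a small HTTP/1.1 server answers two requests sent over one `http.client.HTTPConnection`, and the client checks that the underlying socket object did not change between them. The handler class, the helper names, and the loopback address are assumptions for illustration.

```python
import http.client
import http.server
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 keeps the connection open by default

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # required for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_server():
    """Start the demo server on a free loopback port in a background thread."""
    server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def two_requests_one_connection(port):
    """Send two GETs and report whether they shared one TCP connection."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/a")
    first = conn.getresponse().read()
    sock_after_first = conn.sock        # None if the server closed the connection
    conn.request("GET", "/b")
    second = conn.getresponse().read()
    reused = sock_after_first is not None and conn.sock is sock_after_first
    conn.close()
    return first, second, reused
```

With the server from `start_server()`, calling `two_requests_one_connection(server.server_address[1])` should report that the socket was reused, so only one TCP handshake was paid for both transactions.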

HTTP resources

A resource is any content that a user can request from the server over HTTP through a browser or other user agent, such as an HTML document or an image.

Resource type: identified by a MIME type

Format: major/minor (primary type and subtype)

Commonly used MIME types

MIME type: file type

text/html: html, htm (text type)

text/plain: text (text type)

image/jpeg: jpeg (image type)

image/gif: gif (image type)

video/mpeg4: video type

application/vnd.ms-powerpoint: marking for dynamic resources
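Python's standard `mimetypes` module maps file names to major/minor types in the same way a server chooses a Content-Type. A minimal sketch; the helper names `content_type_for` and `split_mime` and the fallback to `application/octet-stream` for unknown extensions are choices made for this example.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess the MIME type from the file extension."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or "application/octet-stream"   # generic fallback for unknown types


def split_mime(mime: str):
    """Split 'major/minor' into its two parts."""
    major, _, minor = mime.partition("/")
    return major, minor
```

For instance, `split_mime(content_type_for("index.html"))` yields the major type `text` and the minor type `html`.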

URI and URL

URI (Uniform Resource Identifier)

A string that identifies the name of an Internet resource and allows users to interact with the resource through a specific protocol. Every resource available on the Web, including HTML documents, images, video clips, and programs, is identified by a URI.

URL (Uniform Resource Locator)

Describes the specific location of a resource on a particular server.

For example: http://www.magedu.com:80/download/bash-4.3.1-1.rpm

A URL is divided into three parts

Scheme (also known as the protocol): http://

Internet address, which generally refers to the server: www.magedu.com:80

Resource on the specific server: /download/bash-4.3.1-1.rpm
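The three parts can be extracted with `urllib.parse.urlsplit` from the Python standard library. The `split_url` wrapper is a hypothetical helper for this example; the URL is the one used in the text.

```python
from urllib.parse import urlsplit


def split_url(url: str):
    """Return (scheme, host, port, path) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.port, parts.path


scheme, host, port, path = split_url(
    "http://www.magedu.com:80/download/bash-4.3.1-1.rpm"
)
```

When a URL carries no explicit port, `parts.port` is `None` and the client falls back to the scheme's default, which is 80 for http.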

CGI

Common Gateway Interface

When the Web server finds that it needs to execute a script, it talks to the back-end application through the CGI protocol: it passes the user's request to the application, and the application's result is returned to the HTTP server through CGI as well.
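Under CGI the server hands request details to the script through environment variables such as `QUERY_STRING`, and the script writes headers, a blank line, and the body to standard output. The sketch below models that exchange as a plain function so it can be run without a Web server; the function name `cgi_script`, the `name` query parameter, and the greeting are assumptions for illustration.

```python
import html
from urllib.parse import parse_qs


def cgi_script(environ: dict) -> str:
    """Build what a CGI script would print for the given environment."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"<h1>Hello, {html.escape(name)}</h1>"   # escape user input before echoing it
    # CGI output: headers, then a blank line, then the body.
    return "Content-Type: text/html\r\n\r\n" + body


output = cgi_script({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ann"})
```

A real CGI script would read `os.environ` and print this text; the Web server then turns it into the HTTP response.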

Other things worth knowing

The process of a Web resource request

The user enters the address to visit in the Web browser

The Web browser asks the DNS server to resolve the domain name to the IP address of the Web server

The client establishes a connection with the Web server (TCP three-way handshake)

After the TCP connection is established, the client sends an HTTP request

The server receives the client's HTTP request and processes it

The server processes the resource specified by the client

The server builds a response message and sends it to the client

The server records the request in its log

How an HTTP server handles multiple user requests concurrently

By default an HTTP server works in a blocking model: it accepts one request at a time and accepts the next only after the current one has been processed, so requests are handled one by one.

To respond to user requests concurrently, we need a multi-process model. The Web server creates multiple child processes to respond to user requests: when a request arrives, the main process does not respond to it directly but creates a child process to do so. Once the child process has established a connection with the user, the main process goes back to waiting for the next request; when the second request arrives, another child process is created to respond to it, and so on. Each user request is therefore handled by its own child process.

Connection socket

Client IP, cport ↔ server IP, sport

The main process creates N child processes, and each child responds to its client through a connection socket, which is identified by the four values above. A connection socket does not use a separate temporary port on the server: it shares port 80 with the listening socket, and the kernel tells connections apart by the client's IP address and port. Responses therefore still leave the server from port 80.

Listening socket: only the main process listens on it, that is, on port 80.

The Web server's I/O structure:

Single-process model: responds to only one request at a time

Multi-process model: each process responds to one user request, which achieves concurrency

Multiplexed I/O mechanism: one process creates multiple threads, and each thread responds to one user request

Multiplexed I/O mechanism: multiple threads are started, and each thread responds to multiple requests

In the last two models the unit that handles requests is a thread, not a process

Process reuse (multi-process model)

When the Web server responds to a user request, it creates a child process, and after the request is completed it normally destroys that child process. Constantly creating and destroying child processes consumes system resources. To solve this, we can create a process pool that holds idle child processes. When a request arrives, an idle child process is taken from the pool to respond to it, and when the request is completed the child process is returned to the pool. This avoids the cost of repeatedly creating and destroying child processes.

How big should the process pool be? That depends on the resources of the server and the needs of its users. A process pool also caps the number of child processes in use, so a sudden burst of requests cannot bring the server down. When every child process in the pool is busy, further requests have to wait in a queue. The process pool therefore also limits the number of concurrent requests.
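The pool idea can be sketched with `concurrent.futures` from the Python standard library. Note the substitution: this example uses a pool of threads as a stand-in for the pool of child processes described above, because it is easier to run anywhere, but the behaviour being shown is the same: workers are reused, and the pool size caps how many requests run at once while the rest wait in a queue. The class `ConcurrencyProbe`, the function `serve_all`, and the 10 ms of simulated work are assumptions for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyProbe:
    """Handles fake requests and records the peak number running at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def handle(self, request_id):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)                      # stand-in for real request processing
        with self._lock:
            self.active -= 1
        return f"response {request_id}"


def serve_all(request_ids, pool_size):
    """Process every request with at most `pool_size` workers."""
    probe = ConcurrencyProbe()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Requests beyond the pool size queue up until a worker is free.
        responses = list(pool.map(probe.handle, request_ids))
    return responses, probe.peak
```

Calling `serve_all(range(6), pool_size=2)` answers all six requests, yet the recorded peak never exceeds two, which is the concurrency limit the pool provides.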

Concepts and calculation of website traffic metrics and concurrency

IP (Internet Protocol) here refers to the number of distinct IP addresses, an important indicator of website traffic. Each distinct IP address that visits the site is recorded, and the total is the metric. In general, visits from the same IP address are counted only once per day.

However, counting distinct IP addresses has a drawback: with ADSL (where a user's address can change) and NAT (where many users share one address), the number of IPs does not match the actual number of visitors.

PV (Page View): the number of page views, the main indicator for an online news channel, a website, or even a single news item, and one of the most commonly used measures of website traffic. Whatever the client or IP address, every request made to the server from a browser (a page view or click) is answered by the server and counted in PV.

PV therefore reflects the number of pages viewed. Its drawback is that page refreshes are also counted, so PV is not the number of visitors: one visitor can generate many PVs.

UV (Unique Visitor): the number of independent visitors. Visits from the same client are treated as one visitor, and the same client visiting the same website within a day is counted as one UV.

UV also has a drawback: in a school, for example, one client computer may be used by several people, which distorts the count.

Concurrent connections

The number of connections that the Web server can handle per unit of time

Calculation of IP, PV, UV, and concurrency

Calculation of IP

1. Analyze the website's access log and remove duplicate IP addresses

2. Use third-party statistics tools

3. Add a statistics field via program code in the web page, then use a log analysis tool to count that field

Calculation of PV

1. Analyze the website's access log and count requests for pages such as HTML and dynamic-language pages

2. Use third-party statistics tools

3. Add a statistics field via program code in the web page, then use a log analysis tool to count that field

Calculation of UV

1. Analyze the client's HTTP request messages and record information that is unique to the client. Requests that share the same characteristics are treated as coming from the same client and counted as one UV.

2. Through cookies

When a client visits the website, the server sends it a unique cookie. When the client visits the site again, it sends this cookie back, so the server recognizes the same client and counts the UV only once.

Cons: the cookie method is more accurate than analyzing the client's HTTP request headers, but users may disable cookies or delete them automatically, so the resulting figures are not completely accurate.
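The log-based calculations can be sketched as a single pass over one day's access records. The record layout used here, a tuple of (IP address, visitor cookie, path), and the sample data are hypothetical; a real access log would first have to be parsed into this shape.

```python
def traffic_stats(records):
    """records: iterable of (ip, visitor_cookie, path) tuples for one day."""
    ips = set()
    visitors = set()
    page_views = 0
    for ip, cookie, _path in records:
        page_views += 1          # every page request counts toward PV
        ips.add(ip)              # duplicates collapse: one IP counts once per day
        visitors.add(cookie)     # one cookie is treated as one unique visitor
    return {"IP": len(ips), "PV": page_views, "UV": len(visitors)}


# Hypothetical sample: two visitors (c1, c2) share one address behind NAT.
sample_day = [
    ("10.0.0.1", "c1", "/"),
    ("10.0.0.1", "c1", "/news"),
    ("10.0.0.1", "c2", "/"),
    ("10.0.0.2", "c3", "/"),
]
stats = traffic_stats(sample_day)
```

In this sample the IP count is lower than the UV count because of the shared address, and PV is higher than both because one visitor viewed two pages.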

Calculation of concurrency

Requests per second (throughput) + concurrent browsing connections + average user think time = total number of concurrent users of the site

That concludes this overview of the basic application of HTTP. I hope the content above is of some help.
