2025-04-02 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces the basic use of HTTP in some detail. I hope it serves as a useful reference.
Introduction to the HTTP protocol
HTTP (Hypertext Transfer Protocol) is one of the most widely used network protocols on the Internet and is mainly used for Web services. The documents it carries are typically written in HTML (Hypertext Markup Language).
Versions of the HTTP protocol
HTTP/0.9: transfers HTML documents only
HTTP/1.0
Introduces the MIME (Multipurpose Internet Mail Extensions) mechanism. With MIME, HTTP can carry multimedia content such as audio and video, so it is no longer limited to HTML and can send other formats as well.
Introduces the keep-alive mechanism for persistent connections. This is not native to the protocol: it works by adding a field to the message headers.
Introduces caching support
HTTP/1.1
Supports more request methods and finer-grained cache control, and supports persistent connections natively.
HTTP/2
Provides an optimized transport for HTTP semantics
SPDY: a technology introduced by Google to speed up the exchange of HTTP data, in particular over SSL. It became the starting point for HTTP/2 and has since been deprecated in favor of it.
HTTP/1.1 is still very widely used; the rest of this article concentrates on HTTP/1.0 and HTTP/1.1.
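HTML document introduction
HTML document structure
<html>
<head><title>TITLE</title></head>
<body>
<h1>H1</h1>
<h2>H2</h2>
<a href="...">ToGoogle</a>
</body>
</html>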
How HTML documents are generated
Static
Written and saved in advance
Dynamic
Produced as HTML output by running a program written in a programming language
Common dynamic technologies: PHP, JSP, ASP, .NET
Note: each of these needs a matching interpreter or runtime on the server; for example, PHP scripts need the PHP interpreter.
How static and dynamic requests are handled
Static
1. The Web server registers a listening socket with the kernel
2. The client sends a request to the Web server through the browser
3. The Web server receives the client's request
4. If the requested resource is stored locally on the server, the HTTP service asks the kernel for it through a system call
5. The kernel reads the data from the local disk and hands it to the HTTP service
6. The HTTP service puts the requested resource into a response message and sends it back to the client
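Steps 3 to 6 can be sketched in a few lines of Python. This is only an illustration, not how any particular Web server is written: the function name `handle_static_request` and the `doc_root` parameter are made up for the example, the socket handling of steps 1 and 2 is left out, and there is no protection against paths that escape the document root.

```python
import os
import tempfile


def handle_static_request(raw_request: bytes, doc_root: str) -> bytes:
    """Build the response for a request that targets a static file."""
    # Step 3: read the request line, e.g. "GET /index.html HTTP/1.1".
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = request_line.split(" ")

    # Step 4: map the URL path onto a file below the document root.
    file_path = os.path.join(doc_root, target.lstrip("/"))

    if method != "GET":
        status, body = "405 Method Not Allowed", b""
    elif not os.path.isfile(file_path):
        status, body = "404 Not Found", b""
    else:
        # Step 5: open()/read() are the system calls that fetch the data from disk.
        with open(file_path, "rb") as f:
            body = f.read()
        status = "200 OK"

    # Step 6: wrap the data in a response message.
    head = f"HTTP/1.1 {status}\r\nContent-Length: {len(body)}\r\n\r\n"
    return head.encode("ascii") + body


# Usage: serve one file from a temporary document root.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "index.html"), "wb") as f:
        f.write(b"<h1>hi</h1>")
    found = handle_static_request(b"GET /index.html HTTP/1.1\r\nHost: x\r\n\r\n", root)
    missing = handle_static_request(b"GET /nope.html HTTP/1.1\r\n\r\n", root)
    print(found.decode("ascii"))
```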
Dynamic
Unlike the static case, when the user requests dynamic content the HTTP service calls a back-end interpreter, and the program written in the dynamic language processes the request. If the program needs data, it asks the kernel to read the data from disk. The interpreter runs the program, and the result is usually a document in HTML format. The server then builds the response message and sends it back to the client.
HTTP protocol
HTTP protocol messages
An HTTP message consists of several lines of ASCII text, and its fields are of variable length. There are two kinds of HTTP message: request messages and response messages.
Request message
Client → server
The client sends a request to the server; different URLs are used to request different resources (for example, HTML documents)
Response message
Server → client
The server's answer to the client's request
Format of a request message
Request line + request headers + blank line + request entity
The request method: what kind of request this is
The requested resource, given as a URL. It can be a relative path, such as /images/log.jpg, or an absolute URL, such as http://www.magedu.com/images.banner.jpg
The protocol version, in the form HTTP/major.minor, for example HTTP/1.0 or HTTP/1.1
Headers: there may be more than one, carrying the various header fields the request can use
Request entity: the content sent along with the request
Request line
Made up of the request method, the request URL, and the HTTP protocol version, separated by single spaces. It identifies the method, the requested resource, and the protocol version.
Request headers
Each header is a name and a value separated by ":", in the form Name: Value. Request headers give the server additional information about the request, and there can be more than one.
Blank line
A blank line follows the last request header. By sending a carriage return and a line feed on their own, the client tells the server that no more headers follow.
Request entity
The data the client sends with the request, if any
For example:
# this must be a blank line
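The four parts described above (request line, headers, blank line, entity) can be put together and taken apart again as in the following sketch. The helper names `build_request` and `parse_request` are invented for this illustration, and the host and path are simply reused from the article's examples.

```python
def build_request(method, target, headers, body=b""):
    """Serialize: request line + headers + blank line + entity."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The final "\r\n\r\n" ends the last header line and adds the blank line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_request(raw):
    """Split a raw request back into method, target, version, headers, entity."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers, body


raw = build_request("GET", "/images/log.jpg", {"Host": "www.magedu.com"})
print(raw.decode("ascii"))
```

A GET request normally has no entity, so everything after the blank line is empty here.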
Format of a response message
Start line + response headers + blank line + response entity
The protocol version the server uses in its answer, matching what the client requested
The status code of the response: 202, 403, and so on
The reason phrase: a human-readable description of what the status code means
Any number of response headers
The response body
Start line
Also called the status line; it is how the server answers the client's request. It consists of the version number, the status code, and the reason phrase, for example "HTTP/1.1 200 OK"
Response headers
As in a request message, the start line is usually followed by several header fields. Each header field has a name and a value separated by a colon, in the form Name: Value.
For example:
Content-Type: text/html; charset=utf-8
Content-Length: 78
Blank line
A blank line follows the last response header. By sending a carriage return and a line feed on their own, the server tells the client that no more headers follow.
Response entity
The response entity carries the data returned to the client. The data can be text or binary (for example, pictures or video)
For example:
# this must be a blank line
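A response can be split into the parts described above in much the same way. The sketch below is illustrative only: `parse_response` is a made-up helper, the sample message is invented, and real clients also have to deal with details such as chunked transfer encoding that are ignored here.

```python
def parse_response(raw: bytes):
    """Split a raw response into version, status code, reason, headers, entity."""
    head, _, body = raw.partition(b"\r\n\r\n")      # the blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 11\r\n"
    b"\r\n"
    b"<h1>hi</h1>"
)
version, code, reason, headers, body = parse_response(sample)
print(version, code, reason, headers, body)
```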
HTTP request methods
Every HTTP request message carries a request method, which tells the server what operation the client wants it to perform. The most common request methods are listed below.
Method: description
GET: the client requests the specified resource, and the server returns it
HEAD: like GET, but the server returns only the response headers, without the resource itself (useful for checking whether a resource exists without transferring it)
POST: submits data to the server, typically from an HTML form; the server usually stores the data, often in a relational database such as MySQL
PUT: the opposite of GET; sends a resource to the server, which usually stores it (typically in the file system)
DELETE: asks the server to delete the resource identified by the URL
MOVE: asks the server to move the specified page to another address (a WebDAV extension rather than part of core HTTP)
OPTIONS: asks which request methods the server supports for the requested URL
TRACE: traces the proxy servers, firewalls, gateways, and so on that a request passes through on its way to the server
The most commonly used request methods are GET, POST, and HEAD
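To make the differences between a few of these methods concrete, here is a toy dispatcher over an in-memory dictionary. Everything in it (the `respond` function, the `RESOURCES` store, the choice of status codes for PUT and DELETE) is an assumption made for the illustration and not the behavior of any specific server.

```python
RESOURCES = {"/index.html": b"<h1>home</h1>"}   # hypothetical in-memory store
SUPPORTED = "GET, HEAD, PUT, DELETE, OPTIONS"


def respond(method, path, body=b"", store=RESOURCES):
    """Return (status code, headers, body) for a handful of request methods."""
    if method == "OPTIONS":
        return 200, {"Allow": SUPPORTED}, b""
    if method == "PUT":
        created = path not in store
        store[path] = body                       # the server stores what it was sent
        return (201 if created else 200), {}, b""
    if path not in store:
        return 404, {}, b""
    if method == "DELETE":
        del store[path]
        return 200, {}, b""
    if method in ("GET", "HEAD"):
        data = store[path]
        headers = {"Content-Length": str(len(data))}
        # HEAD sends the same headers as GET but leaves the body out.
        return 200, headers, (data if method == "GET" else b"")
    return 405, {"Allow": SUPPORTED}, b""


print(respond("GET", "/index.html"))
print(respond("HEAD", "/index.html"))
```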
HTTP status codes
Status code class: description
1XX: informational; an interim response that tells the client how to proceed
2XX: success; the requested resource exists and the request succeeded
3XX: redirection; the response may return a new address instead of the result itself
4XX: client error; the requested resource does not exist, or access to it is denied because the client lacks permission
5XX: server error. For example, the server has to run a script through an interpreter to handle the request and something goes wrong during that call, or the script itself contains a syntax error.
Description of common status codes
Status code: description
200 OK: the server returned the page successfully; the standard status code for a successful HTTP request
201 Created: returned after a resource has been created, for example after a successful file upload
301 Moved Permanently: permanent redirect; the response returns a new address and says that the requested address has moved there
302 Found: temporary redirect; the resource is temporarily somewhere else, given by "Location: new location" in the response message
304 Not Modified: the resource has not been modified
403 Forbidden: the request was denied
404 Not Found: the requested resource does not exist
405 Method Not Allowed: the request method is not allowed or not supported
500 Internal Server Error: an error inside the server
502 Bad Gateway: a proxy server received an invalid response from the upstream server; the upstream server returned something the proxy could not understand, so the proxy reports an error
503 Service Unavailable: the service is temporarily unavailable
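Python's standard library already knows these codes, so the table can be checked with a short script. The `CLASSES` mapping and the `describe` function are names chosen for this example; `http.HTTPStatus` is the standard-library enum.

```python
from http import HTTPStatus

# The first digit of a status code selects its class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return e.g. '404 Not Found (client error)'."""
    status = HTTPStatus(code)        # raises ValueError for unknown codes
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


for code in (200, 201, 301, 302, 304, 403, 404, 405, 500, 502, 503):
    print(describe(code))
```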
Introduction to HTTP headers
General headers
Request headers
Response headers
Entity headers: describe the entity itself, such as the type, length, and encoding of the resource it carries
Extension headers: non-standard headers that programmers can define themselves
General headers
Connection: options for the connection between client and server (C/S)
In HTTP/1.0, a persistent connection is requested with
Connection: keep-alive
Cache-Control: finer-grained cache control, mostly seen in HTTP/1.1
Request headers
Client-IP: the client's IP address
Host: the requested host; needed for name-based virtual hosts
Referer: the URL of the page from which the current resource was requested. It can be used to implement hotlink protection.
User-Agent: the user agent, generally a browser
Accept headers: what the client is able to accept
Accept: the media types the client accepts, that is, what the server may send
Accept-Charset: the character sets it accepts
Accept-Encoding: the content encodings it accepts
Accept-Language: the languages it accepts
Conditional request headers (mostly introduced in HTTP/1.1)
The client attaches a condition to the request. The server carries out the request normally only if the condition is met; otherwise it answers without sending the resource.
Security-related request headers:
Authorization
Cookie
Response headers
Age: how long, in seconds, the response has been sitting in a cache
Server: tells the client the name and version of the server software
Negotiation headers:
Vary: a list of request headers that the server used to choose the most suitable version to send to the client
Security-related:
WWW-Authenticate
Set-Cookie
Entity headers
Location: the new location of the resource, usually sent with a redirect such as a 302 response
Allow: the request methods allowed for this resource
Content headers
Content-Encoding
Content-Language
Content-Length
Content-Location: where the content is actually located
Content-Type
Cache-related:
ETag: entity tag, an identifier for one version of the resource
Expires: expiration time
Last-Modified: time of the last modification
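The conditional request headers and the cache-related entity headers work as a pair. A minimal sketch, assuming the server derives the ETag from a hash of the content (one possible choice, not a requirement of HTTP): the client sends back the ETag it saw in `If-None-Match`, and the server answers 304 Not Modified with no body if the resource is unchanged. The function names are invented for the example.

```python
import hashlib


def make_etag(content: bytes) -> str:
    # Assumption: a truncated SHA-256 digest serves as the version identifier.
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def conditional_get(request_headers: dict, content: bytes):
    """Return (status code, headers, body), honoring If-None-Match."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send headers only.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(content))}, content


page = b"<h1>home</h1>"
first_status, first_headers, first_body = conditional_get({}, page)
revalidation = conditional_get({"If-None-Match": first_headers["ETag"]}, page)
after_change = conditional_get({"If-None-Match": first_headers["ETag"]}, page + b"!")
print(first_status, revalidation[0], after_change[0])
```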
HTTP transactions
An HTTP request together with the response to it is called an HTTP transaction; in other words, one transaction is one complete request-and-response exchange.
By default (in HTTP/1.0), each transaction opens a new connection and closes it afterwards, which costs both time and bandwidth. Because of TCP slow start, every new connection also begins at reduced throughput, and the number of parallel connections that can be opened is limited. Persistent connections are therefore better than the default: they avoid the time spent repeatedly setting up and tearing down TCP connections.
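A rough back-of-the-envelope model shows the saving. It rests on simplifying assumptions that are mine, not the article's: each TCP handshake costs one round trip, each request and response costs one round trip, requests are sent one after another, and transfer time, slow start, and connection teardown are ignored.

```python
def total_time_ms(requests: int, rtt_ms: int, persistent: bool) -> int:
    """Estimated time to fetch `requests` resources sequentially."""
    # Without persistence every request pays for its own handshake.
    handshakes = 1 if persistent else requests
    return (handshakes + requests) * rtt_ms


without_keepalive = total_time_ms(10, 50, persistent=False)
with_keepalive = total_time_ms(10, 50, persistent=True)
print(without_keepalive, with_keepalive)
```

Under these assumptions, ten requests over a 50 ms round trip take about half as long on one persistent connection as on ten separate ones.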
HTTP resources
A resource is any content that a user can request and obtain from a server over HTTP through a browser or other user agent, such as an HTML document or a picture.
Resource type: tagged with a MIME type
Format: major/minor, a primary tag and a secondary tag
Commonly used MIME types
MIME type: file type
text/html: html, htm (text type)
text/plain: text (text type)
image/jpeg: jpeg (image type)
image/gif: gif (image type)
video/mpeg4: video type
application/vnd.ms-powerpoint: the form used to tag application (dynamic) resources
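A server usually picks the MIME type from the file name. Python's standard `mimetypes` module does this lookup; the wrapper `content_type_for` and the fallback to `application/octet-stream` for unknown files are choices made for this sketch.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess the MIME type from the file extension."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or "application/octet-stream"   # generic binary fallback


for name in ("index.html", "photo.jpeg", "logo.gif"):
    major, minor = content_type_for(name).split("/")  # major/minor tags
    print(name, major, minor)
```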
URI and URL
URI (Uniform Resource Identifier)
A string that identifies an Internet resource by name and lets users interact with the resource through a specific protocol. Every resource available on the Web, including HTML documents, images, video clips, and programs, is identified by a URI, so a URI can be used to name each resource.
URL (Uniform Resource Locator)
Describes the specific location of a resource on a particular server.
For example: http://www.magedu.com:80/download/bash-4.3.1-1.rpm
A URL has three parts
Scheme (also called the protocol): http://
Internet address, which generally refers to the server: www.magedu.com:80
The resource on that server: /download/bash-4.3.1-1.rpm
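The three parts can be pulled out of the example URL with `urllib.parse.urlsplit` from the Python standard library; this is just one convenient way to inspect a URL.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.magedu.com:80/download/bash-4.3.1-1.rpm")

print("scheme:  ", parts.scheme)     # the protocol
print("address: ", parts.netloc)     # host and port together
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("resource:", parts.path)       # location on that server
```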
CGI
Common Gateway Interface
When the Web server finds that it has to execute a script, it talks to the back-end application through CGI: it passes the user's request to the application, and the application's result is returned to the HTTP server through CGI as well.
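In practice the Web server passes the request to a CGI program through environment variables such as `REQUEST_METHOD` and `QUERY_STRING`, and reads the program's standard output, which consists of headers, a blank line, and the body. The sketch below shows only the program's side; the function name `cgi_script` and the `name` query parameter are hypothetical.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_script(environ) -> str:
    """Produce CGI output: header lines, a blank line, then the document."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    # Escape user input before placing it into the generated HTML.
    body = f"<h1>Hello, {html.escape(name)}</h1>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a Web server, the request arrives in the process environment.
    sys.stdout.write(cgi_script(os.environ))
```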
Other things worth knowing
The steps of a Web resource request
The user enters the address to visit in the Web browser
The browser asks a DNS server to resolve the domain name to the IP address of the Web server
The client establishes a connection with the Web server (TCP three-way handshake)
Once the TCP connection is established, the client sends an HTTP request
The server receives the client's HTTP request and processes it
The server fetches or generates the resource the client asked for
The server builds a response message and sends it to the client
The server records the request in its log
How an HTTP server handles multiple user requests concurrently
In the simplest case a Web server works in a blocking model: it accepts one request, processes it, and only then accepts the next, so requests are handled strictly one at a time.
To respond to users concurrently we need a multi-process model, in which the Web server creates child processes to respond to user requests. When a user request arrives, the main process does not respond to it directly but creates a child process to do so. Once the child process has taken over the connection with that user, the main process goes back to waiting for the next request; when a second user's request comes in, another child process is created to respond to it, and so on. Each user request is thus handled by its own child process.
Connected sockets
Client IP, client port ↔ server IP, server port
The main process creates N child processes to respond to user requests; the main process itself only accepts incoming connections. Each accepted connection is represented by its own connected socket, which identifies one client's connection and is handed to a child process. A connected socket does not use a separate temporary port on the server: its server-side port is still 80, and connections are told apart by the client's IP address and port, which is where the temporary port is found. A client that sends its request to port 80 is therefore also answered from port 80.
Listening socket: only the main process listens, on port 80.
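The difference between the listening socket and a connected socket can be observed directly with Python's `socket` module. This sketch runs client and server in one process on the loopback address, and lets the operating system pick a free port instead of 80 so that it needs no special privileges.

```python
import socket

# Listening socket: bound to a port and used only to accept connections.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0: let the OS choose a free port
listener.listen()
server_port = listener.getsockname()[1]

# The client connects; its own port is a temporary one chosen by the OS.
client = socket.create_connection(("127.0.0.1", server_port))

# accept() returns a new connected socket dedicated to this one client.
conn, client_addr = listener.accept()

print("listening socket:", listener.getsockname())
print("connected socket:", conn.getsockname(), "<->", conn.getpeername())

conn_local_port = conn.getsockname()[1]
client_local_port = client.getsockname()[1]

for s in (conn, client, listener):
    s.close()
```

The connected socket reports the same local port as the listening socket; only the client's address and port differ from one connection to the next.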
The Web server's I/O models:
Single-process model: responds to only one request at a time
Multi-process model: each process responds to one user request, which gives concurrency
Multiplexed I/O, one request per thread: a process creates multiple threads, and each thread responds to one user request
Multiplexed I/O, many requests per thread: multiple threads are started, and each thread responds to multiple requests
In the last two models the unit that handles requests is a thread, not a process
Process reuse (multi-process model)
When the Web server needs to respond to a user's request it creates a child process, and once the request is finished it normally destroys that child process. Creating and destroying child processes over and over consumes system resources. To solve this we can create a process pool that holds a number of idle child processes. When a request arrives, an idle child process is taken from the pool to respond to it, and when the request is finished the child process is returned to the pool. This avoids the waste caused by repeatedly creating and destroying child processes.
How big should the process pool be? That depends on the resources of the server and on the demand from its users. A process pool has a further advantage: it sets an upper limit on the number of child processes, so that a sudden flood of requests cannot bring the server down. When every child process in the pool is busy, new requests have to wait in a queue. A process pool therefore also controls the number of requests handled concurrently.
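The pooling idea can be tried out with `concurrent.futures` from the standard library. To keep the sketch portable and short it uses a pool of threads where the article describes a pool of processes; `ProcessPoolExecutor` offers the same interface for real processes. The function `serve_all`, the pool size, and the sleep that stands in for real work are all assumptions of the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def serve_all(requests, pool_size):
    """Handle every request with a fixed-size pool; return responses and peak load."""
    lock = threading.Lock()
    stats = {"active": 0, "peak": 0}

    def handle(request_id):
        with lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.01)                 # stand-in for real request processing
        with lock:
            stats["active"] -= 1
        return f"response to request {request_id}"

    # Workers are reused across requests; extra requests wait in the queue.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        responses = list(pool.map(handle, requests))
    return responses, stats["peak"]


responses, peak = serve_all(range(10), pool_size=3)
print(len(responses), "responses, at most", peak, "handled at once")
```

However many requests arrive, the number being handled at the same moment never exceeds the pool size.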
Website traffic metrics and concurrency: concepts and calculation
IP
IP here means the number of distinct IP addresses, an important measure of website traffic. Each time a client visits the site from a distinct IP address the visit is recorded, and the total is the metric. In general, repeated visits from the same IP address are counted only once per day.
Counting distinct IP addresses has a drawback: with ADSL and NAT, many users can share one address and one user's address can change, so the IP count does not match the real number of visitors.
PV
PV (Page View) is the number of pages viewed. It is the main measure for an online news channel, a website, or even a single news item, and one of the most commonly used indicators of website traffic. Regardless of the client or the IP address, every page request that a browser sends and the server answers (a page view or a click) is counted in PV.
PV therefore reflects the number of pages viewed, but it has a drawback of its own: refreshing a page also counts, so PV is not the real number of visitors, because one visitor can generate many page views.
UV
UV (Unique Visitor) is the number of independent visitors to the website. Visits from the same client are treated as one visitor, and the same client visiting the same website within one day is counted only once.
The drawback of UV is that one client computer may be used by several people, in a school for example, which makes the figure inaccurate.
Concurrent connections
The number of connections the Web server is handling at the same time
Calculating IP, PV, UV, and concurrency
Calculating IP
1. Analyze the website's access log and remove duplicate IP addresses
2. Use third-party statistics tools
3. Add a piece of tracking code to the end of each web page, then use a log analysis tool to count the entries produced by that code
Calculating PV
1. Analyze the website's access log and count the requests for pages, such as HTML pages and pages generated by dynamic languages
2. Use third-party statistics tools
3. Add a piece of tracking code to the end of each web page, then use a log analysis tool to count the entries produced by that code
Calculating UV
1. Analyze the client's HTTP request messages and record the information that is characteristic of each client. Requests that share the same characteristics are treated as coming from the same client and counted as one UV.
2. Use cookies
When a client visits the website, the server sends it a unique cookie. When the client visits again it sends this cookie back, so the server knows it is the same client and counts the UV only once.
Drawback: the cookie method is more accurate than analyzing the client's HTTP request headers, but users may disable cookies or delete them automatically, so the resulting figure is still not completely accurate.
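The three counts can be computed from parsed log records in a few lines. The record layout below (`ip`, `path`, `cookie_id`) is a hypothetical simplification of a real access log, the sample addresses come from ranges reserved for documentation, and all records are assumed to belong to the same day.

```python
from collections import namedtuple

# Hypothetical parsed log record; real access logs carry more fields.
Hit = namedtuple("Hit", "ip path cookie_id")


def traffic_stats(hits):
    """Compute IP, PV and UV for one day's worth of page requests."""
    return {
        "IP": len({h.ip for h in hits}),          # distinct addresses
        "PV": len(hits),                          # every page request counts
        "UV": len({h.cookie_id for h in hits}),   # distinct cookies
    }


hits = [
    Hit("203.0.113.7", "/index.html", "visitor-a"),
    Hit("203.0.113.7", "/index.html", "visitor-a"),   # a refresh adds to PV only
    Hit("203.0.113.7", "/news.html", "visitor-b"),    # same NAT address, another browser
    Hit("198.51.100.4", "/index.html", "visitor-c"),
]
stats = traffic_stats(hits)
print(stats)
```

The sample shows the drawbacks described above: two visitors behind one NAT address count as a single IP, and a page refresh inflates PV.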
Calculating concurrency
Requests per second (throughput) + concurrent browser connections + average user think time = total number of concurrent users of the site
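The line above names the inputs without spelling out how they combine. One standard relation between them, which may not be exactly what the author had in mind, is the interactive response-time law (a form of Little's law): concurrent users = throughput × (response time + think time). The function name and the sample figures below are illustrative only.

```python
def concurrent_users(requests_per_second: float,
                     response_time_s: float,
                     think_time_s: float) -> float:
    """Estimate concurrent users with the interactive response-time law."""
    # Each user cycles through "wait for the response" and "think",
    # issuing one request per cycle.
    return requests_per_second * (response_time_s + think_time_s)


estimate = concurrent_users(100, response_time_s=0.5, think_time_s=9.5)
print(estimate)
```

With 100 requests per second, a half-second response time, and users who pause 9.5 seconds between clicks, the site is serving roughly a thousand users at once.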
That is all on the basic use of HTTP. I hope the content above is helpful and that you can learn something from it. If you found the article useful, feel free to share it so that more people can see it.