Updated 2025-01-20. From: SLTechnology News&Howtos (shulou)
Shulou(Shulou.com)06/02 Report--
12.1 Introduction to the HTTP Protocol
HTTP (HyperText Transfer Protocol) is the most widely used application protocol on the Internet, and all World Wide Web content is exchanged according to this standard. HTTP was originally designed as a way to publish and receive HTML pages. In 1960, the American Ted Nelson conceived a method of processing text information by computer and called it hypertext. This idea became the foundation on which the HTTP standard was later developed.
Hypertext is text that contains hyperlinks, and a hyperlink is a reference that lets the reader jump from one document to another.
HTTP is a stateless protocol:
The server does not, by itself, keep track of which earlier requests came from the same visitor. To solve this problem, cookies and sessions were introduced to track and store user state across requests.
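As a rough illustration of how a cookie carries state over a stateless protocol, the sketch below builds a Set-Cookie header and reads the value back from a Cookie header using Python's standard http.cookies module. The cookie name session_id and the helper names are assumptions made for this example, not part of HTTP itself.

```python
from http.cookies import SimpleCookie


def make_set_cookie(session_id):
    """Build the Set-Cookie header line a server might send (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id      # "session_id" is an assumed cookie name
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["httponly"] = True
    return cookie.output(header="Set-Cookie:")


def read_session_id(cookie_header):
    """Extract the session id from the Cookie header the client sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel else None


if __name__ == "__main__":
    print(make_set_cookie("abc123"))
    print(read_session_id("session_id=abc123; theme=dark"))
```

The server would normally keep the session data itself (for example in memory or a database) and use the id from the cookie only as a lookup key.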
12.2 HTTP Technical Architecture
HTTP is a standard for requests and responses between a client and a server, normally carried over TCP. The client is the end user and the server is the website.
Using a web browser, web crawler, or other tool, the client initiates an HTTP request to a designated port on the server (the default port is 80).
This client is called a user agent (User Agent).
Resources such as HTML files and images are stored on the responding server, which is called the origin server (Origin Server).
There may be several intermediaries between the user agent and the origin server, such as proxies, gateways, or tunnels. Although TCP/IP is the most popular protocol suite on the Internet, HTTP does not require it or the layers it provides. In fact, HTTP can be implemented on top of any other Internet protocol or on any other network. HTTP only assumes that the underlying protocol provides reliable transport, and any protocol that offers such a guarantee can be used.
Typically, the HTTP client initiates a request by establishing a TCP connection to the specified port on the server. The HTTP server listens on that port for requests sent by clients. Once a request is received, the server sends back a status line, such as "HTTP/1.1 200 OK", and a response message whose body may be the requested file, an error message, or other information. HTTP uses TCP rather than UDP because opening a web page requires transferring a lot of data, and TCP provides flow control, in-order delivery, and error detection with retransmission.
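The exchange described above can be sketched with a raw TCP socket. To avoid depending on an external site, the example starts a throwaway server from Python's standard http.server module on the loopback interface; the handler HelloHandler, the response body, and the function names are assumptions made for illustration only.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny origin server used only for this demonstration."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status_line(host, port, path="/"):
    """Open a TCP connection, send a GET request, return the status line."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    raw = b"".join(chunks)
    return raw.split(b"\r\n", 1)[0].decode("ascii")


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_status_line("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The status line returned by this particular test server ends in "200 OK"; a real client would go on to parse the headers and body as well.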
Resources requested through HTTP or HTTPS are identified by Uniform Resource Identifiers (URIs).
12.3 Functions of the HTTP Protocol
HTTP is the transfer protocol used to deliver hypertext from a WWW server to a local browser. It is designed to let browsers retrieve documents efficiently and to keep network traffic low. It ensures that hypertext documents are transferred correctly and quickly, and it also lets the client determine which part of a document is transferred and which content is displayed first (for example, text before graphics).
HTTP is the application-layer communication protocol between client browsers (or other programs) and web servers. Hypertext information is stored on web servers on the Internet, and the client uses HTTP to retrieve the hypertext it wants to access. HTTP defines commands and the information transferred with them, so it can be used not only for web access but also for communication between other Internet and intranet applications, which makes it possible to integrate hypermedia access to many kinds of application resources.
The website address entered in the browser's address bar is called a URL (Uniform Resource Locator). Just as every home has a street address, every web page has an Internet address. When you type a URL into the browser's address bar or click a hyperlink, the URL determines the address to browse. The browser uses HTTP to fetch the page's code from the web server and renders it as a web page.
12.4 Versions of the HTTP Protocol
HTTP has evolved through several versions, most of which are backward compatible. The use of HTTP version numbers is described in RFC 2145. The client tells the server which protocol version it uses at the start of the request, and the server responds using the same or an earlier version.
The main versions of the HTTP protocol are as follows:
HTTP/0.9: the most primitive version, with very limited functionality. Only GET is accepted as a request method, no version number is specified in the communication, and request headers are not supported. Because POST is not supported in this version, the client cannot pass much information to the server.
HTTP/1.0: the first version to specify a version number in the communication. It is still in use today, especially in proxy servers. MIME is supported.
HTTP/1.1: adds caching features, introduces persistent connections (enabled by default), works well with proxy servers, and supports pipelining (sending several requests without waiting for each response) to reduce line load and improve transfer speed.
HTTP/2: significantly improves web performance and reduces network latency; in practice it is mostly used over HTTPS.
The differences between HTTP/1.1 and HTTP/1.0 are mainly in the following areas:
A) cache handling
B) bandwidth optimization and use of network connections
C) management of error notifications
D) delivery of messages over the network
E) maintenance of Internet addresses
F) security and integrity
12.5 Explanation of Terms
HTML: HyperText Markup Language.
URI: Uniform Resource Identifier. A URI is a uniform way of naming or locating a resource within a global scope (including but not limited to the Internet). "Uniform" here refers to the uniform format of the identifier.
URL: Uniform Resource Locator, a subset of URI that describes where an Internet resource is located, using the uniform format protocol://host:port/path/to/file.
Basic URL syntax:
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>
Params: parameters, such as http://www.idfsoft.com/bbs/index.html;gender=f, where gender=f is a parameter.
Query: values passed to the page, often used to drive a database lookup. For example, http://www.idfsoft.com/bbs/item.php?username=tom&title=abc asks for the entry with username=tom and title=abc.
Frag: identifies a location inside a larger page rather than the beginning of the page. In plain terms, it is an anchor.
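To see how these parts are split in practice, here is a small sketch using urllib.parse from the Python standard library. The host and path come from the examples above; the user name, password, port, and fragment in the sample URL are made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def split_url(url):
    """Return the URL components discussed above as a dictionary."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "params": parts.params,          # text after ';' in the last path segment
        "query": parse_qs(parts.query),  # text after '?', decoded into a dict
        "frag": parts.fragment,          # text after '#'
    }


if __name__ == "__main__":
    # user, password, port and fragment below are hypothetical values
    sample = ("http://tom:secret@www.idfsoft.com:8080/bbs/item.php"
              ";gender=f?username=tom&title=abc#top")
    for name, value in split_url(sample).items():
        print(f"{name:8} {value}")
```

Note that the fragment is handled by the client only; browsers do not send it to the server in the request.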
URN: Uniform Resource Name, also a subset of URI.
MIME: Multipurpose Internet Mail Extensions.
MIME can re-encode non-text data as text before transmission. The receiver restores it to the original format by reversing the encoding, and can also use the declared type to pick an appropriate program to open the content.
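The re-encoding idea can be sketched with Base64, one of the encodings MIME defines for carrying binary data as text. The helper names below are chosen for this example; mimetypes.guess_type is the standard-library call that maps a file name to a media type.

```python
import base64
import mimetypes


def to_text(data: bytes) -> str:
    """Encode arbitrary bytes as ASCII text (Base64)."""
    return base64.b64encode(data).decode("ascii")


def from_text(text: str) -> bytes:
    """Reverse the encoding on the receiving side."""
    return base64.b64decode(text)


def guess_media_type(filename: str):
    """Look up the media type a receiver could use to choose a viewer."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type


if __name__ == "__main__":
    payload = bytes([0x00, 0xFF, 0x10, 0x80])   # non-text sample data
    encoded = to_text(payload)
    print(encoded, from_text(encoded) == payload)
    print(guess_media_type("index.html"))
```

In HTTP itself the body is usually sent as raw bytes and only described by a Content-Type header; text re-encoding of this kind matters mainly for e-mail.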
HTTP transaction: one request together with its response is called an HTTP transaction.
Dynamic web pages: pages that contain both static and dynamic content (the dynamic content has to be executed).
What is stored on the server is not an HTML document but a script written in a programming language. After accepting its parameters, the script runs once on the server. When the run completes, it generates an HTML document, and that generated document is sent to the client.
Web resources:
Static files: .jpg, .gif, .html, .txt, .js, .css, .mp3, .avi
Dynamic files: .php, .jsp
PV: Page View, the number of pages that have been opened.
UV: Unique Visitor, the number of distinct visitors, often approximated by the number of distinct IP addresses.
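As a small worked example of the two metrics, the sketch below counts PV and UV from a list of (IP, path) pairs. The record format and the sample addresses are assumptions for this example; real access logs need parsing first, and distinct IPs are only an approximation of distinct visitors.

```python
def pv_uv(visits):
    """Return (page views, unique visitors) for a list of (ip, path) records."""
    page_views = len(visits)                        # every opened page counts
    unique_visitors = len({ip for ip, _path in visits})  # distinct IPs only
    return page_views, unique_visitors


if __name__ == "__main__":
    sample = [
        ("10.0.0.1", "/"),
        ("10.0.0.2", "/"),
        ("10.0.0.1", "/bbs/index.html"),
    ]
    print(pv_uv(sample))
```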
12.6 HTTP Protocol Messages
HTTP uses a request/response model. The client sends a request to the server; the request contains the method, the URL, the protocol version, and a MIME-like message structure carrying request modifiers, client information, and content. The server responds with a status line, which includes the protocol version and a success or error code, followed by server information, entity meta-information, and possibly an entity body.
There are two kinds of HTTP messages: request messages and response messages. Their syntax is as follows:
Request message syntax:
<method> <request-URL> <version>
<headers>

<entity-body>
Response message syntax:
<version> <status> <reason-phrase>
<headers>

<entity-body>
The first line of a message is usually called the "start line". The lines that follow are header fields, each of which consists of a name and a value separated by a colon. A blank line marks the end of the headers.
In addition, a response message usually has a body, which is the content returned to the client.
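The layout of a message (start line, header fields, blank line, body) can be illustrated with a minimal parser. This is a sketch for well-formed messages only; the function name is hypothetical, and it ignores details such as repeated headers and chunked bodies.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body)."""
    head, _sep, body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")   # "Name: Value"
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


if __name__ == "__main__":
    request = (b"GET / HTTP/1.1\r\n"
               b"Host: www.baidu.com\r\n"
               b"Connection: keep-alive\r\n"
               b"\r\n")
    print(parse_message(request))
```

Header names are lowercased here because HTTP header names are case-insensitive.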
Method: the request method, which indicates the action the client wants the server to perform on the resource. The common ones are:
GET: get a resource from the server
HEAD: get only the response headers for a document from the server; the response body is not sent. HEAD is very efficient when we only need to check the status of a page
POST: send data to the server for processing. The server usually provides a form, and when the client fills it in, the content is placed in the entity-body and submitted to the server
PUT: store the body of the request on the server. In plain terms, it uploads data
DELETE: ask the server to delete the specified document
TRACE: trace the proxy servers that the request passes through on its way to the server
OPTIONS: ask the server which request methods the specified resource supports
Version: the HTTP protocol version, in the form HTTP/major.minor, such as HTTP/1.1.
Status: the response status code, which indicates what happened while the request was processed. Common status codes are as follows:
1xx: 100-101, purely informational
100: the server has received the initial part of the request and has not rejected it, so the client should continue sending the rest of the request. The reason phrase is "Continue"
101: the server is switching to another protocol at the client's request. The reason phrase is "Switching Protocols"
2xx: 200-206, information for the "success" class
200: the request succeeded. The requested data is sent in the entity-body of the response message. The reason phrase is "OK"
201: the request succeeded and a new resource was created. The reason phrase is "Created"
202: the request has been accepted for processing, but processing is not yet complete. The reason phrase is "Accepted"
203: the document was returned normally, but some response headers may be inaccurate because a copy of the document was used. The reason phrase is "Non-Authoritative Information"
204: there is no new document, and the browser should continue to display the original one. The reason phrase is "No Content"
205: there is no new document, but the browser should reset what it displays. This is used to force the browser to clear form input. The reason phrase is "Reset Content"
206: the client sent a GET request with a Range header, and the server fulfilled it. The reason phrase is "Partial Content"
3xx: 300-305, information for the "redirect" class
301: permanent redirection. The reason phrase is "Moved Permanently"
The resource that the requested URL points to has been moved; its new location is given by the Location header in the response message, and the client should request the resource at the new location.
302: temporary redirection; the resource is temporarily available at another location and the client should go there for now. The reason phrase is "Found"
Similar to 301, but the Location header in the response message gives a temporary new location for the resource.
304: the client issued a conditional request, and the server found that the copy cached by the client is still unchanged, so the client should use its cached copy. The reason phrase is "Not Modified"
4xx: 400-415, information for the "client error" class
400: the client request has a syntax error and cannot be understood by the server. The reason phrase is "Bad Request"
401: authentication with an account and password is required to access the resource. The reason phrase is "Unauthorized"
403: the request is refused. The reason phrase is "Forbidden"
404: the server cannot find the resource requested by the client. The reason phrase is "Not Found"
5xx: 500-505, information for the "server error" class
500: internal server error. The reason phrase is "Internal Server Error"
502: the proxy server received an invalid response from the back-end server. The reason phrase is "Bad Gateway"
503: the server is temporarily unable to process client requests and may be able to do so after a while. The reason phrase is "Service Unavailable"
Reason-phrase: a short human-readable explanation of the status code, saying what succeeded or what failed, for example whether fetching or uploading a file succeeded or failed.
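For reference, Python's standard library already ships the code-to-phrase table in http.HTTPStatus. The sketch below pairs it with the five classes described above; the CLASSES mapping and the describe helper are names chosen for this example.

```python
from http import HTTPStatus

# First digit of the status code -> class name, as described in the text
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code reason-phrase (class)' for a standard status code."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({CLASSES[code // 100]})"


if __name__ == "__main__":
    for code in (100, 200, 206, 301, 304, 404, 503):
        print(describe(code))
```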
Headers: attributes that describe a request or response
Each request or response message may contain any number of headers
Each header has a name, followed by a colon, optional whitespace, and then a value
Format: Name: Value
Classification of headers:
General headers: can be used in both request and response messages. The common ones are:
Date: the time the message was created
Connection: the connection state, such as keep-alive or close
Via: the intermediate nodes (proxies, gateways) the message has passed through
Cache-Control: directives that control caching behavior
Request headers: can only be used in request messages. The common ones are:
Accept: tells the server which media types the client accepts
Accept-Charset: tells the server which character sets the client accepts
Accept-Encoding: tells the server which content encodings the client accepts, such as gzip
Accept-Language: tells the server which languages the client accepts
Client-IP: the IP address of the client
Host: the name and port number of the server being requested
Referer: the URL of the resource that contains the link to the resource currently being requested
User-Agent: the client program making the request
Conditional request headers:
Expect: the behavior the client expects from the server
If-Modified-Since: asks whether the requested resource has been modified since the specified time
If-Unmodified-Since: asks whether the requested resource has remained unmodified since the specified time
If-None-Match: asks whether the ETag of the document in the local cache differs from the ETag of the server's document
If-Match: asks whether the ETag of the document in the local cache matches the ETag of the server's document
Security request headers:
Authorization: sends authentication information, such as an account and password, to the server
Cookie/Cookie2: sends cookies from the client to the server
Proxy request headers:
Proxy-Authorization: authenticates the client to a proxy server
Response headers: can only be used in response messages
Informational:
Age: how long the response has been held in a cache
Server: the name and version of the server software
Negotiation headers: used when a resource has multiple representations
Accept-Ranges: the range types the server accepts for requests
Vary: the list of other headers the server looks at when choosing a representation
Security response headers:
Set-Cookie: sets a cookie on the client
Set-Cookie2: sets a cookie2 on the client
WWW-Authenticate: the authentication challenge sent from the server to the client
Entity headers: information that describes the entity
Allow: lists the request methods available for this entity
Location: tells the client where the entity is really located
Content-Encoding: the encoding applied to the content
Content-Language: the language used by the content
Content-Length: the length of the body
Content-Location: the actual location of the entity
Content-Type: the media type of the body
Cache related:
ETag: the entity tag of the entity
Expires: the time at which the entity expires
Last-Modified: the time the entity was last modified
Extension headers
Entity-body: the data attached to a request or response; it may be empty
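The conditional headers listed above (If-None-Match together with ETag) can be illustrated with a simplified server-side check. Deriving the tag from a SHA-256 hash of the body is only an assumption for this sketch; real servers choose their own scheme, and the function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the body (assumed scheme: SHA-256 prefix)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match == etag:
        # The client's cached copy is still current: no body is sent
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


if __name__ == "__main__":
    page = b"<html>hello</html>"
    status, headers, _ = respond(page)                     # first visit
    print(status, headers)
    print(respond(page, if_none_match=headers["ETag"])[0]) # revalidation
```

The second call returns 304 because the tag the client sends back still matches the current document.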
Example of a request message:
GET / HTTP/1.1
Host: www.baidu.com
Connection: keep-alive
Example of a response message:
HTTP/1.1 200 OK
X-Powered-By: PHP/5.2.17
Vary: Accept-Encoding,Cookie,User-Agent
Cache-Control: max-age=3,must-revalidate
Content-Encoding: gzip
Content-Length: 6931
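The response above carries Content-Encoding: gzip, which means the body was compressed before sending and Content-Length counts the compressed bytes. A minimal sketch of both sides using Python's gzip module follows; the helper names are assumptions made for this example.

```python
import gzip


def encode_body(body: bytes):
    """Compress a body and produce the matching entity headers."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),  # length after compression
    }
    return headers, compressed


def decode_body(headers, payload: bytes) -> bytes:
    """Undo the content encoding on the client side, if one was applied."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


if __name__ == "__main__":
    original = b"<html>" + b"hello " * 200 + b"</html>"
    headers, payload = encode_body(original)
    print(len(original), headers["Content-Length"])
    print(decode_body(headers, payload) == original)
```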
12.7 The HTTP Ecosystem
Common tools for protocol inspection and analysis:
tcpdump
tshark
Wireshark
Common HTTP server programs:
httpd (Apache)
Nginx
lighttpd
Application servers (can handle dynamic files):
IIS
Tomcat, Jetty, JBoss, Resin
WebSphere, WebLogic, OC4J
Common HTTP stress testing tools:
ab:
Syntax: ab [options] URL
-n: total number of requests
-c: number of concurrent requests to simulate
-k: test with persistent (keep-alive) connections
webbench
http_load
JMeter
LoadRunner
tcpcopy
ulimit -n #: adjusts the number of files the current user can have open at the same time (# is the new limit)
Web server resource path mapping methods:
docroot
alias
virtual host docroot
user home directory docroot
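The first two mapping methods can be sketched as a function that turns the path from a request URL into a path on disk. The directory names and the function itself are hypothetical; real servers such as httpd or Nginx implement this through their own configuration directives.

```python
import posixpath


def map_url_path(url_path, docroot, aliases=None):
    """Map a URL path to a filesystem path using a docroot and optional aliases."""
    aliases = aliases or {}
    # Normalize first so "." and ".." segments cannot climb above the root
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    # Check longer alias prefixes first so the most specific one wins
    for prefix in sorted(aliases, key=len, reverse=True):
        stripped = prefix.rstrip("/")
        if clean == stripped or clean.startswith(stripped + "/"):
            return aliases[prefix].rstrip("/") + clean[len(stripped):]
    return docroot.rstrip("/") + clean


if __name__ == "__main__":
    # "/var/www/html" and "/data/img" are made-up directories
    print(map_url_path("/bbs/index.html", "/var/www/html"))
    print(map_url_path("/images/logo.gif", "/var/www/html",
                       {"/images": "/data/img"}))
```

Virtual-host and per-user docroots follow the same idea, with the docroot selected from the Host header or the user name before the mapping is applied.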
Concurrent access response models (Web I/O), assuming there is only one thread in each process:
Single-process I/O structure: one process is started to handle requests, it handles only one at a time, and multiple requests are answered serially
Multi-process I/O structure: multiple processes are started in parallel, each responding to one request
Multiplexed I/O structure: one process responds to multiple requests
Multithreading model: a process creates multiple threads, and each thread responds to one user request
Event driven
Multiplexed multi-process I/O structure: multiple (m) processes are started, each responding to n requests
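The multiplexed I/O structure, where one process answers several requests, can be sketched with Python's selectors module. To keep the example self-contained it uses in-process socket pairs in place of real network clients, and the request and response strings are invented for illustration.

```python
import selectors
import socket


def serve_multiplexed(num_clients=3):
    """One thread waits on many connections and answers whichever is ready."""
    selector = selectors.DefaultSelector()
    client_ends = []
    for index in range(num_clients):
        server_end, client_end = socket.socketpair()  # stands in for a TCP connection
        server_end.setblocking(False)
        selector.register(server_end, selectors.EVENT_READ, data=index)
        client_ends.append(client_end)

    # Every client sends its request before the server looks at any of them
    for index, client in enumerate(client_ends):
        client.sendall(f"request-{index}".encode("ascii"))

    handled = 0
    while handled < num_clients:
        # select() returns only the connections that have data waiting
        for key, _events in selector.select(timeout=1):
            conn = key.fileobj
            request = conn.recv(1024)
            conn.sendall(b"response to " + request)
            selector.unregister(conn)
            conn.close()
            handled += 1

    replies = [client.recv(1024).decode("ascii") for client in client_ends]
    for client in client_ends:
        client.close()
    selector.close()
    return replies


if __name__ == "__main__":
    print(serve_multiplexed())
```

Event-driven servers build on the same mechanism, and running m such loops in separate processes gives the multiplexed multi-process structure.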
12.8 HTTPS
HTTPS is the result of running HTTP over SSL or TLS. By default an HTTPS server listens on TCP port 443.
The simplified process of an SSL session is as follows:
(1) the client sends the list of encryption methods it supports and requests the server's certificate
(2) the server sends its certificate and the selected encryption method to the client
(3) the client receives the certificate and verifies it
If the client trusts the CA that issued the certificate:
A) verify the authenticity of the certificate's source: check the digital signature on the certificate with the CA's public key
B) verify the legitimacy of the certificate's contents: integrity verification
C) check the validity period of the certificate
D) check whether the certificate has been revoked
E) check that the owner name in the certificate matches the target host being visited
(4) the client generates a temporary session key (a symmetric key), encrypts it with the server's public key, and sends it to the server to complete the key exchange
(5) the server encrypts the resource requested by the user with the session key and sends the response to the client
Note: SSL sessions are established per IP address, so without the SNI (Server Name Indication) extension only one HTTPS virtual host can be served on a single IP address
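On the client side, most of the certificate checks in step (3) are performed by the TLS library. The sketch below uses Python's ssl module: the context created by ssl.create_default_context() requires a valid certificate chain and a matching host name. The helper names are assumptions for this example, and open_https_connection needs network access, so it is defined but not called here.

```python
import socket
import ssl


def describe_default_checks():
    """Report which verification steps the default client context enforces."""
    context = ssl.create_default_context()
    return {
        "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": context.check_hostname,
    }


def open_https_connection(host, port=443):
    """Connect and run the handshake; raises ssl.SSLError if verification fails."""
    context = ssl.create_default_context()
    raw = socket.create_connection((host, port), timeout=5)
    # server_hostname is used for the name check and is sent as SNI
    return context.wrap_socket(raw, server_hostname=host)


if __name__ == "__main__":
    print(describe_default_checks())
```

Revocation checking (step D) is not enabled by this default context and would need additional configuration.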
The main operations of a web server:
Establish a connection: accept or reject client connection requests
Receive the request: read the HTTP request message from the network
Process the request: parse the request message and act on it
Access resources: fetch the resource named in the request message
Build the response: generate an HTTP response message with the correct headers
Send the response: send the generated response message to the client
Log: record the completed HTTP transaction in a log file