2025-04-12 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
HTTP
1. Basic concepts
1.1 Introduction
HTTP stands for Hypertext Transfer Protocol. It was developed through a collaboration between the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF), which published a series of RFCs. RFC 1945 defined HTTP/1.0. The best known is RFC 2616, which defined HTTP/1.1, the version in common use today. (RFC 2616 has since been obsoleted by later RFCs, currently RFC 9110 to RFC 9112.)
HTTP is a transfer protocol used to carry hypertext from a WWW server to a local browser. It is designed to make browsing efficient and to reduce network traffic. Besides ensuring that hypertext documents are transferred correctly and quickly, it also determines which part of a document is transferred and which content is displayed first (for example, text before graphics).
HTTP is an application layer protocol built from requests and responses, and it follows the standard client-server model. HTTP is a stateless protocol.
1.2 Position in the TCP/IP protocol stack
HTTP is usually carried directly over TCP, and sometimes over a TLS or SSL layer that sits on top of TCP. In the latter case it becomes what we usually call HTTPS. As shown in the following figure:
The default port for HTTP is 80, and the default port for HTTPS is 443.
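The default ports can be checked against the constants in Python's standard library. The following is a minimal sketch; the helper name effective_port and the example URLs other than http://image.baidu.com/ are assumptions made for illustration.

```python
from http.client import HTTP_PORT, HTTPS_PORT
from urllib.parse import urlsplit

# Default ports by scheme, taken from the standard library constants.
DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the default for its scheme.

    effective_port is a hypothetical helper written for this example.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://image.baidu.com/"))      # no port given: HTTP default
print(effective_port("https://example.com/"))         # no port given: HTTPS default
print(effective_port("http://localhost:8080/index"))  # explicit port wins
```

A URL that names a port explicitly overrides the default, which is why servers on non-standard ports must be addressed with the port in the URL.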
1.3 HTTP request-response model
In the HTTP protocol, the client always initiates the request and the server sends back the response. See the following figure:
This limits how HTTP can be used: the server cannot push a message to the client unless the client has first sent a request.
HTTP is a stateless protocol: the protocol itself defines no relationship between one request and the previous request from the same client.
1.4 Workflow
An HTTP operation is called a transaction, and its working process can be divided into four steps:
1) First, a connection is established between the client and the server. Clicking a hyperlink is enough to start HTTP's work.
2) After the connection is established, the client sends a request to the server. The request consists of a Uniform Resource Identifier (URI) and the protocol version number, followed by MIME-like information that includes request modifiers, client information and possibly a body.
3) After receiving the request, the server sends back a response. The response starts with a status line containing the protocol version number and a success or error code, followed by MIME-like information that includes server information, entity information and possibly a body.
4) The client receives the information returned by the server and displays it on the user's screen through the browser, and then the client disconnects from the server.
If an error occurs in one of the steps above, the error information is returned to the client and displayed. For users, all of this is done by HTTP itself; they only need to click with the mouse and wait for the information to be displayed.
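The four steps above can be sketched with a raw TCP socket talking to a throwaway local server. This is only an illustration under stated assumptions: HelloHandler and run_transaction are hypothetical names, the server is Python's built-in http.server bound to 127.0.0.1 on a free port, and the "display" step is reduced to printing.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def run_transaction():
    """Run one HTTP transaction and return (status_line, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Step 1: establish the connection.
        with socket.create_connection((host, port), timeout=5) as sock:
            # Step 2: send the request line and the header fields.
            request = (
                "GET / HTTP/1.1\r\n"
                f"Host: {host}:{port}\r\n"
                "Connection: close\r\n"
                "\r\n"
            )
            sock.sendall(request.encode("ascii"))
            # Step 3: read the response until the server closes its side.
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        # Step 4: leaving the with block closes the client socket.
    finally:
        server.shutdown()
        server.server_close()
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body


status_line, body = run_transaction()
print(status_line)
print(body)
```

A real browser does much more (redirects, caching, rendering), but the connect, request, response, disconnect order is the same.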
1.5 Using Wireshark to capture TCP and HTTP packets
Open Wireshark and select "Capture" -> "Options" on the toolbar. The dialog is shown in figure 1:
Figure 1 Setting capture options
Most readers only need to choose the appropriate device in the drop-down box at the top and then click "Capture Filter". The filter chosen here is "HTTP TCP port (80)". Then click "Start" to begin capturing packets.
Figure 2 Selecting the capture filter
For example, open http://image.baidu.com/ in a browser; the captured packets are shown in figure 3:
Figure 3 Captured packets
In the figure above, you can clearly see the interaction between the client browser (IP 192.168.2.33) and the server:
1) No1: the browser (192.168.2.33) sends a connection request to the server (220.181.50.118). This is the first step of the TCP three-way handshake. As the figure shows, the segment is SYN with seq: x (x = 0).
2) No2: the server (220.181.50.118) responds to the browser's request and asks for confirmation. The segment is SYN, ACK with seq: y (y = 0) and ACK: x+1 (1). This is the second step of the three-way handshake.
3) No3: the browser (192.168.2.33) acknowledges the server's segment and the connection is established. The segment is ACK with seq: x+1 (1) and ACK: y+1 (1). This is the third step of the three-way handshake.
4) No4: the browser (192.168.2.33) sends the HTTP request for the page
5) No5: the server (220.181.50.118) acknowledges
6) No6: the server (220.181.50.118) sends data
7) No7: the client browser (192.168.2.33) acknowledges
8) No14: the client (192.168.2.33) sends the HTTP request for an image
9) No15: the server (220.181.50.118) sends the status response code 200 OK
……
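The sequence and acknowledgment numbers in packets No1 to No3 follow a fixed pattern, which can be written down as a small sketch. Segment and three_way_handshake are hypothetical names used only here. Wireshark shows relative sequence numbers by default, which is why both initial values appear as 0 in the capture.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """A TCP segment reduced to the fields discussed in the capture."""
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three handshake segments for initial sequence numbers x and y."""
    return [
        Segment("SYN", seq=x),                 # No1: client -> server
        Segment("SYN,ACK", seq=y, ack=x + 1),  # No2: server -> client
        Segment("ACK", seq=x + 1, ack=y + 1),  # No3: client -> server
    ]


# Relative initial sequence numbers, as displayed in the capture.
for segment in three_way_handshake(0, 0):
    print(segment)
```

With x = 0 and y = 0 this reproduces the values read off the figure: ACK 1 in the second segment, then seq 1 and ACK 1 in the third.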
1.6 Header fields
Each header field consists of a field name, a colon (:), and a field value. The field name is case-insensitive, any amount of whitespace may precede the field value, and a header field can be extended over multiple lines by starting each continuation line with at least one space or tab.
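These rules translate into a short parser. The sketch below is illustrative only: parse_header_block is a hypothetical helper, the sample header values are made up, and repeated field names simply overwrite each other. Multi-line (folded) header values are handled here because the text describes them, although later revisions of the HTTP specification deprecate them.

```python
def parse_header_block(text: str) -> dict:
    """Parse 'Name: value' lines into a dict keyed by lower-cased field name."""
    headers = {}
    last_name = None
    for line in text.splitlines():
        if not line:
            break  # an empty line ends the header block
        if line[0] in " \t" and last_name is not None:
            # Continuation line: fold it into the previous field value.
            headers[last_name] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        last_name = name.strip().lower()   # field names are case-insensitive
        headers[last_name] = value.strip()  # leading whitespace is not significant
    return headers


# Made-up sample input: mixed-case names, extra spaces, one folded value.
raw = (
    "HOST:    image.baidu.com\r\n"
    "User-Agent: example-client/1.0\r\n"
    "\t(illustrative)\r\n"
    "\r\n"
)
print(parse_header_block(raw))
```

Lower-casing the names at parse time means later lookups such as headers["host"] work regardless of how the sender capitalized the field.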
In the packet capture, click No14 to see the request message shown in figure 4:
Figure 4 HTTP request message
The response message is shown in figure 5:
Figure 5 HTTP status response message
1.6.1 Host header field
The Host header field specifies the Internet host and port number of the requested resource, and must indicate the location of the origin server or gateway for the requested URL. An HTTP/1.1 request must contain the Host header field; otherwise the server responds with a 400 status code.
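The Host requirement can be expressed as a small server-side check. This is a sketch only; status_for_request is a hypothetical function, and a real server validates much more than the presence of Host.

```python
def status_for_request(version: str, headers: dict) -> int:
    """Return 400 for an HTTP/1.1 request without a Host field, else 200.

    Hypothetical helper: it models only the rule described in the text.
    """
    names = {name.lower() for name in headers}  # field names are case-insensitive
    if version == "HTTP/1.1" and "host" not in names:
        return 400
    return 200


print(status_for_request("HTTP/1.1", {"Host": "image.baidu.com"}))
print(status_for_request("HTTP/1.1", {"Accept": "*/*"}))
print(status_for_request("HTTP/1.0", {"Accept": "*/*"}))
```

The HTTP/1.0 case passes because the rule in the text applies to HTTP/1.1 requests only.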
The Host line in figure 5:
1.6.2 Referer header field
The Referer header field lets the client specify the address of the resource from which the request URI was obtained. This allows the server to build lists of back-links, which can be used for logging, cache optimization, and so on. It also allows obsolete or mistyped links to be traced for maintenance. If the request URI was obtained from a source that has no URI of its own (for example, keyboard input), the Referer field must not be sent. If only a partial URI is given, it should be interpreted relative to the request URI.
In figure 4, the content of the Referer line is:
1.6.3 User-Agent header field
The User-Agent header field contains information about the user agent (the client program) that made the request.
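As an illustration of how a client sets these two request fields, here is a minimal sketch using urllib.request.Request from the Python standard library. No request is actually sent. The header values are made up for the example and are not the ones in the capture.

```python
from urllib.request import Request

# Build (but do not send) a request carrying Referer and User-Agent.
request = Request(
    "http://image.baidu.com/",
    headers={
        "Referer": "http://www.baidu.com/",   # assumed referring page
        "User-Agent": "example-client/1.0",   # assumed client identifier
    },
)

# Request stores header names in capitalized form, so "User-Agent" is
# looked up as "User-agent".
print(request.get_header("Referer"))
print(request.get_header("User-agent"))
print(request.get_method())
```

Browsers fill in both fields automatically; scripts usually need to set User-Agent themselves if the server treats unknown clients differently.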
In figure 4, the content of the User-Agent line is:
1.6.4 Cache-Control header field
Cache-Control specifies the caching behavior that requests and responses must follow. Setting Cache-Control in a request or in a response does not change how caching is handled for the other message. The request directives are no-cache, no-store, max-age, max-stale, min-fresh and only-if-cached. The response directives are public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate and max-age.
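A Cache-Control value is a comma-separated list of directives, some of which take an argument. The sketch below splits such a value into a dictionary. parse_cache_control is a hypothetical helper and deliberately simple: it does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Map each directive name to its argument, or None if it has none."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are compared case-insensitively.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-age=3600, must-revalidate"))
print(parse_cache_control("no-cache"))
```

A cache would then test for keys such as "no-store" and convert the "max-age" argument to an integer number of seconds.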
The header field in figure 5 is:
1.6.5 Date header field
The Date header field gives the time at which the message was sent. The time format is defined by RFC 822. For example, Date: Mon, 31 Dec 2001 04:25:57 GMT. The time in Date is expressed in universal time (GMT); converting it to local time requires knowing the time zone of the user.
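The conversion from the Date value to local time can be sketched with the standard library. The UTC+8 offset below is an assumption standing in for "the user's time zone".

```python
from datetime import timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Parse the example value from the text into a timezone-aware datetime.
sent = parsedate_to_datetime("Mon, 31 Dec 2001 04:25:57 GMT")

# Assumed user time zone for the example: UTC+8.
user_zone = timezone(timedelta(hours=8))
local = sent.astimezone(user_zone)

print(sent.isoformat())    # 2001-12-31T04:25:57+00:00
print(local.isoformat())   # 2001-12-31T12:25:57+08:00

# Going the other way: format a UTC datetime as a Date header value.
print(format_datetime(sent, usegmt=True))  # Mon, 31 Dec 2001 04:25:57 GMT
```

Keeping the parsed value timezone-aware avoids accidentally treating the GMT timestamp as local time.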
In figure 5, the header field is shown in the following figure:
1.7 Several important concepts of HTTP
1.7.1 Connection
A virtual circuit at the transport layer, established between two applications so that they can communicate.
In HTTP/1.1, both request and response headers may contain a Connection field, which states how persistent connections are handled when client and server communicate.
In HTTP/1.1, both client and server support persistent connections by default. If a client uses HTTP/1.1 but does not want a persistent connection, it must set the Connection field to close in the request header. If the server does not want to support persistent connections, it must likewise set Connection to close in the response. Whenever the request or the response header contains Connection with the value close, the TCP connection currently in use is closed after the current request has been processed. When the client makes a new request later, a new TCP connection must be created.
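The rule for when a connection stays open can be written as a small decision function. This is a sketch of the HTTP/1.1 behavior described above only; connection_persists is a hypothetical name, and other protocol versions are simply treated as non-persistent here.

```python
def connection_persists(version: str, request_headers: dict,
                        response_headers: dict) -> bool:
    """Return True if the TCP connection stays open after this exchange."""

    def wants_close(headers: dict) -> bool:
        # Field names and the "close" token are compared case-insensitively.
        lowered = {name.lower(): value for name, value in headers.items()}
        tokens = {t.strip().lower() for t in lowered.get("connection", "").split(",")}
        return "close" in tokens

    if version != "HTTP/1.1":
        return False  # simplification: only the HTTP/1.1 default is modeled
    return not (wants_close(request_headers) or wants_close(response_headers))


print(connection_persists("HTTP/1.1", {}, {}))
print(connection_persists("HTTP/1.1", {"Connection": "close"}, {}))
print(connection_persists("HTTP/1.1", {}, {"Connection": "close"}))
```

Either side can end the persistent connection; it survives only if neither message asks for close.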
1.7.2 Message
The basic unit of HTTP communication. It consists of a structured sequence of octets and is transmitted over a connection.
1.7.3 Request
A message from the client to the server that includes the method to be applied to the resource, the identifier of the resource, and the protocol version.
1.7.4 Response
A message returned by the server that includes the HTTP protocol version, the status of the request (such as "successful" or "not found"), and the MIME type of the document.
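The first line of each message carries exactly the items listed in these two definitions. The sketch below splits a request line and a status line into their parts. Both function names and the sample lines are assumptions made for illustration.

```python
from typing import Tuple


def parse_request_line(line: str) -> Tuple[str, str, str]:
    """Split 'METHOD target VERSION' into its three parts."""
    method, target, version = line.strip().split(" ", 2)
    return method, target, version


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split 'VERSION code reason' into version, numeric code and reason."""
    parts = line.strip().split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""  # the reason phrase may be empty
    return version, code, reason


print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```

The MIME type of the document is not on the status line; it arrives in the Content-Type header field that follows.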
1.7.5 Resource
A network data object or service identified by a URI.
1.7.6 Entity
A particular representation of a data resource, or a reply from a service resource, that may be enclosed within a request or response message. An entity consists of entity header information and the entity content itself.
1.7.7 Client
An application that establishes a connection for the purpose of sending a request.
1.7.8 User agent
The client that initiates a request. Typically a browser, an editor, or another end-user tool.
1.7.9 Server
An application that accepts connections and sends back responses to requests.
1.7.10 Origin server
The server on which a given resource resides or is to be created.
1.7.11 Proxy
An intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are handled internally or passed on, possibly after translation, to other servers. Before forwarding a request message, a proxy must interpret it and, if necessary, rewrite it.
Proxies often serve as client-side portals through firewalls, and they can also act as helper applications that handle requests over protocols the user agent does not implement.
1.7.12 Gateway
A server that acts as an intermediary for other servers. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is dealing with a gateway.
Gateways often serve as server-side portals through firewalls, and they can also act as protocol translators for access to resources stored in non-HTTP systems.
1.7.13 Tunnel
An intermediary program that acts as a blind relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, although it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are used when a portal is required and the intermediary cannot, or should not, interpret the relayed traffic.
1.7.14 Cache
Local storage of response messages.