HTTP content negotiation

2025-04-05 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

A single URL often needs to represent several different resources. Take, for example, a Web site that needs to provide its content in multiple languages. If a site has both French-speaking and English-speaking users, it may want to provide site information in both languages. Ideally, the server should send the English version to English users and the French version to French users, so that users get the content in their own language simply by visiting the site's home page.

HTTP provides content negotiation methods that allow clients and servers to make such decisions. Through these methods, a single URL can represent different resources (for example, French and English versions of the same Web page); these different versions are called variants. This article introduces content negotiation in detail.

Summary

For a particular URL, the server can follow some principles to decide what is most appropriate to send to the client. In some cases, the server can even automatically generate custom pages. For example, the server can convert HTML pages into WML pages for handheld devices. This kind of dynamic content transformation is called transcoding. These transformations are the result of content negotiation between the HTTP client and the server.

There are three ways to determine which page on the server is best for the client: let the client choose, let the server decide automatically, or let an intermediary proxy choose. These three techniques are called client-driven negotiation, server-driven negotiation, and transparent negotiation, respectively.

Client-driven negotiation

The simplest thing for a server to do when it receives a client request is to send back a response that lists the available pages and lets the client decide which one to see. This is the easiest approach for the server to implement, and the client is likely to choose the best version (as long as the list contains enough information to choose from). The downside is that each page requires two requests: one to get the list and another to get the selected copy. This technique is slow and tedious, and it annoys users.

In terms of implementation, the server has two ways to present the options to the client: it can send back an HTML document with links to the various versions of the page and a description of each version, or it can send back an HTTP/1.1 response with the 300 Multiple Choices response code. In the former case, the client browser displays a page with links; in the latter case, it may pop up a dialog window for the user to make a choice. Either way, the decision is made by the user on the client side.

In addition to the added latency and the tedium of multiple requests per page, this approach has another drawback: it requires multiple URLs, one for the main page and one for each specialized page. For example, if the original request was for www.joes-hardware.com, Joe's server might reply with a page that links to www.joes-hardware.com/english and www.joes-hardware.com/french. If the client wants to bookmark the site, should the bookmark point to the original main page or to the selected page? If users want to recommend the site to their friends, should they pass along www.joes-hardware.com, or tell only their English-speaking friends about www.joes-hardware.com/english?
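As a rough illustration of the second option, the sketch below builds a minimal 300 Multiple Choices response whose body lists the variants as links. The function name multiple_choices_response and the (url, label) pair format are assumptions made for this example; a real server would normally produce this response through its own framework.

```python
import html


def multiple_choices_response(variants):
    """Build a minimal HTTP/1.1 300 response listing (url, label) variants."""
    items = "".join(
        '<li><a href="{}">{}</a></li>'.format(
            html.escape(url, quote=True), html.escape(label)
        )
        for url, label in variants
    )
    body = "<html><body><ul>" + items + "</ul></body></html>"
    head = (
        "HTTP/1.1 300 Multiple Choices\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(len(body.encode("utf-8")))  # length is counted in bytes
    return head + body


response = multiple_choices_response(
    [("/english", "English"), ("/french", "Français")]
)
```

The client still has to issue a second request for whichever link the user picks, which is the extra round trip described above.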

Server-driven negotiation

One way to reduce the extra traffic is to let the server decide which page to send back. To do this, the client must send enough information about its preferences for the server to make an accurate decision. The server obtains this information from the client's request headers.

There are two mechanisms that an HTTP server uses to evaluate which response is appropriate to send to the client:

1. Examine the content negotiation headers. The server looks at the Accept headers sent by the client and tries to match them with the corresponding response headers.

2. Vary the response based on other (non-content-negotiation) headers. For example, the server can send a response based on the User-Agent header sent by the client.

[content negotiation headers]

Clients can send their preference information using the HTTP headers listed below:

Header             Description
Accept             Tells the server what media types to send
Accept-Language    Tells the server what languages to send
Accept-Charset     Tells the server what character sets to send
Accept-Encoding    Tells the server what encodings to use

[note] These headers look very similar to entity headers, but the two serve very different purposes. Entity headers are like shipping labels: they describe the attributes of the message body that are needed to transfer the message from server to client. Content negotiation headers are sent by the client to the server to exchange preference information, so that the server can select, from the different versions of a document, the one that best matches the client's preferences.

The server matches the client's Accept headers against the entity headers listed below:

Accept header      Entity header
Accept             Content-Type
Accept-Language    Content-Language
Accept-Charset     Content-Type
Accept-Encoding    Content-Encoding

Because HTTP is a stateless protocol, which means that the server does not track client preferences between requests, the client must send its preference information with every request.

If two clients each send an Accept-Language header describing the languages they are interested in, the server can decide which version of www.joes-hardware.com to send to each client. Letting the server select the document automatically avoids the extra round trip that is unavoidable in the client-driven model.

However, suppose a client prefers Spanish. Which version of the page should the server send back, English or French? The server has only two choices: guess, or fall back to the client-driven model and ask the client to choose. If the Spanish speaker happens to know a little English, he may choose the English page, which is not ideal but does the job. In this case, the Spanish speaker needs a way to convey more about his preferences, namely that he knows some English and will accept it when Spanish is not available.

Fortunately, HTTP provides a mechanism that lets clients like this one describe their preferences in more detail. This mechanism is the quality value (q value for short).

Quality values are defined in the HTTP protocol. They allow clients to list multiple options for each preference category and to associate a priority with each option. For example, a client can send an Accept-Language header in the following form:

Accept-Language: en;q=0.5, fr;q=0.0, nl;q=1.0, tr;q=0.0

The q value ranges from 0.0 to 1.0 (0.0 is the lowest priority and 1.0 is the highest). The header above indicates that the client most prefers to receive Dutch (nl) documents, but English (en) documents are also acceptable; under no circumstances does the client want to receive the French (fr) or Turkish (tr) versions.

[note] The order in which preferences are listed does not matter; only the q value associated with each preference counts.

Occasionally the server cannot find a document that matches any of the client's preferences. In this case, the server can modify a document, that is, transcode it, to match the client's preferences.
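To make the q value rules concrete, here is a minimal sketch of how a server might parse an Accept-Language header and pick a variant. The function names parse_accept_language and choose_language are hypothetical, and the sketch makes simplifying assumptions: it matches language tags exactly and ignores the "*" wildcard and prefix matching (for example, en matching en-US).

```python
def parse_accept_language(header):
    """Parse an Accept-Language value into (language, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means q=1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as unacceptable
        prefs.append((lang, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_language(header, available):
    """Return the available language with the highest q above 0, or None."""
    for lang, q in parse_accept_language(header):
        if q > 0 and lang in available:
            return lang
    return None


header = "en;q=0.5, fr;q=0.0, nl;q=1.0, tr;q=0.0"
best = choose_language(header, ["en", "fr"])  # the site has no Dutch version
```

With this header and a site that offers only English and French, the sketch selects English, because French carries a q value of 0.0 and is therefore never chosen. When nothing is acceptable the function returns None, leaving the server to guess, transcode, or fall back to client-driven negotiation.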

[other headers]

The server can also tailor the response based on other client request headers, such as User-Agent. For example, if the server knows that older browsers do not support JavaScript, it can send them a version of the page that does not contain JavaScript.

In this case, there is no Q-value mechanism to find a "closest" match. The server either looks for an exact match, or simply gives what it has, depending on the implementation of the server.

Because caches must try to serve the correct "best" version of a cached document, HTTP defines the Vary header, which the server sends in the response. This header tells caches, clients, and all downstream proxies which request headers the server used to decide on the best version of the response to send.
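The following sketch shows the server side of this idea: choosing a page from the User-Agent value and announcing that choice in Vary. The marker strings, file names, and the function respond_for_user_agent are all hypothetical; real servers use far more careful browser detection.

```python
# Hypothetical substrings identifying browsers assumed to lack JavaScript.
LEGACY_MARKERS = ("Lynx", "MSIE 3")


def respond_for_user_agent(user_agent):
    """Return (file_name, response_headers) chosen from the User-Agent value."""
    is_legacy = any(marker in user_agent for marker in LEGACY_MARKERS)
    file_name = "index.nojs.html" if is_legacy else "index.html"
    headers = {
        "Content-Type": "text/html",
        # Tell caches that the choice depended on the User-Agent request header.
        "Vary": "User-Agent",
    }
    return file_name, headers


file_name, headers = respond_for_user_agent("Lynx/2.8.9rel.1")
```

Note that there is no "closest match" here: the request either contains one of the markers or it does not, which mirrors the exact-match behavior described above.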

[Apache]

Here is an overview of how the well-known Apache Web server supports content negotiation. The site's content provider, such as Joe, is responsible for supplying the different versions of Joe's index page. Joe must also place these index page files in the appropriate directory of the Apache server for the site. Content negotiation can be enabled in two ways:

1. In the site directory, create a type-map file for each URI in the site that has variants. The type-map file lists each variant and the content negotiation headers it corresponds to.

2. Enable MultiViews, which causes Apache to create type-map files for the directory automatically.

[using type-map files]

The Apache server needs to know the naming convention for type-map files. You can set a handler in the server's configuration file that identifies the file suffix for type-map files. For example:

AddHandler type-map .var

This line indicates that files with a .var suffix are type-map files.

Here is an example of a type-map file

According to this type-map file, the Apache server knows to send joes-hardware.en.html to clients requesting the English version and joes-hardware.fr.de.html to clients requesting the French version. Apache also supports quality values.
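A type map consistent with the description above can be sketched as follows. The TYPE_MAP text is an assumption reconstructed from the two file names mentioned in the text, not a copy of Joe's actual file, and parse_type_map and file_for_language are hypothetical helpers that only imitate the lookup in a simplified way (no quality values, no media type matching).

```python
# Assumed type-map contents: one block of headers per variant,
# with blocks separated by a blank line.
TYPE_MAP = """\
URI: joes-hardware.en.html
Content-type: text/html
Content-language: en

URI: joes-hardware.fr.de.html
Content-type: text/html
Content-language: fr, de
"""


def parse_type_map(text):
    """Split a type map into one dict of lowercase header names per variant."""
    variants = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            entry[name.strip().lower()] = value.strip()
        variants.append(entry)
    return variants


def file_for_language(variants, lang):
    """Return the URI of the first variant that lists lang, or None."""
    for entry in variants:
        langs = [x.strip() for x in entry.get("content-language", "").split(",")]
        if lang in langs:
            return entry["uri"]
    return None


french_file = file_for_language(parse_type_map(TYPE_MAP), "fr")
```

Under these assumptions, a request for French resolves to joes-hardware.fr.de.html, because that variant lists both fr and de as its languages.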

[using MultiViews]

To use MultiViews, you must enable it with the Options directive in the appropriate section (<Directory>, <Location>, or <Files>) of the access.conf file for the directory that contains the Web site.

If MultiViews is enabled and a browser requests a resource named joes-hardware, the server looks for all files whose names begin with joes-hardware and creates a type map for them. Based on the file names, the server guesses the content negotiation headers that each file corresponds to. For example, the French version of joes-hardware should contain .fr in its name.

Another way to implement content negotiation on the server side is to use server-side extensions, such as Microsoft's Active Server Pages (ASP).
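The name-based guessing can be illustrated with a small sketch. This is not Apache's actual algorithm (Apache relies on its configured extension mappings); the KNOWN_LANGUAGES set and the guess_variants function are assumptions made for the example.

```python
# Assumption: the language extensions this sketch recognizes.
KNOWN_LANGUAGES = {"en", "fr", "de", "es", "nl"}


def guess_variants(requested, filenames):
    """Map each file named `requested.<ext>...` to its guessed languages."""
    variants = {}
    prefix = requested + "."
    for name in filenames:
        if not name.startswith(prefix):
            continue
        extensions = name[len(prefix):].split(".")
        variants[name] = [ext for ext in extensions if ext in KNOWN_LANGUAGES]
    return variants


guessed = guess_variants(
    "joes-hardware",
    ["joes-hardware.en.html", "joes-hardware.fr.de.html", "contact.html"],
)
```

Here joes-hardware.fr.de.html is guessed to serve both French and German, while contact.html is ignored because its name does not start with the requested resource name.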

Transparent negotiation

Transparent negotiation tries to take the load of server-driven negotiation off the server and to minimize message exchanges with the client by having an intermediary proxy negotiate on the client's behalf. The proxy is assumed to know the client's expectations, so that it can negotiate with the server for the client. By the time the client requests the content, the proxy has already received the client's expectations.

To support transparent content negotiation, the server must be able to tell the proxy which request headers it examines to find the best match for the client's request. The HTTP/1.1 specification does not define a transparent negotiation mechanism, but it does define the Vary header. The server sends the Vary header in its response to tell intermediaries which request headers it uses for content negotiation.

A caching proxy can store different copies of a document accessed through a single URL. If servers communicate their decision process to caches, the proxies can negotiate with clients on behalf of the servers. Caches are also a good place to transcode content, because a general-purpose transcoder deployed in a cache can transcode content from any server, not just one.

[caching and alternates]

Content is cached on the assumption that it can be reused later. However, to make sure that the correct cached response is sent back for a client request, the cache must apply much of the same decision logic that the server uses when it sends back a response.

The Accept headers sent by the client, and the corresponding entity headers that the server matches them against, were described above; the server uses them to select the best response for each request. The cache must use the same headers to decide which cached response to send back.

The following figure shows sequences of correct and incorrect operations involving a cache. The cache forwards the first request to the server and stores the response. For the second request, the cache finds a matching document based on the requested URL. However, this document is in French, while the requestor wants the Spanish version. If the cache simply sends the French version of the document to the requestor, it makes a mistake.

Therefore, the cache should also forward the second request to the server and store both the response and an "alternate" response for that URL. The cache now holds two different documents for the same URL, just as the server does. These different versions are called variants or alternates. Content negotiation can be seen as the process of selecting the most appropriate variant for a client request.

[Vary header]

Here are some typical requests and responses sent by browsers and servers

However, what if the server's decision is not based on the Accept header set, but on, for example, the User-Agent header? It's not as extreme as it sounds. For example, the server may know that older browsers do not support the JavaScript language, so it may send back a version of the page that does not contain JavaScript. If the server decides which page to send based on other headers, the cache must know what those headers are so that it can make the same logical judgment when selecting the returned page.

The HTTP Vary response header lists all of the client request headers that the server considers when selecting the document or generating custom content (in addition to the regular content negotiation headers). For example, if the document served depends on the User-Agent header, the Vary header must contain User-Agent.

When a new request arrives, the cache finds the best match using the content negotiation headers. But before it serves the document to the client, it must check whether the server sent a Vary header in the cached response. If a Vary header is present, the values of the headers it names in the new request must match the values of the same headers in the old, cached request. Because the server may change its response according to the client's request headers, to implement transparent negotiation the cache must store, with each cached variant, both the client request headers and the corresponding server response headers, as shown in the following figure.

If a server's Vary header looks like this, the large number of different User-Agent and Cookie values will produce a great many variants:

Vary: User-Agent, Cookie

The cache must store a separate document version for each variant. When the cache performs a lookup, it first matches on the content negotiation headers and then compares the request's variant against the cached variants. If there is no match, the cache fetches the document from the origin server.
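The lookup described above can be sketched with a small in-memory cache that stores, for each variant, the request header values named by Vary. The VaryCache class is a hypothetical illustration: it compares header values as exact strings, does not evaluate q values, and ignores freshness and expiry, all of which a real cache must handle.

```python
class VaryCache:
    """Toy cache that keeps several variants per URL, keyed by Vary headers."""

    def __init__(self):
        # url -> list of (selecting_request_headers, body)
        self._store = {}

    @staticmethod
    def _lower(headers):
        return {name.lower(): value for name, value in headers.items()}

    def store(self, url, request_headers, response_headers, body):
        vary = self._lower(response_headers).get("vary", "")
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        if "*" in names:
            return  # "Vary: *" cannot be matched from request headers; skip it
        request = self._lower(request_headers)
        # Remember the request values of exactly the headers named in Vary.
        key = {name: request.get(name) for name in names}
        self._store.setdefault(url, []).append((key, body))

    def lookup(self, url, request_headers):
        request = self._lower(request_headers)
        for key, body in self._store.get(url, []):
            if all(request.get(name) == value for name, value in key.items()):
                return body
        return None  # miss: the caller should fetch from the origin server


cache = VaryCache()
cache.store(
    "/",
    {"Accept-Language": "fr"},
    {"Vary": "Accept-Language", "Content-Language": "fr"},
    "Bonjour",
)
hit = cache.lookup("/", {"Accept-Language": "fr"})
miss = cache.lookup("/", {"Accept-Language": "es"})
```

The French request is served from the cache, while the Spanish request misses and must be forwarded to the origin server, after which its response can be stored as a second variant of the same URL.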

Transcoding

We have discussed mechanisms that allow clients and servers to choose, from a set of documents for a URL, the one that best suits the client. These mechanisms assume that a document meeting the client's needs already exists, whether fully or only in part.

However, what if the server does not have a document that meets the client's needs? The server can respond with an error. But in theory, the server can also convert an existing document into something the client can use. This option is called transcoding.

Some hypothetical transcodings are listed below:

Before conversion                After conversion
HTML documents                   WML documents
High-resolution images           Low-resolution images
Color images                     Black-and-white images
Complex pages                    Simple text pages without many frames or images
HTML pages with Java applets     HTML pages without Java applets
Pages with advertisements        Pages without advertisements

There are three types of transcoding: format conversion, information synthesis, and content injection.

[format conversion]

Format conversion is the conversion of data from one format to another so that the client can view it. With HTML-to-WML conversion, wireless devices can access documents that are normally viewed by desktop clients. Clients that access Web pages over slow connections do not need to receive high-resolution images. If format conversion shrinks image files by reducing their resolution and color depth, such clients can view image-rich pages more easily.

Format conversion can be driven by the content negotiation headers, but it can also be driven by the User-Agent header. Note that content conversion, or transcoding, is different from content encoding and transfer encoding. The latter two are generally used to transmit content more efficiently or more securely, while the former makes the content viewable on the accessing device.

[information synthesis]

Extracting key pieces of information from a document is called information synthesis (information synthesis), which is a useful transcoding operation. Examples of this include generating an outline of the document based on the section title, or removing advertisements and trademarks from the page.
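As one illustration of information synthesis, the sketch below builds an outline from the section titles of an HTML page using the standard library's html.parser module. The OutlineExtractor class and make_outline function are hypothetical names, and the sketch assumes headings are marked with h1 through h6 and are not nested inside one another.

```python
from html.parser import HTMLParser


class OutlineExtractor(HTMLParser):
    """Collect (level, title) pairs from h1..h6 elements."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def make_outline(page):
    """Return the list of (level, title) headings found in an HTML string."""
    parser = OutlineExtractor()
    parser.feed(page)
    parser.close()
    return parser.outline


outline = make_outline("<h1>Tools</h1><p>Sale!</p><h2>Hammers</h2>")
```

Everything outside the headings, including the promotional paragraph, is dropped, so the result is a much smaller summary of the page.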

Classifying pages according to the keywords in the content is a more refined technique that helps to summarize the essence of the document. This technique is often used in Web page classification systems, such as the Web page directory of a portal site.

[content injection]

The two types of transcoding described so far usually reduce the amount of content in a Web document, but there is another type of transformation that increases it: content injection transcoding. Examples of content injection transcoding are automatic advertisement generators and user tracking systems.

Imagine how tempting, and certainly how annoying, an ad-insertion transcoder would be that automatically adds advertisements to every HTML page that passes through it. This type of transcoding can only be done dynamically: it must add, on the fly, advertisements that are relevant to or targeted at the current user. You can also build a user tracking system that dynamically adds content to pages in order to collect statistics about how pages are viewed and how clients browse the Web.
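A minimal sketch of content injection, assuming the transcoder simply places a banner right after the opening body tag. The function inject_banner and the banner markup are hypothetical, and a regular expression is used only to keep the example short; a production transcoder would work on a parsed document.

```python
import re

# Matches an opening <body ...> tag, in any letter case.
_BODY_OPEN = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def inject_banner(page, banner_html):
    """Insert banner_html right after the opening <body> tag, if present."""
    match = _BODY_OPEN.search(page)
    if match is None:
        return page  # no body tag: leave the page unchanged
    return page[:match.end()] + banner_html + page[match.end():]


page = "<html><body class='home'><p>Hi</p></body></html>"
with_ad = inject_banner(page, "<div>Hammer sale today</div>")
```

Because the banner text is supplied per call, the same function could insert a different advertisement for each user, which is why this kind of transcoding has to happen dynamically.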

[transcoding versus static pre-generation]

The alternative to transcoding is to create different copies of each Web page on the Web server: for example, one in HTML and one in WML, one with high-resolution images and one with low-resolution images, one with multimedia content and one without. However, this approach is impractical for several reasons: any small change to a page requires changes to many pages, a lot of space is needed to store the different versions of each page, and cataloging pages and programming the Web server to serve the right version become harder. Some transcoding operations, such as ad insertion (especially targeted ad insertion), cannot be done statically, because the advertisement inserted depends on the user requesting the page.

Converting a single root page on the fly is an easier solution than static pre-generation, although it increases the latency of serving content. Sometimes part of this computation can be done by a third party, which reduces the load on the Web server; for example, the conversion can be done by an external agent at a proxy or cache.

The following figure shows the transcoding in the proxy cache
