What is the use of DNS and WebSocket in web protocol 07/09 Update SLTechnology News&Howtos

2025-07-09 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains what DNS and WebSocket are used for in web protocols. It is fairly detailed and has some reference value, so interested readers should find it worth reading.

I. DNS

1. Linux dig command

Let's first see how DNS resolves a domain name by using the dig command on Linux. Enter the following command:

dig www.baidu.com

Take a look at the part of the output marked with the red box. From left to right, the fields are:

The domain name, that is, the name of the server.

The network class. The DNS protocol was designed to accommodate other network types, but in practice this value is always IN, which you can read as Internet.

The record type, which identifies the kind of address the domain name maps to. A means an IPv4 address.

Some readers may ask why there is a "." at the end of the domain name. After all, we typed www.baidu.com, not "www.baidu.com." with a trailing dot.

A few points are worth mentioning here:

The trailing "." represents the root domain. Every domain name ends at the root, so we usually omit it.

The level below the root is the top-level domain, such as the well-known .com and .net.

The next level is the second-level domain, such as baidu in the example. Anyone can register a second-level domain for a fee.

Finally, www is the third-level domain. It is usually a name that the owner assigns to a server inside their own domain, and the owner can assign such names freely.

So domain names are hierarchical. Once you understand this, it is easy to see why DNS queries are resolved level by level.
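The hierarchy described above can be sketched in a few lines of Java. This is only an illustration: the class and method names are made up for this example, and it treats every dot-separated label as one level.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class DomainLevels {
    /**
     * Splits a domain name into its labels, ordered from the top-level
     * domain down to the host label. A trailing "." (the root) is
     * accepted and ignored, because it carries no label of its own.
     */
    static List<String> fromTopLevelDown(String domain) {
        String name = domain.endsWith(".")
                ? domain.substring(0, domain.length() - 1)
                : domain;
        List<String> labels = new ArrayList<>();
        for (String label : name.split("\\.")) {
            labels.add(label);
        }
        // Labels are written host-first, so reverse them to walk from the top.
        Collections.reverse(labels);
        return labels;
    }
}
```

Calling `DomainLevels.fromTopLevelDown("www.baidu.com.")` yields the labels com, baidu, www, which is the order in which a hierarchical lookup visits them.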

We can use the dig +trace command to replay a complete hierarchical query:

The command output makes the DNS query process easy to follow, which is far more intuitive than reading somewhere online that DNS queries are recursive. Sharp-eyed readers may ask what the CNAME record is for. For now it is enough to know that a CNAME is an alias that points one domain name at another, and that it is widely used for CDN acceleration. Wikipedia explains it clearly; for reasons of space, this article does not go further into it.

2. Using Wireshark to understand DNS messages

Note that Wireshark's capture filter has no dns keyword. Since DNS queries are carried over UDP, we simply set the capture filter to udp (udp port 53 would narrow it to DNS traffic).

Then we can pick the DNS messages we want out of the pile of UDP packets. We type www.airbnb.com in the browser:

Note the two arrows on the left. The arrow pointing right marks the request, and the arrow pointing left marks the reply to that request.

Wireshark has already decoded the format of these DNS messages for us, so they are easy to read. We will not analyze the binary DNS message format in detail here; interested readers can look up the relevant material. In the screenshot of the DNS messages above, careful readers will have noticed that the destination address of our DNS query is 172.22.3.102. Most corporate internal networks provide a shared DNS server, and this is the address of that internal DNS server, as shown in the figure:

Of course, we can also query other DNS servers, such as the well-known Google DNS.
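To make the message layout that Wireshark decodes a little more concrete, here is a minimal sketch that builds the bytes of a DNS query for an A record. It follows the classic DNS message layout (a 12-byte header followed by one question); the class name, method name, and the query ID passed in are assumptions made for this example, and the sketch does not send anything over the network.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class DnsQueryBuilder {
    /** Builds a standard DNS query asking for the A record of hostname. */
    static byte[] buildAQuery(int id, String hostname) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeU16(out, id);      // ID, echoed back in the reply
        writeU16(out, 0x0100);  // flags: standard query, recursion desired
        writeU16(out, 1);       // QDCOUNT: one question
        writeU16(out, 0);       // ANCOUNT
        writeU16(out, 0);       // NSCOUNT
        writeU16(out, 0);       // ARCOUNT
        // QNAME: each label is prefixed with its length.
        for (String label : hostname.split("\\.")) {
            byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
            out.write(bytes.length);
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);           // zero-length label: the root "."
        writeU16(out, 1);       // QTYPE 1 = A
        writeU16(out, 1);       // QCLASS 1 = IN
        return out.toByteArray();
    }

    private static void writeU16(ByteArrayOutputStream out, int value) {
        out.write((value >> 8) & 0xFF);
        out.write(value & 0xFF);
    }
}
```

The zero byte that ends the name is the wire form of the root domain, and the last four bytes carry the same A and IN values that dig prints.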

3. Drawbacks of traditional DNS queries

After the analysis above, the DNS query process looks fairly simple, but DNS actually causes quite a few performance and security problems. Let's first walk through the complete DNS query process (suppose we want to visit CSDN's website):

We enter a domain name in the browser. If the IP address of the domain name is in the operating system's DNS cache, it is returned directly. If not, go to the next step.

The operating system sends a DNS query message to the DNS server address configured in the system's TCP/IP settings. We usually call this server the local DNS server. It is the 172.22.3.102 in our screenshot above.

If the local DNS server has the IP address for this domain name in its cache, it returns it directly. If not, proceed to the next step.

First, look at the architecture diagram of the DNS servers:

In other words, when the local DNS server does not have the IP address of the domain name in its cache, it queries a root DNS server directly (there are only 13 root server addresses in the world, each backed by many machines). The root DNS server looks at the domain name and tells the local DNS server to ask the .net DNS server. The .net DNS server then tells the local DNS server to ask the csdn.net DNS server. Finally, CSDN's DNS server returns the correct IP address to the local DNS server, which returns it to our browser. The dig +trace command shown earlier makes this process easy to observe.
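The referral chain described above can be sketched as a small helper that lists the zones a local DNS server would ask in turn. This is a simplification for illustration: it assumes every label boundary is a separate zone, and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class ReferralChain {
    /**
     * Returns the zones whose servers are asked in turn, starting at
     * the root and ending at the zone that owns the host name.
     */
    static List<String> zonesToQuery(String hostname) {
        String[] labels = hostname.split("\\.");
        List<String> zones = new ArrayList<>();
        zones.add(".");  // every lookup that misses the cache starts at the root
        String suffix = "";
        // Walk the labels right to left, skipping the host label itself.
        for (int i = labels.length - 1; i >= 1; i--) {
            suffix = labels[i] + "." + suffix;
            zones.add(suffix);
        }
        return zones;
    }
}
```

For `www.csdn.net` this produces the root, then `net.`, then `csdn.net.`, which matches the three servers named in the walkthrough.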

From this complete DNS interaction we can draw at least three conclusions:

A DNS server can do load balancing. The prerequisite, of course, is that you run your own DNS server for the domain. Large companies generally do.

A DNS query is a recursive, multi-step process, so on a weak network it can take a long time. DNS also uses UDP as its transport, so on a weak network the query may simply fail.

The DNS query process is outside our control. For example, the local DNS server can return an incorrect IP address: you visit a JD.com link, but the IP address returned to you belongs to Pinduoduo.

These are only the surface-level drawbacks of traditional DNS queries. In practice, most of our daily traffic comes from mobile networks, where traditional DNS exposes even more problems:

As mentioned earlier, the local DNS server caches the IP of a domain name, but we cannot control how long the cache stays valid. It depends entirely on the integrity of the operator. Our IP address may have changed while the local DNS server still returns the old one.

To save on inter-operator traffic settlement costs, some operators quietly cache static pages. When users visit these pages, they often reach the operator's cache server instead. No matter how often we update the page, users still see the old one.

In some scenarios, such as crowded subway stations, concerts, and football stadiums, operators may find that they have too many users and that the local DNS server is under heavy load. They then manually change the local DNS server so that, instead of querying the root name servers and resolving recursively, it directly queries the DNS server of another operator (call it operator B). B's DNS server returns an address inside B's network, so users of operator A end up accessing an IP that belongs to operator B, and cross-operator access is very slow.

The NAT services of some broadband providers are very unstable. When we go online at home, the local address is actually an internal network address. We can reach the external network because the broadband provider runs a gateway that performs NAT and converts our internal address into an external one. After NAT, some authoritative DNS servers cannot tell which carrier the resulting IP belongs to, which again leads to cross-operator access.

Apart from DNS servers you run yourself, the cache expiration of public DNS servers is not under your control. This affects strategies such as dual data center deployment, multi-site active-active setups, and the use of multiple domain names.

In a weak network environment, DNS queries are likely to fail, both because DNS uses unreliable UDP as its transport and because a query involves several recursive steps.

4. HTTPDNS

Because of these shortcomings, more and more large companies are adopting HTTPDNS. According to Tencent's public data, Tencent saw 800,000 localDNS failures per day in 2015. After adopting HTTPDNS, average user access latency dropped by more than 10%, the access failure rate dropped by more than one fifth, and the user experience improved noticeably.

The principle is quite simple. The mobile app sends an HTTP request, usually addressed directly by IP, because a request addressed by domain name would itself suffer from the traditional DNS problems. The request can carry the user's carrier and geographical location, accurate to the province and city, and the server uses this information to return the best IP address to the app. The app then hands this domain-to-IP mapping to OkHttp. In this way, most requests on the phone use the IP address returned by our HTTP server instead of the one returned by the operator.

I say most requests rather than all of them for a reason. On Android, the DNS lookup code of WebView lives entirely in the native (C) layer and differs between versions, so hooking it is extremely difficult. At the time of writing, the author has not found any open source code that can hook WebView DNS. iOS clearly does better than Android here: on iOS, a WebView HTTP request is an ordinary HTTP request, no different from one made by native code. For Android clients, adopting HTTPDNS is not particularly easy, even with OkHttp available.

Option 1:

Using an OkHttp interceptor, we can replace the domain name in the URL with the IP before the request is sent, and then manually add a Host header. Disadvantages: if the URL is https, connecting directly by IP causes a certificate verification problem. In addition, because the request uses the IP directly while the Set-Cookie header returned by the server carries the domain name, extra handling is needed there as well. Advantage: because it is implemented as an interceptor, it is easy to add a switch for degradation.
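The rewrite step that such an interceptor performs can be sketched with the standard library alone. The following is a minimal illustration rather than an OkHttp interceptor: the class name, the `rewrite` method, and the example address are assumptions, and the sketch only shows how the host is swapped for an IP while the original domain is kept for the Host header.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class IpDirectRequest {
    final URI target;         // URL whose host has been replaced by the IP
    final String hostHeader;  // original domain, to be sent as the Host header

    private IpDirectRequest(URI target, String hostHeader) {
        this.target = target;
        this.hostHeader = hostHeader;
    }

    /** Replaces the host of url with ip and remembers the original host. */
    static IpDirectRequest rewrite(String url, String ip) {
        try {
            URI original = new URI(url);
            URI target = new URI(
                    original.getScheme(), original.getUserInfo(), ip,
                    original.getPort(), original.getPath(),
                    original.getQuery(), original.getFragment());
            return new IpDirectRequest(target, original.getHost());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URL: " + url, e);
        }
    }
}
```

In a real interceptor, the same two values would be applied to the outgoing request. The certificate and cookie issues mentioned above are not addressed by this sketch.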

Option 2:

Take over DNS resolution directly through OkHttp's Dns interface.

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.Arrays;
    import java.util.List;

    import okhttp3.Dns;
    import okhttp3.OkHttpClient;

    public class HttpDNS implements Dns {
        private static final Dns SYSTEM = Dns.SYSTEM;

        @Override
        public List<InetAddress> lookup(String hostname) throws UnknownHostException {
            // Suppose this DNSHelper can return the result of our HTTPDNS query
            String ip = DNSHelper.getIpByHost(hostname);
            if (ip != null && !ip.equals("")) {
                // The argument is an IP literal, so no DNS lookup happens here
                return Arrays.asList(InetAddress.getAllByName(ip));
            }
            // Fall back to the system resolver when HTTPDNS has no answer
            return SYSTEM.lookup(hostname);
        }
    }

    // Then let OkHttp use our Dns implementation
    OkHttpClient client = new OkHttpClient.Builder()
            .dns(new HttpDNS())
            .build();

This option does not have the drawbacks of the interceptor, because in essence it is no different from the system's own DNS lookup. The only difference is that the system uses UDP to ask the localDNS server, while we use HTTP to ask our HTTP server. It solves all the shortcomings of Option 1, but it has one problem of its own: if the result returned by HTTPDNS is wrong, it is hard to degrade. OkHttp's DNS lookup also has a cache, so once our HTTPDNS server returns an incorrect address, subsequent access to that domain name will be affected for some time.

We mentioned earlier that Android's WebView mechanism makes it hard for HTTPDNS to help inside WebView, but there are still ways to reduce the impact of slow localDNS lookups there. For example, we can use DNS prefetching for the static resources in the HTML, so the lookups happen before those resources are actually requested.

In addition, the DNS cache used by WebView and by the app's own code is the same storage area in the operating system. We can therefore collect the domain names that our frequently used web pages request most often and visit them as soon as the app starts. When a popular page is later loaded, the operating system's DNS cache already holds the corresponding IP and a DNS query can be skipped.
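The warm-up idea above can be sketched as follows. This is a minimal illustration under several assumptions: the class and method names are invented, the list of hot domains is supplied by the caller, and whether the result is actually reused later depends on the platform's resolver cache. In a real app this work should run off the main thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DnsPrefetcher {
    /**
     * Resolves each host name once so that later lookups can be served
     * from the resolver cache. Returns, per host, whether it resolved.
     */
    static Map<String, Boolean> prefetch(List<String> hostnames) {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (String hostname : hostnames) {
            try {
                InetAddress.getAllByName(hostname);
                results.put(hostname, true);
            } catch (UnknownHostException e) {
                // A failed warm-up is not fatal; the page load will retry.
                results.put(hostname, false);
            }
        }
        return results;
    }
}
```

The returned map is only there for logging or metrics; the useful side effect is the populated cache.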

5. Is DNS really based on UDP?

In fact, DNS is not based entirely on UDP. The DNS protocol includes the concepts of a primary DNS server and a secondary DNS server. When a secondary DNS server starts, it actively pulls the latest DNS data for its zone from the primary DNS server. This pull uses TCP, not UDP. The protocol specification calls it a zone transfer.

Some readers may wonder why UDP is not used for this. The reason is that a DNS message carried over UDP is traditionally limited to 512 bytes, and the zone data a secondary DNS server pulls can easily exceed that limit, so TCP is used for the transfer.
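The size rule above can be written down as a tiny helper. This is a sketch with invented names. The second method goes slightly beyond the text: it checks the TC (truncated) flag in the DNS header, which a server sets when a UDP reply did not fit, so that the client can retry over TCP.

```java
final class DnsTransport {
    /** Classic size limit for a DNS message carried over UDP. */
    static final int CLASSIC_UDP_LIMIT = 512;

    /** Picks the transport for a message of the given size in bytes. */
    static String transportFor(int messageSizeBytes) {
        return messageSizeBytes > CLASSIC_UDP_LIMIT ? "TCP" : "UDP";
    }

    /**
     * Returns true if the TC flag is set in a DNS message.
     * The flag is bit 0x02 of the third header byte.
     */
    static boolean isTruncated(byte[] dnsMessage) {
        return dnsMessage.length >= 3 && (dnsMessage[2] & 0x02) != 0;
    }
}
```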

II. WebSocket

1. With HTTP polling available, why do we still need WebSocket?

Many people do not see why WebSocket is necessary when polling with HTTP requests seems to meet the same requirement. The statement itself is not wrong: everywhere you can use WebSocket, you could poll with HTTP requests instead. But the efficiency is very different.

We can think of WebSocket as a big patch to HTTP that adds support for persistent connections. It shares some things with HTTP and is an improved design that solves problems HTTP itself cannot. In HTTP, a keep-alive persistent connection means that multiple HTTP requests are completed over a single TCP connection, but each request still has to send its own headers. Polling means that the client keeps actively sending HTTP requests to the server to ask whether there is any new data. This model has three disadvantages:

Besides the actual data, the server and client exchange a large number of HTTP headers, so information exchange is very inefficient.

Because HTTP is stateless, for every request the server has to use the parameters passed by the client to work out whose request it is, for example to look up how much money is deposited under a userId or how many phones that user has bought. This wastes valuable computing resources on the server.

The polling interval is hard to choose. If it is too long, the user interface does not update in time. If it is too short, traffic consumption is high and the server may not be able to cope.
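The trade-off between header overhead and polling interval can be estimated with simple arithmetic. The following sketch uses invented names, and the numbers in the usage note are assumptions chosen only to show the calculation.

```java
final class PollingOverhead {
    /**
     * Estimates how many bytes of HTTP headers are exchanged by polling.
     *
     * @param durationSeconds        how long the client keeps polling
     * @param intervalSeconds        time between two polls
     * @param headerBytesPerExchange request plus response header size
     */
    static long headerBytes(long durationSeconds, long intervalSeconds,
                            long headerBytesPerExchange) {
        long polls = durationSeconds / intervalSeconds;
        return polls * headerBytesPerExchange;
    }
}
```

Assuming 800 bytes of headers per exchange, polling every 5 seconds for one hour gives 720 polls and 576,000 bytes of headers, whether or not any new data arrived. Lengthening the interval reduces this cost but delays updates by up to one interval.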

Of course, polling also has an advantage: its implementation cost is extremely low, with almost no extra development work on the client or the server. WebSocket, by contrast, requires some infrastructure changes the first time it is used, such as the corresponding nginx configuration. Modern server frameworks provide a WebSocket implementation by default, but for scalability and other reasons we usually do not talk to the origin server directly. We talk to a proxy server, and the same is true for WebSocket. So a team that deploys WebSocket for the first time does pay a technical cost.

The figure above is a simple server architecture diagram. Requests sent by the client pass through a proxy server dedicated to load balancing and are then forwarded to the corresponding origin servers. For WebSocket, the situation is a little more complicated:

Compared with pure HTTP, a WebSocket deployment usually adds a dedicated message distribution system to process messages more efficiently. It is usually Kafka or RabbitMQ.

2. Analyzing WebSocket messages with Wireshark

Let's take a look at the frame format of WebSocket. First, set up the capture filter in Wireshark:

We can see that the steps here are: connect, send a message, receive the same message back from the server, and finally disconnect on our own initiative.

WebSocket is a frame-based protocol, so we focus on its frame format. Bits 4 to 7 of the first byte of each frame header hold the opcode. The most important values are:

2: a binary frame.

1: a text frame.

8: a close frame.
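Reading the opcode out of the first header byte can be sketched as follows. The class and method names are invented for this example. Besides the three values listed above, it also names the continuation, ping, and pong opcodes defined by the protocol.

```java
final class WebSocketOpcode {
    /** The FIN flag is the highest bit of the first header byte. */
    static boolean isFinal(int firstByte) {
        return (firstByte & 0x80) != 0;
    }

    /** The opcode is the low four bits (bits 4 to 7) of the first byte. */
    static int opcode(int firstByte) {
        return firstByte & 0x0F;
    }

    /** Maps an opcode value to a readable frame type. */
    static String describe(int opcode) {
        switch (opcode) {
            case 0x0: return "continuation";
            case 0x1: return "text";
            case 0x2: return "binary";
            case 0x8: return "close";
            case 0x9: return "ping";
            case 0xA: return "pong";
            default:  return "reserved";
        }
    }
}
```

A first byte of 0x81, which is common in captures, therefore means a final text frame.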

3. Establishing a WebSocket connection

Some readers will ask: since WebSocket keeps a persistent (TCP) connection, who initiates it? Look at the following picture:

We also need to pay attention to two headers: Sec-WebSocket-Key and Sec-WebSocket-Accept.

The client generates a random value, encodes it with base64, and puts it in the Sec-WebSocket-Key header. When the server receives the request, it concatenates this value with a magic string specified in the RFC: "258EAFA5-E914-47DA-95CA-C5AB0DC85B11". It then computes the SHA-1 hash of the result, encodes the hash with base64, and returns it to the client in the Sec-WebSocket-Accept header.
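The server-side calculation described above can be sketched with the Java standard library. The class and method names are invented for this example; the magic string is the one quoted in the text.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class WebSocketHandshake {
    private static final String MAGIC_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    /** Computes the Sec-WebSocket-Accept value for a Sec-WebSocket-Key. */
    static String acceptFor(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            // Concatenate the key with the magic string, then hash.
            byte[] digest = sha1.digest(
                    (secWebSocketKey + MAGIC_GUID).getBytes(StandardCharsets.US_ASCII));
            // The header carries the base64 form of the raw hash bytes.
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

With the sample key from the WebSocket RFC, `dGhlIHNhbXBsZSBub25jZQ==`, the method returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. A client can run the same calculation to check the server's reply.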

The purpose of this step is to provide some basic guarantees. As mentioned earlier, establishing a WebSocket connection relies on HTTP messages. The extra verification step prevents a WebSocket connection from being set up by accident, whether through a careless caller or some other abnormal condition.

4. Closing a WebSocket connection

After looking at how the connection is established, let's look at how it is closed. Closing a WebSocket connection also follows clear steps: the WebSocket connection is closed first, and then the TCP connection.

You can see in the figure that a heartbeat packet is sent every 30 seconds. Instead of the ping and pong heartbeat frames (opcodes 0x9 and 0xA) defined in the RFC, it uses the simplest kind of frame, a text frame.

On the left is the heartbeat packet sent by the WebSocket server. Its opcode still marks it as a text frame, but the text content is special. On the right is the heartbeat packet that the WebSocket client sends in reply.

5. Proxy cache pollution in WebSocket

Note that in the Wireshark capture there is a MASKED flag on the far right of some frames. It means the frame was sent from the client to the server and that its payload is masked. In the WebSocket protocol, every message sent by the client must be masked with a random masking-key before it is transmitted. This is done to solve the problem of proxy cache pollution.

The core of the problem is proxy servers that are implemented improperly, meaning proxy servers that do not fully implement the WebSocket protocol. This does not refer to genuinely malicious proxy servers. Because such proxies exist, masking frames is unavoidable.

Masking means that the browser must generate a random mask-key when it sends a WebSocket frame, and the payload is XORed with this mask-key in the binary content of the frame. Only the resulting value is transmitted over the network.

When our server receives the WebSocket frame, it XORs the payload with the same mask-key again and recovers the real content. This is a low-cost way to keep a faulty proxy from mistaking client-controlled frame content for an HTTP request. For example, we use WebSocket to send a text frame containing the string vivo. The ASCII codes of vivo in hexadecimal are 76 69 76 6f. In this message, the mask-key generated by the browser is 23 68 c0 a3.

We XOR these two values:

The result is 55 01 b6 cc. Then check whether this is the value in the frame content of the captured packet:
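The XOR step above can be reproduced with a few lines of Java. The class and method names are invented for this example. The same method both masks and unmasks, because applying XOR twice with the same key restores the original bytes.

```java
final class WebSocketMasking {
    /**
     * XORs each payload byte with the 4-byte masking key, cycling
     * through the key. Applying it twice returns the original payload.
     */
    static byte[] applyMask(byte[] payload, byte[] maskingKey) {
        byte[] result = new byte[payload.length];
        for (int i = 0; i < payload.length; i++) {
            result[i] = (byte) (payload[i] ^ maskingKey[i % 4]);
        }
        return result;
    }
}
```

Masking the ASCII bytes of vivo (76 69 76 6f) with the key 23 68 c0 a3 gives 55 01 b6 cc, byte by byte: 76 XOR 23 is 55, 69 XOR 68 is 01, 76 XOR c0 is b6, and 6f XOR a3 is cc.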
