2025-02-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
I remember that when I first started working and first came across RPC, I was very confused. HTTP was working perfectly well for me, so why would I need RPC as well?
So I searched for it online.
Many of the explanations read as very official. You have probably seen them on various platforms, and they seem to explain nothing: they use one concept we don't know to explain another concept we don't know. Those who already understand don't need to read them, and those who don't understand still don't after reading.
That feeling of having read something without really taking it in, of being lost in the fog, is very uncomfortable. I understand.
To avoid that kind of fatigue, today we will try to explain it in a different way.
Starting with TCP
As programmers, if we need to send a piece of data from computer A to computer B, we usually use a socket in our code.
At this point we usually choose between TCP and UDP. TCP is reliable, UDP is not. Unless you are a master programmer like Ma (QQ used UDP heavily in its early days), as long as there is the slightest need for reliability, most people pick TCP without a second thought.
Something like this:
fd = socket(AF_INET, SOCK_STREAM, 0);
Here SOCK_STREAM means that data is transferred as a byte stream; put bluntly, that is TCP.
After creating the socket, we can happily operate on it, for example binding an IP address and port with bind() and initiating a connection with connect().
▲ The handshake that establishes a connection
After the connection is established, we can use send() to send data and recv() to receive data.
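As a rough sketch of the calls just described, here is the same sequence in Python, whose socket module mirrors the C API closely. The loopback address, the port chosen by the OS and the helper name echo_once are assumptions made only for this illustration.

```python
import socket
import threading

def echo_once(listener):
    """Accept one connection and echo back whatever arrives first."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Server side: SOCK_STREAM selects TCP; port 0 asks the OS for any free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
server = threading.Thread(target=echo_once, args=(listener,))
server.start()

# Client side: connect() triggers the TCP handshake.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(1024)

client.close()
server.join()
listener.close()
print(reply)
```

Both ends live in one process here only to keep the sketch self-contained; in practice they would be on computer A and computer B.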
With nothing more than a bare TCP connection like this, we can already send and receive data. Is that enough?
No. Using it this way causes problems.
What are the problems with using bare TCP?
There is a stock interview answer that many of us have memorized: TCP has three characteristics. It is connection-oriented, reliable, and based on a byte stream.
▲ What is TCP?
These three characteristics really do sum it up well, so memorizing that answer was not wasted effort.
Each characteristic could fill an article of its own, but today we focus on the byte stream.
A byte stream can be understood as data flowing through a two-way channel. The data is what we usually call binary data, simply a long string of 0s and 1s. The bits sent and received over bare TCP have no boundaries between them, so you have no idea where one complete message ends.
▲ A binary byte stream of 0s and 1s
Because the stream has no boundaries, when we use TCP to send "XiaLuo" and then "TeFanNao", the receiver gets "XiaLuoTeFanNao". At that point the receiver cannot tell whether you meant "XiaLuo" + "TeFanNao" or "XiaLuoTe" ("Charlotte") + "FanNao" ("annoyance").
▲ Comparing the messages
This is the so-called sticky packet problem, and I have written a separate article about it before.
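The effect is easy to reproduce. The sketch below assumes a local socketpair() as a stand-in for a real TCP connection, and the two message strings are placeholders. Two separate sends go in, and the receiver simply sees one run of bytes.

```python
import socket

# socketpair() returns two already-connected stream sockets, a local
# stand-in for a TCP connection that needs no server or port.
sender, receiver = socket.socketpair()

sender.sendall(b"hello")   # first message
sender.sendall(b"world")   # second message
sender.close()             # closing lets the receiver read until end of stream

received = b""
while True:
    chunk = receiver.recv(1024)
    if not chunk:          # empty bytes means the peer closed the stream
        break
    received += chunk
receiver.close()

# The bytes arrive in order, but nothing marks where the first message
# ended and the second one began.
print(received)
```

How the bytes are split across individual recv() calls is up to the operating system, which is exactly why the receiver cannot rely on it.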
The point of all this is that pure bare TCP cannot be used directly; you need to add some custom rules on top of it to mark message boundaries.
So we wrap each piece of data to be sent, for example by adding a message header that states the length of the complete packet. Using that length, the receiver keeps reading until it has that many bytes, and what it cuts out is the message body we actually want to transmit.
▲ Message boundary marked by a length field
The header can carry all sorts of other things too, such as whether the body is compressed and what format it is in. Anything goes as long as the sender and the receiver have agreed on it, and that agreement is what we call a protocol.
Every project that uses TCP may define its own protocol and parsing rules like this. They may differ, but the principles are similar.
So a lot of protocols have grown up on top of TCP, such as HTTP and RPC.
HTTP and RPC
Let's go back to the layered view of the network.
▲ The four-layer network model
TCP is a transport-layer protocol, while HTTP and the various RPC protocols built on TCP are merely application-layer protocols that define different message formats.
HTTP (HyperText Transfer Protocol) is the one we use most. When we browse the web and click a URL to open a page, HTTP is what is being used.
▲ HTTP call
RPC (Remote Procedure Call) is not a specific protocol in itself, but a way of making calls.
For example, we usually call a local method like this:
res = localFunc(req)
Now suppose this is not a local method but a method, remoteFunc, exposed by a remote server. Wouldn't it be nice if we could still call it like a local method, hiding the network details and making it more convenient to use?
res = remoteFunc(req)
▲ RPC can call remote methods as if they were local methods
Based on this idea, experts have created many RPC protocols, such as the well-known gRPC and Thrift.
It is worth noting that although most RPC protocols use TCP underneath, they do not have to. They can also run over UDP or HTTP.
▲ HTTP and RPC protocols based on TCP protocol
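To make the idea concrete, here is a small sketch using Python's standard-library xmlrpc modules, which happen to carry the call over HTTP. It is not gRPC or Thrift, and the function name remote_func, its return value and the loopback server are assumptions for this illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def remote_func(req):
    """Runs on the server side; the caller never touches the network code."""
    return "handled: " + req

# Server side: expose remote_func. Port 0 lets the OS pick a free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(remote_func)
port = server.server_address[1]
worker = threading.Thread(target=server.handle_request)  # serve a single call
worker.start()

# Client side: the proxy turns an ordinary-looking call into a network request.
proxy = ServerProxy(f"http://127.0.0.1:{port}/")
res = proxy.remote_func("ping")

worker.join()
server.server_close()
print(res)
```

The line that matters is res = proxy.remote_func("ping"): it reads like a local call, while serialization and transport happen behind the proxy.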
At this point, let's return to the question in the title of the article.
If we have HTTP, why do we still need RPC?
In fact, TCP dates from the 1970s, while HTTP only became popular in the 1990s. Since using bare TCP directly causes problems, you can imagine how many custom protocols appeared over those years, and among them was RPC, which dates from the 1980s.
So the question should not be why there is RPC when we already have HTTP, but why there is HTTP when we already had RPC.
So if we have RPC, why do we still need HTTP?
The networked software installed on our computers today, such as xx Butler and xx Guardian, acts as a client that must connect to a server to send and receive messages, and it uses an application-layer protocol to do so. Under this client/server (c/s) architecture, a home-made RPC protocol is fine, because the client only ever needs to talk to its own company's servers.
But one kind of software is different: the browser. Whether Chrome or IE, a browser must be able to access not only its own company's servers but also other companies' web servers, so a unified standard is needed; otherwise they cannot communicate at all. HTTP was the protocol that unified browser/server (b/s) communication at that time.
That is to say, many years ago HTTP was mainly used for the b/s architecture, while RPC was used more for the c/s architecture. Today the distinction is less clear, as b/s and c/s are slowly merging. Much software supports several kinds of client at once; a cloud drive, for example, may have a web version as well as mobile and PC clients. If all of them communicate over HTTP, the server only needs to support one protocol. RPC has gradually retreated behind the scenes and is now generally used inside a company's cluster, for communication between microservices.
In that case, why not just use HTTP for everything? What do we still need RPC for?
It seems we are back where the article began. To answer that, we have to start from the differences between them.
What's the difference between HTTP and RPC?
Let's take a look at some of the more obvious differences between RPC and HTTP.
Service discovery
First of all, to send a request to a server you have to establish a connection, and to establish a connection you need to know the server's IP address and port. The process of finding the IP address and port for a service is service discovery.
In HTTP, if you know the service's domain name, you can resolve the IP address behind it through DNS, and the port defaults to 80.
With RPC there are some differences. Generally a dedicated intermediary service, such as Consul or etcd, or even Redis, stores service names together with their IP addresses and ports. To access a service, you first ask this intermediary for the IP address and port. Since DNS is itself a form of service discovery, there are also components that do service discovery on top of DNS, such as CoreDNS.
So in service discovery the two differ somewhat, but neither is clearly better or worse.
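The two lookup styles can be sketched side by side. The Registry class below is a hypothetical in-memory stand-in for a store such as Consul or etcd, not their real client APIs, and the service name, address and port are made up for the example.

```python
import socket

def resolve_http(host, port=80):
    """DNS-style discovery: name -> IP address, with HTTP's default port."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    ip, resolved_port = infos[0][4]   # sockaddr of the first result
    return ip, resolved_port

class Registry:
    """Hypothetical in-memory stand-in for a registry such as Consul or etcd."""

    def __init__(self):
        self._services = {}

    def register(self, name, ip, port):
        # A service instance announces where it can be reached.
        self._services.setdefault(name, []).append((ip, port))

    def lookup(self, name):
        # A caller asks the registry before connecting.
        instances = self._services.get(name)
        if not instances:
            raise LookupError(f"no instance registered for {name!r}")
        return instances[0]

registry = Registry()
registry.register("user-service", "10.0.0.5", 9000)
rpc_target = registry.lookup("user-service")
http_target = resolve_http("127.0.0.1")   # a numeric address resolves to itself
print(rpc_target, http_target)
```

A real registry would also handle health checks and pick among several instances; the sketch only shows the name-to-address step.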
Underlying connections
Take the mainstream HTTP/1.1 as an example. By default, once the underlying TCP connection is established it is kept open (keep-alive), and subsequent requests and responses reuse it.
RPC protocols, like HTTP, also exchange data over long-lived TCP connections. The difference is that an RPC framework generally maintains a connection pool as well: when request volume is high, it establishes several connections and puts them in the pool. To send data, it takes a connection out of the pool, uses it, and puts it back for reuse next time. Very environmentally friendly, you could say.
▲ connection_pool
Because connection pooling helps the performance of network requests, the network libraries of many programming languages add a connection pool for HTTP too; Go does this, for example.
So the two do not differ much here either, and this is not the key point.
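A connection pool can be sketched in a few lines. The ConnectionPool class, its fixed size and the local listener it connects to are all assumptions for this illustration; production pools also deal with broken connections, timeouts and growth under load.

```python
import queue
import socket

class ConnectionPool:
    """Minimal fixed-size pool: connections are opened once and then reused."""

    def __init__(self, host, port, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(socket.create_connection((host, port)))

    def acquire(self, timeout=None):
        # Blocks until some connection is idle.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        # Hand the connection back instead of closing it.
        self._idle.put(conn)

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()

# A local listener stands in for the remote service.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)
port = listener.getsockname()[1]

pool = ConnectionPool("127.0.0.1", port, size=2)
first = pool.acquire()
second = pool.acquire()
pool.release(first)        # put it back after use
reused = pool.acquire()    # the next caller gets the same connection again
same_connection = reused is first

pool.release(second)
pool.release(reused)
pool.close()
listener.close()
print(same_connection)
```

Reusing an open connection avoids paying for a new handshake on every request, which is where the performance gain comes from.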
Transmitted content
Messages transmitted over TCP come down to just two parts: a header and a body.
The header carries metadata, the most important of which is the length of the body.
The body is the content we actually want to transmit, and it can only be binary 0s and 1s; that is all computers understand. Sending strings and numbers over TCP is no problem, because a string can be encoded into bytes and a number can be converted to binary directly. A struct is harder: we have to find a way to turn it into binary too, and there are many ready-made solutions, such as JSON and Protobuf.
Converting a struct into a byte array is called serialization, and restoring a byte array into a struct is called deserialization.
▲ serialization and deserialization
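As a small illustration, the sketch below serializes the same struct twice: once as JSON, and once in a hand-rolled binary layout. The binary layout is only a stand-in to show the idea of a compact format; it is not Protobuf's actual wire format, and the User type and its fields are assumptions for the example.

```python
import json
import struct
from dataclasses import asdict, dataclass

@dataclass
class User:
    id: int
    name: str

def to_json(user: User) -> bytes:
    """Text serialization: field names travel with every message."""
    return json.dumps(asdict(user)).encode("utf-8")

def from_json(data: bytes) -> User:
    return User(**json.loads(data.decode("utf-8")))

# Assumed binary layout: id as uint32, name length as uint16, then the name.
PREFIX = struct.Struct("!IH")

def to_binary(user: User) -> bytes:
    """Binary serialization: both sides agree on field order in advance."""
    name = user.name.encode("utf-8")
    return PREFIX.pack(user.id, len(name)) + name

def from_binary(data: bytes) -> User:
    user_id, name_len = PREFIX.unpack_from(data)
    name = data[PREFIX.size:PREFIX.size + name_len].decode("utf-8")
    return User(user_id, name)

user = User(id=42, name="xiaobai")
as_json = to_json(user)
as_binary = to_binary(user)
print(len(as_json), "bytes as JSON,", len(as_binary), "bytes as binary")
```

Both forms restore the same struct; the binary one is shorter because the field names are part of the agreement rather than part of the message.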
As for mainstream HTTP/1.1, although it is called a hypertext protocol and now carries audio and video as well, HTTP was originally designed to display web page text, so its content is mainly string-based. This is true of both the header and the body. In the body, it commonly uses JSON to serialize structured data.
Let's grab an example and take a look.
▲ HTTP message
You can see that there is a lot of redundancy in the content; it is very verbose. Most obviously, take the header: if both sides agreed on which position in the header holds the content type, there would be no need to send the string "content-type" every time. The same kind of redundancy is especially obvious in the JSON body.
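For a rough sense of scale, the sketch below builds a plain HTTP/1.1 request by hand and counts the bytes. The host example.com, the path /api/user and the payload are placeholders, and real requests usually carry many more header fields than these three.

```python
import json

def build_request(host, path, payload):
    """Assemble a minimal HTTP/1.1 POST with a JSON body, as raw bytes."""
    body = json.dumps(payload).encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"                      # blank line separates header from body
    ).encode("ascii")
    return head, body

payload = {"id": 42, "name": "xiaobai"}
head, body = build_request("example.com", "/api/user", payload)
print((head + body).decode("utf-8"))
print(len(head), "header bytes for", len(body), "body bytes")
```

Every field name, in the header and in the JSON body, is spelled out as text in each message, which is the redundancy described above.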
RPC, on the other hand, is more customizable. It can use the more compact Protobuf or another serialization format for structured data, and it does not need to handle the browser behaviors that HTTP has to consider, such as 302 redirects. As a result its performance is better, and this is the main reason companies choose RPC over HTTP for their internal microservices.
▲ HTTP principle
▲ RPC principle
Of course, the HTTP discussed above is the mainstream HTTP/1.1. HTTP/2 makes many improvements on it, so its performance may be better than that of many RPC protocols; gRPC even uses HTTP/2 directly underneath.
So here comes the question again.
Why do we still need RPC protocols when we have HTTP/2?
Because HTTP/2 only came out in 2015. By then, many companies' internal RPC protocols had been running for years, and for historical reasons there was generally no need to switch.
Summary
Bare TCP can send and receive data, but it is a stream without boundaries. The layer above must define a message format to mark message boundaries. Hence the various protocols: HTTP and the many RPC protocols are all application-layer protocols defined on top of TCP.
RPC is not itself a protocol but a way of calling; concrete implementations such as gRPC and Thrift are the protocols that realize RPC calls. The goal is to let programmers call remote service methods the same way they call local ones. There are also many ways to implement RPC, and it does not have to be based on TCP.
Historically, HTTP was mainly used for the b/s architecture, while RPC was used more for the c/s architecture. Today the distinction is less clear, as b/s and c/s are slowly merging. Much software supports several kinds of client at once, so HTTP is generally used for external communication, while RPC is used for communication between microservices inside the cluster.
RPC actually appeared earlier than HTTP and performs better than the currently mainstream HTTP/1.1, so most companies still use RPC internally.
HTTP/2 is optimized on top of HTTP/1.1 and may outperform many RPC protocols, but since it only appeared in recent years, it is unlikely to replace RPC.
Finally, a question to leave you with. Have you noticed that HTTP and RPC share one characteristic: a message is always a client request followed by a server response. If the client doesn't ask, the server never answers, which is rather rigid. In reality there are certainly scenarios where the server has to send a message to the client on its own initiative. In a web game, for example, I may be standing there doing nothing when a monster attacks me unprompted. What should we do in that case?
Reference
https://www.zhihu.com/question/41609070
This article comes from the WeChat official account rookie debug (ID: xiaobaidebug), author: Xiaobai