What is RPC? 07/16 Update SLTechnology News&Howtos

What is RPC?

2025-07-16 Update From: SLTechnology News&Howtos


Contents

1. What is RPC?

2. Typical RPC invocation framework

3. Introduction to Thrift Framework

1. What is RPC?

RPC stands for Remote Procedure Call.

Suppose application a is deployed on server A and application b on server B. When a wants to call a function or method of b, it cannot call it directly because the two do not share a memory space. The semantics of the call and the data it carries must be conveyed over the network.

One might ask why machine A does not simply host the service itself instead of calling it on machine B. In principle it could, but as systems scaled horizontally, clusters emerged in which multiple machines are deployed together to provide services to the outside world. Work that cannot be completed within one process, or even on one computer, requires RPC.

There are many RPC frameworks and styles, such as the early CORBA, Java RMI, RPC-style Web Services, Hessian, Thrift, and even REST APIs.

(2) The local procedure call process

The goal of RPC is to call remote functions as if they were local functions. Before looking at RPC, let's look at how a local call works. Suppose we want to call Multiply to compute the result of lvalue * rvalue:

1 int Multiply(int l, int r) {
2     int y = l * r;
3     return y;
4 }

6 int lvalue = 10;
7 int rvalue = 20;
8 int l_times_r = Multiply(lvalue, rvalue);

So at line 8, the following conceptually happens:

Push the values of lvalue and rvalue onto the stack.

Enter the Multiply function, take the values 10 and 20 from the stack, and assign them to l and r.

Execute line 2: compute l * r and store the result in y.

Push the value of y onto the stack and return from Multiply.

Back at line 8, take the return value 200 from the stack and assign it to l_times_r.

These five steps make up the execution of a local call.
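The steps above can be made concrete with a small simulation. This is only a sketch of the simplified model used in this article: the Stack type and the MultiplyOnStack and CallMultiply helpers are hypothetical names introduced for illustration, and real compilers often pass arguments and return values in registers rather than on the stack.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the call stack (illustration only).
using Stack = std::vector<int>;

// Multiply written against the explicit stack: it pops its two
// arguments and pushes its result.
void MultiplyOnStack(Stack& stack) {
    assert(stack.size() >= 2);  // the caller must have pushed both arguments
    int r = stack.back();       // arguments come off in reverse push order
    stack.pop_back();
    int l = stack.back();
    stack.pop_back();
    int y = l * r;              // the body of Multiply
    stack.push_back(y);         // push the return value for the caller
}

// The caller's side of the call.
int CallMultiply(int lvalue, int rvalue) {
    Stack stack;
    stack.push_back(lvalue);    // push the arguments
    stack.push_back(rvalue);
    MultiplyOnStack(stack);     // enter the function, compute, push result
    int result = stack.back();  // take the return value off the stack
    stack.pop_back();
    return result;
}
```

Calling CallMultiply(10, 20) walks through the same sequence as the call on line 8 of the listing.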

(3) New problems with remote procedure calls

In a remote call, the body of the function we want to execute lives on the remote machine; that is, Multiply runs in another process. This raises several new problems:

Call ID mapping. How do we tell the remote machine that we want to call Multiply rather than Add or FooBar? In a local call, the function body is identified directly by its function pointer: when we call Multiply, the compiler resolves the call to the function's address for us. In a remote call, function pointers do not work because the address spaces of the two processes are completely different. So, in RPC, every function must have its own ID, and this ID must be unique across all processes. The client must attach this ID when making a remote procedure call. We also need to maintain a {function <-> Call ID} mapping table on the client side and a corresponding table on the server side. The two tables need not be identical, but the Call ID for the same function must be the same on both sides. When the client needs to make a remote call, it looks up its table, finds the corresponding Call ID, and passes it to the server. The server then looks up its own table to determine which function the client wants to call and executes the corresponding code.
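A minimal sketch of the two tables, assuming string function names on the client and integer Call IDs on the wire. The names CallId, Handler, ClientCallIds, ServerHandlers, LookupCallId, and Dispatch are hypothetical, and the handlers are restricted to two int parameters to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>

using CallId = std::uint32_t;
using Handler = std::function<int(int, int)>;

// Client-side table: function name -> Call ID.
inline const std::unordered_map<std::string, CallId>& ClientCallIds() {
    static const std::unordered_map<std::string, CallId> table = {
        {"Multiply", 1}, {"Add", 2}};
    return table;
}

// Server-side table: Call ID -> function to execute.
inline const std::unordered_map<CallId, Handler>& ServerHandlers() {
    static const std::unordered_map<CallId, Handler> table = {
        {1, [](int l, int r) { return l * r; }},
        {2, [](int l, int r) { return l + r; }}};
    return table;
}

// Client: find the Call ID to attach to the request.
CallId LookupCallId(const std::string& name) {
    auto it = ClientCallIds().find(name);
    if (it == ClientCallIds().end())
        throw std::out_of_range("unknown function: " + name);
    return it->second;
}

// Server: find the function for a received Call ID and run it.
int Dispatch(CallId id, int l, int r) {
    auto it = ServerHandlers().find(id);
    if (it == ServerHandlers().end())
        throw std::out_of_range("unknown Call ID");
    return it->second(l, r);
}

// Sanity check: every Call ID the client knows must have a server handler.
void CheckTablesAgree() {
    for (const auto& entry : ClientCallIds()) {
        assert(ServerHandlers().count(entry.second) == 1);
    }
}
```

In a real system only the Call ID returned by LookupCallId would cross the network; here Dispatch is called directly to show that both tables agree on what ID 1 means.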

Serialization and deserialization. How does the client pass parameter values to a remote function? In a local call, we simply push the arguments onto the stack and let the function read them from the stack. In a remote procedure call, however, the client and server are separate processes and cannot pass parameters through memory. Sometimes the client and server are not even written in the same language (for example, C++ on the server and Java or Python on the client). The client therefore first converts the parameters into a byte stream and sends it to the server, and the server converts the byte stream back into a format it can read. This process is called serialization and deserialization. The value returned from the server must be serialized and deserialized in the same way.
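As an illustration, the two int parameters of Multiply could be turned into eight bytes and back as follows. This is a hand-written sketch, not the format of any particular framework: the fixed big-endian layout and the names Bytes, MultiplyArgs, WriteInt32, ReadInt32, SerializeArgs, and DeserializeArgs are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

struct MultiplyArgs {
    std::int32_t l;
    std::int32_t r;
};

// Append a 32-bit integer in big-endian (network) byte order.
void WriteInt32(Bytes& out, std::int32_t value) {
    auto u = static_cast<std::uint32_t>(value);
    out.push_back(static_cast<std::uint8_t>(u >> 24));
    out.push_back(static_cast<std::uint8_t>(u >> 16));
    out.push_back(static_cast<std::uint8_t>(u >> 8));
    out.push_back(static_cast<std::uint8_t>(u));
}

// Read a 32-bit integer at `offset` and advance `offset` past it.
std::int32_t ReadInt32(const Bytes& in, std::size_t& offset) {
    assert(offset + 4 <= in.size());  // caller must supply enough bytes
    std::uint32_t u = (std::uint32_t{in[offset]} << 24) |
                      (std::uint32_t{in[offset + 1]} << 16) |
                      (std::uint32_t{in[offset + 2]} << 8) |
                      std::uint32_t{in[offset + 3]};
    offset += 4;
    return static_cast<std::int32_t>(u);
}

// Client side: parameters -> byte stream.
Bytes SerializeArgs(const MultiplyArgs& args) {
    Bytes out;
    WriteInt32(out, args.l);
    WriteInt32(out, args.r);
    return out;
}

// Server side: byte stream -> parameters.
MultiplyArgs DeserializeArgs(const Bytes& bytes) {
    std::size_t offset = 0;
    MultiplyArgs args{};
    args.l = ReadInt32(bytes, offset);
    args.r = ReadInt32(bytes, offset);
    return args;
}
```

Fixing the byte order explicitly is what lets a client and server on different CPU architectures, or in different languages, agree on what the bytes mean.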

Network transmission. Remote calls usually run over a network that connects the client and the server, so all data has to be transmitted over that network and a network transport layer is needed. The transport layer must carry the Call ID and the serialized parameter bytes to the server, and then carry the serialized result back to the client. Anything that can do both can serve as the transport layer, so the choice of protocol is essentially unrestricted as long as it completes the transfer. Most RPC frameworks use TCP, but UDP works as well, and gRPC uses HTTP/2. Java's Netty also belongs to this layer.
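Because a transport such as TCP delivers a plain byte stream, the receiver needs some way to tell where one message ends. One common approach is a length-prefixed frame. The sketch below assumes a hypothetical wire format of a 4-byte Call ID, a 4-byte payload length, and then the payload; Frame, EncodeFrame, and DecodeFrame are illustrative names, not part of any framework mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical wire format: [Call ID: 4 bytes][length: 4 bytes][payload].
struct Frame {
    std::uint32_t call_id;
    Bytes payload;  // serialized parameters or serialized result
};

void AppendUint32(Bytes& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

std::uint32_t ReadUint32(const Bytes& in, std::size_t offset) {
    assert(offset + 4 <= in.size());
    return (std::uint32_t{in[offset]} << 24) |
           (std::uint32_t{in[offset + 1]} << 16) |
           (std::uint32_t{in[offset + 2]} << 8) |
           std::uint32_t{in[offset + 3]};
}

// Sender: turn a frame into the bytes handed to the transport.
Bytes EncodeFrame(const Frame& frame) {
    Bytes out;
    AppendUint32(out, frame.call_id);
    AppendUint32(out, static_cast<std::uint32_t>(frame.payload.size()));
    out.insert(out.end(), frame.payload.begin(), frame.payload.end());
    return out;
}

// Receiver: try to decode one frame from the front of `buffer`.
// Returns false if the full frame has not arrived yet; on success the
// consumed bytes are removed from `buffer`.
bool DecodeFrame(Bytes& buffer, Frame& frame) {
    constexpr std::size_t kHeaderSize = 8;
    if (buffer.size() < kHeaderSize) return false;
    std::uint32_t length = ReadUint32(buffer, 4);
    if (buffer.size() - kHeaderSize < length) return false;
    frame.call_id = ReadUint32(buffer, 0);
    frame.payload.assign(buffer.begin() + kHeaderSize,
                         buffer.begin() + kHeaderSize + length);
    buffer.erase(buffer.begin(), buffer.begin() + kHeaderSize + length);
    return true;
}
```

The same framing works unchanged whether the bytes travel over TCP, HTTP, or an in-process queue, which is the sense in which the transport protocol is interchangeable.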

Implementing an RPC framework therefore essentially comes down to these three points. Call ID mapping can use function name strings directly or integer IDs, and the mapping table is usually a hash table. Serialization and deserialization can be written by hand, or you can use something like Protobuf or FlatBuffers. For network transport, you can write your own socket code or use asio, ZeroMQ, Netty, and the like.
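Putting the three pieces together, a toy end-to-end call might look like the sketch below. Everything in it is an assumption made for illustration: the Transport type is an in-process function standing in for a real network, the request layout is Call ID followed by two 32-bit integers, and kMultiplyId, HandleRequest, and RemoteMultiply are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <unordered_map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
// Stand-in for the network: takes request bytes, returns response bytes.
using Transport = std::function<Bytes(const Bytes&)>;

constexpr std::uint32_t kMultiplyId = 1;  // Call ID agreed on by both sides

void PutInt32(Bytes& out, std::int32_t value) {
    auto u = static_cast<std::uint32_t>(value);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(u >> shift));
}

std::int32_t GetInt32(const Bytes& in, std::size_t offset) {
    if (offset + 4 > in.size()) throw std::runtime_error("truncated message");
    std::uint32_t u = 0;
    for (std::size_t i = 0; i < 4; ++i) u = (u << 8) | in[offset + i];
    return static_cast<std::int32_t>(u);
}

// Server: map the Call ID to a function, deserialize the arguments,
// run the function, and serialize the result.
Bytes HandleRequest(const Bytes& request) {
    using Fn = std::function<std::int32_t(std::int32_t, std::int32_t)>;
    static const std::unordered_map<std::uint32_t, Fn> handlers = {
        {kMultiplyId, [](std::int32_t l, std::int32_t r) { return l * r; }}};
    auto id = static_cast<std::uint32_t>(GetInt32(request, 0));
    auto it = handlers.find(id);
    if (it == handlers.end()) throw std::runtime_error("unknown Call ID");
    std::int32_t result = it->second(GetInt32(request, 4), GetInt32(request, 8));
    Bytes response;
    PutInt32(response, result);
    return response;
}

// Client stub: looks like a local function, but serializes the call,
// sends it through the transport, and deserializes the reply.
std::int32_t RemoteMultiply(const Transport& transport,
                            std::int32_t l, std::int32_t r) {
    assert(transport);  // a transport must be supplied
    Bytes request;
    PutInt32(request, static_cast<std::int32_t>(kMultiplyId));
    PutInt32(request, l);
    PutInt32(request, r);
    Bytes response = transport(request);
    return GetInt32(response, 0);
}
```

Passing HandleRequest itself as the transport gives a loopback setup in which RemoteMultiply(transport, 10, 20) travels through all three stages without leaving the process; swapping in a socket-based transport would not change the stub or the handler.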

2. Typical RPC invocation framework

There are many RPC implementations and invocation frameworks; a few typical ones are briefly introduced here:

RMI (Remote Method Invocation) is implemented using the java.rmi package and is based on the Java Remote Method Protocol and Java native serialization.

Hessian is a lightweight remoting-over-HTTP tool that provides RMI-like functionality in a simple way. It is based on the HTTP protocol and uses a binary encoding.

protobuf-rpc-pro is a Java library that provides a framework for remote method calls based on Google's Protocol Buffers. It is built on Netty's NIO support and offers TCP connection reuse/keep-alive, SSL encryption, RPC call cancellation, embedded logging, and other features.

Thrift is a scalable software framework for cross-language services. It has a powerful code generation engine that supports C++, C#, Java, Python, PHP, Ruby, and other languages. Thrift lets you write a definition file that describes data types and service interfaces; from this file, the compiler generates the RPC client and server communication code. It was originally developed at Facebook for RPC communication between the different languages used in its systems, was open-sourced in 2007, entered the Apache Incubator in 2008, and is now an Apache open-source project. Because it supports RPC between languages, a PHP client can, for example, construct an object and call the corresponding service method to invoke a service written in Java. The underlying communication is based on sockets.

Avro comes from Doug Cutting, the creator of Hadoop, and was launched when Thrift was already quite popular. Avro's goal was not only to provide communication middleware similar to Thrift, but also to establish a new, standard data exchange and storage protocol for cloud computing. It supports both HTTP and TCP.

3. Introduction to Thrift Framework

One widely used RPC tool is Thrift, the RPC framework open-sourced by Facebook.

Thrift is a cross-language service deployment framework originally developed by Facebook, open-sourced in 2007, and accepted into the Apache Incubator in 2008. Thrift uses an interface definition language (IDL) to define RPC interfaces and data types. A compiler then generates code in different languages (currently including C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml), and the generated code handles the RPC protocol layer and transport layer.

Thrift follows the client/server (C/S) model. A code generation tool produces server-side and client-side code (possibly in different languages) from the interface definition file, which is how it achieves cross-language support between server and client. Users declare their services in the Thrift description file; compiling it generates code files in the chosen languages, and users then implement the services (the client calls them, the server provides them). The protocol layer (which defines the data format on the wire, for example binary or JSON) and the transport layer (which defines how data is moved, for example over TCP/IP, through an in-memory buffer, or through a file) are provided as runtime libraries.

Thrift's protocol stack is shown below:

[Figure: Thrift protocol stack (image not available)]

At the top of both the client and the server is user-defined processing logic; users only need to write this logic to complete the whole RPC call. Below the user logic is the code auto-generated by Thrift, which mainly parses, sends, and receives structured data. The server-side generated code also handles RPC request dispatch (a client's call to function A is forwarded to function A on the server for processing).

The other modules of the stack are Thrift runtime modules:

The underlying I/O module at the bottom is responsible for the actual data transfer, for example over sockets, files, or compressed data streams.

TTransport is responsible for sending and receiving messages as byte streams; it is the Thrift framework's abstraction over the underlying I/O module. Each underlying I/O module has a corresponding TTransport that moves Thrift's byte-stream data over it. For example, TSocket corresponds to socket transmission, and TFileTransport corresponds to file transmission.

TProtocol is mainly responsible for assembling structured data into Messages and for reading structured data out of Messages. It converts a typed value into a byte stream that it hands to TTransport, or reads a fixed number of bytes from TTransport and turns them into a value of a specific type. For example, TBinaryProtocol encodes an int32 as four bytes, and when reading it takes four bytes from TTransport and decodes them into an int32.

TServer is responsible for receiving client requests and forwarding them to the Processor for processing. Its main task is to accept client requests efficiently, especially under highly concurrent load.

Processor (or TProcessor) is responsible for responding to client requests: dispatching the RPC request, parsing the call parameters, invoking the user logic, and writing back the return value. The Processor is the key step where control passes from the Thrift framework to user logic on the server side. It is also responsible for writing data into and reading data from the Message structure.
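The split between a byte-oriented transport and a typed protocol described above can be sketched in a few lines. The classes below are NOT Thrift's API: Transport, MemoryTransport, and BinaryProtocol are hypothetical names that only mimic the roles of TTransport and TProtocol, and the big-endian four-byte encoding is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical byte-stream interface playing the role of TTransport.
class Transport {
public:
    virtual ~Transport() = default;
    virtual void Write(const std::uint8_t* data, std::size_t size) = 0;
    virtual void ReadExact(std::uint8_t* data, std::size_t size) = 0;
};

// In-memory transport standing in for a socket or a file.
class MemoryTransport : public Transport {
public:
    void Write(const std::uint8_t* data, std::size_t size) override {
        assert(data != nullptr || size == 0);
        buffer_.insert(buffer_.end(), data, data + size);
    }
    void ReadExact(std::uint8_t* data, std::size_t size) override {
        if (buffer_.size() - read_pos_ < size)
            throw std::runtime_error("not enough data");
        for (std::size_t i = 0; i < size; ++i) data[i] = buffer_[read_pos_ + i];
        read_pos_ += size;
    }
    std::size_t BytesWritten() const { return buffer_.size(); }

private:
    std::vector<std::uint8_t> buffer_;
    std::size_t read_pos_ = 0;
};

// Typed layer on top of any Transport, playing the role of TProtocol.
class BinaryProtocol {
public:
    explicit BinaryProtocol(Transport& transport) : transport_(transport) {}

    // Encode an int32 as four big-endian bytes and hand them to the transport.
    void WriteI32(std::int32_t value) {
        auto u = static_cast<std::uint32_t>(value);
        const std::uint8_t bytes[4] = {
            static_cast<std::uint8_t>(u >> 24), static_cast<std::uint8_t>(u >> 16),
            static_cast<std::uint8_t>(u >> 8), static_cast<std::uint8_t>(u)};
        transport_.Write(bytes, 4);
    }

    // Pull four bytes from the transport and decode them into an int32.
    std::int32_t ReadI32() {
        std::uint8_t bytes[4];
        transport_.ReadExact(bytes, 4);
        std::uint32_t u = (std::uint32_t{bytes[0]} << 24) |
                          (std::uint32_t{bytes[1]} << 16) |
                          (std::uint32_t{bytes[2]} << 8) |
                          std::uint32_t{bytes[3]};
        return static_cast<std::int32_t>(u);
    }

private:
    Transport& transport_;
};
```

Because BinaryProtocol only sees the Transport interface, a different transport (or a different encoding on top of the same transport) can be swapped in without touching the other layer, which is the design point the module list makes.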

Thrift's modular design is well done: at each layer you can choose the implementation that suits your needs. Note, however, that not every Thrift feature is supported in every programming language. For example, the C++ implementation has TDenseProtocol but not TTupleProtocol, while the Java implementation has TTupleProtocol but not TDenseProtocol.
