This article walks through the GemFire architecture: its network topologies, caching model, data distribution, persistence and overflow options, and transaction support. I hope you find it useful.
1 What is GemFire
GemFire is a high-performance, distributed operational-data management infrastructure that sits between application clusters and back-end data sources. It provides low-latency, high-throughput data sharing and event distribution. GemFire pools the memory and disk resources across the network to form a real-time data fabric (or data grid).
The main features of GemFire are:
Multiple network topologies
Highly concurrent in-memory data structures that avoid lock contention
Optional ACID transactions
Native serialization and smart buffering for fast message distribution
Synchronous or asynchronous disk writes
Redundant in-memory copies
2 Network Topology and Cache Architecture
Given the diversity of problems and the need for architectural flexibility, GemFire offers several options for configuring where and how cached data is managed, letting architects build a suitable caching architecture from peer-to-peer (P2P), client-server (CS), and WAN components.
2.1 P2P Topology
In a P2P distributed system, applications use GemFire's mirroring to keep data synchronously replicated between nodes and its partitioning to shard a large data set across nodes. The two main roles in GemFire's P2P topology are mirrored nodes and partitioned nodes (see the mirror-type configuration in section 3.2 for details).
Because cached data lives with the application in a P2P topology, let's first talk about embedded caching. An embedded cache runs inside the application process and uses the application server's own memory; this is the local cache familiar from tools like Ehcache.
Like a magnet, a mirrored node pulls in all the data from the other nodes' data regions to form a complete data set. When a data region on a mirrored node is created or rebuilt for the first time, GemFire automatically performs an initial image fetch to reconstruct the complete state from the data subsets held on other nodes. If another mirrored node already exists in the network at that point, an optimal directed fetch from that node is performed instead.
So it's easy to see that mirrored nodes serve two main purposes:
For read-heavy applications, the node holds the complete data set, so client requests can be served immediately without any network transfer.
When a failure occurs, a mirrored node can be used to restore other nodes.
Unlike mirrored nodes, each partitioned node holds a unique slice of the data. The application works with the region as if it were local data; behind the scenes GemFire manages each partition and guarantees that data access takes at most one network hop. GemFire's hashing algorithm assigns partitioned data to buckets on each node, and GemFire also places and replicates redundant copies automatically. When a node fails, client requests are transparently redirected to a backup node, and GemFire re-creates a copy of the affected data so the configured number of redundant copies is preserved. Finally, new nodes can be added to the network at any time to dynamically scale the GemFire cluster.
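As a rough illustration of the partitioned case, here is a minimal sketch using the Apache Geode Java API (the open-sourced GemFire code base; later GemFire releases expose essentially the same calls under the com.gemstone.gemfire packages). The locator address, region name, and redundancy setting are illustrative assumptions, not taken from the article.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.PartitionAttributesFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class PartitionedRegionExample {
    public static void main(String[] args) {
        // Start an embedded peer member of the distributed system.
        Cache cache = new CacheFactory()
                .set("locators", "localhost[10334]")   // hypothetical locator address
                .create();

        // Partitioned region: each peer hosts a unique set of buckets,
        // plus one redundant copy of other peers' buckets for failover.
        Region<String, String> orders = cache.<String, String>createRegionFactory(RegionShortcut.PARTITION)
                .setPartitionAttributes(new PartitionAttributesFactory<String, String>()
                        .setRedundantCopies(1)
                        .create())
                .create("orders");

        // The application works with the region like a local map;
        // GemFire routes each operation to the hosting member (at most one hop).
        orders.put("order-42", "widget x 3");
        System.out.println(orders.get("order-42"));

        cache.close();
    }
}
```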
A P2P system provides low-latency, one-hop data access, dynamic member discovery, and transparent data location. However, every node maintains a socket connection to every other node, so the number of connections grows quadratically as nodes are added. To improve scalability, GemFire also offers a reliable UDP multicast communication mode. The next section shows how P2P data synchronization is used to replicate data between servers.
2.2 Client-Server Topology
The client-server cache lets a large number of client nodes connect to a set of servers. The servers provide caching for the clients and can also replicate or cache data for other servers.
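A minimal client-side sketch, again assuming the Apache Geode API; the locator host, port, and region name are illustrative. A PROXY client region keeps no local copy, while CACHING_PROXY would additionally cache fetched entries on the client.

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ClientSideExample {
    public static void main(String[] args) {
        // Connect to the cluster through a locator; the pool discovers the cache servers.
        ClientCache clientCache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)   // hypothetical locator host/port
                .create();

        // PROXY: no local storage, every operation is forwarded to the servers.
        Region<String, String> orders = clientCache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("orders");

        orders.put("order-42", "widget x 3");
        System.out.println(orders.get("order-42"));

        clientCache.close();
    }
}
```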
2.3 WAN Topology
A P2P cluster has scalability limits because its peers are tightly coupled, and the problem is magnified when a data center runs multiple clusters or when clusters span cities. GemFire's WAN topology addresses this by loosely coupling separate clusters and keeping them in sync across sites.
3 GemFire Working Principles
3.1 Discovery Mechanism
By default, GemFire uses IP multicast to discover new members, while all member-to-member communication runs over TCP. When the environment forbids IP multicast, or the network spans multiple subnets, GemFire offers an alternative: a lightweight locator server that tracks the connections of all members. A new member asks the locator for the current membership and then establishes the same direct socket-to-socket TCP connections it would have formed with IP multicast discovery.
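A minimal sketch of the two discovery options, assuming the Apache Geode API and its standard "locators" and "mcast-port" system properties; the locator host and port are illustrative.

```java
import java.util.Properties;

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;

public class DiscoveryConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();

        // Locator-based discovery: disable multicast and point new members
        // at one or more locator processes (hypothetical host/port shown).
        props.setProperty("mcast-port", "0");
        props.setProperty("locators", "locator-host[10334]");

        // Multicast-based discovery would instead set a non-zero "mcast-port"
        // (and optionally "mcast-address") and leave "locators" empty.

        Cache cache = new CacheFactory(props).create();
        System.out.println("Joined as " + cache.getDistributedSystem().getDistributedMember());
        cache.close();
    }
}
```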
3.2 Data Distribution
Each member creates one or more cached data regions. By splitting data into regions, we can configure different distribution properties, memory management, and data-consistency models for each region. By default GemFire uses a P2P distribution model in which every member can communicate with every other member, and depending on the characteristics of the network the transport layer can be TCP/IP or reliable UDP multicast. Two region attributes matter most in this configuration: scope and mirror-type.
First, there are four options for scope:
· local: not distributed. Why not just use a HashMap then? Because GemFire still adds features such as automatic persistence of the data to disk, OQL (Object Query Language) queries, and transactional data operations.
· distributed-no-ack: sends data to member A and moves on to member B without waiting for an acknowledgement. Suitable when data-consistency requirements are modest and network latency is low. This is GemFire's default: it gives low latency and high throughput, and distributing updates as quickly as possible narrows the window for data conflicts.
· distributed-ack: sends data to member A and waits for its acknowledgement before sending to member B, so each piece of data is distributed synchronously.
· global: acquires a lock on the other members before distributing the data. Suitable for pessimistic scenarios; a global lock service manages lock acquisition, release, and timeouts.
Now let's look at the second important configuration property, the mirror type (mirror-type):
· none: only updates entries that are already in this cache and ignores any new entries sent by other members. Suitable for a data region that only needs to hold a subset of another region's data.
· keys: the data region stores only the keys to save memory; a value is fetched from another region and cached locally the first time it is actually requested, after which updates to that entry are accepted. Suitable when you cannot predict which data a node will access.
· keys-values: a true mirror that holds all of the data. Suitable for nodes that need immediate access to the full data set, and for redundant backups.
Together, these two properties determine what data each region stores and how updates are propagated; a minimal configuration sketch follows.
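The sketch below assumes the Apache Geode API, where scope is still set directly and the old mirror-type attribute maps onto a data policy (keys-values roughly corresponds to REPLICATE, none to NORMAL); region names are illustrative.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.DataPolicy;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.Scope;

public class RegionAttributesExample {
    public static void main(String[] args) {
        Cache cache = new CacheFactory().create();

        // Full mirror (mirror-type keys-values): replicated data policy,
        // with distributed-ack so every distribution waits for acknowledgement.
        Region<String, String> mirrored = cache.<String, String>createRegionFactory()
                .setDataPolicy(DataPolicy.REPLICATE)
                .setScope(Scope.DISTRIBUTED_ACK)
                .create("mirroredRegion");

        // Subset region (mirror-type none): normal, non-replicated data policy,
        // with distributed-no-ack so updates are sent without waiting.
        Region<String, String> subset = cache.<String, String>createRegionFactory()
                .setDataPolicy(DataPolicy.NORMAL)
                .setScope(Scope.DISTRIBUTED_NO_ACK)
                .create("subsetRegion");

        mirrored.put("k", "v");
        subset.put("k", "v");
        cache.close();
    }
}
```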
4 Persistence and Overflow
Persistence writes the entire data set to disk so the data can be restored when a member fails. Overflow keeps the keys in memory and moves the values to disk to save memory. The two can be used separately or combined.
4.1 Persistence
GemFire supports two ways of writing to disk: synchronously, as the in-memory data is modified, or asynchronously at regular intervals. The asynchronous option should be used only if the application can tolerate an incomplete restore after a failure.
4.2 Overflow
When memory runs low, GemFire uses an LRU policy to decide which data items to overflow to disk.
4.3 Mixed Use
Persistence and overflow can be combined: all key-value pairs are persisted to disk, and when memory runs low only the most recently used values are kept in memory. A value evicted by the LRU policy needs no extra disk write, because all of the data has already been persisted. A configuration sketch combining the two is shown below.
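A minimal sketch of a persistent region with LRU overflow, assuming the Apache Geode API; the disk directory, disk-store name, region name, and entry limit are illustrative.

```java
import java.io.File;

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.DataPolicy;
import org.apache.geode.cache.EvictionAction;
import org.apache.geode.cache.EvictionAttributes;
import org.apache.geode.cache.Region;

public class PersistOverflowExample {
    public static void main(String[] args) {
        Cache cache = new CacheFactory().create();

        // Named disk store backing both the persistence and overflow files
        // ("gemfire-data" is a hypothetical directory).
        File dir = new File("gemfire-data");
        dir.mkdirs();
        cache.createDiskStoreFactory()
                .setDiskDirs(new File[] { dir })
                .create("exampleStore");

        // Persistent region that also overflows the least recently used values
        // to disk once more than 10,000 entries are held in memory.
        Region<String, String> events = cache.<String, String>createRegionFactory()
                .setDataPolicy(DataPolicy.PERSISTENT_REPLICATE)
                .setDiskStoreName("exampleStore")
                .setDiskSynchronous(false)   // asynchronous writes: lower latency, weaker guarantees
                .setEvictionAttributes(EvictionAttributes.createLRUEntryAttributes(
                        10_000, EvictionAction.OVERFLOW_TO_DISK))
                .create("events");

        events.put("e1", "payload");
        cache.close();
    }
}
```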
5 Transactions
GemFire supports both cache transactions and JTA transactions.
5.1 Cache Transactions
Each transaction has its own private working area. When the transaction begins, the data it touches is copied into this private area; if there is no conflict at commit time, the data is copied back from the private area to the original region. This allows transactions to modify the cache concurrently.
For data regions whose scope is configured as local, the transaction is complete once it commits. For scope distributed-no-ack or distributed-ack, cache synchronization with the other members also happens when the transaction commits. A minimal transaction sketch follows.
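A minimal cache-transaction sketch, assuming the Apache Geode API; the region name, keys, and values are illustrative.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.CacheTransactionManager;
import org.apache.geode.cache.CommitConflictException;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class CacheTransactionExample {
    public static void main(String[] args) {
        Cache cache = new CacheFactory().create();
        Region<String, Integer> accounts = cache.<String, Integer>createRegionFactory(RegionShortcut.REPLICATE)
                .create("accounts");
        accounts.put("alice", 100);

        CacheTransactionManager txMgr = cache.getCacheTransactionManager();
        try {
            // Changes go into the transaction's private working area ...
            txMgr.begin();
            accounts.put("alice", accounts.get("alice") - 30);
            // ... and are copied back to the region only if commit finds no conflict.
            txMgr.commit();
        } catch (CommitConflictException e) {
            // Another transaction changed the same entries first; retry or give up.
            System.err.println("Commit conflict: " + e.getMessage());
        }

        cache.close();
    }
}
```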
That covers the GemFire architecture: its topologies, data distribution model, persistence and overflow options, and transaction support. I hope it helps you analyze GemFire for your own systems.