What's the use of Apache Ignite?

2025-04-20 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces what Apache Ignite is useful for, focusing on batch business processing with the Ignite compute grid.

1. Traditional scheme

Batch business processing is a routine requirement across many industries. It is typically offline, long-running, and compute-intensive. The traditional solution is either multithreading or in-database computation, for example by calling stored procedures. The emergence of distributed computing technologies led by Hadoop changed the picture considerably: the MapReduce paradigm offered a new approach to large-scale offline data processing, greatly improved performance, and provided a good path to linear scaling.

2. The problems faced

Multithreading and stored procedures share a common defect: poor scalability. Performance depends on the hardware of a single machine and is hard to improve substantially, and neither approach achieves distributed computing. Big data solutions led by Hadoop have developed rapidly and their performance keeps improving, but they are designed mainly for analyzing large-scale unstructured Internet data. They can also be used for traditional batch business processing, but with two drawbacks. On the one hand, batch business processing does not need so many features. On the other hand, these newer computing platforms are heterogeneous systems that must be deployed separately from the business application. Making them highly available complicates the overall architecture and raises operation and maintenance costs. If the added servers do not have much computing to do, resource utilization also drops. The return on investment of adopting such technology therefore needs to be considered.

3. Ignite compute grid

The Ignite compute grid implements distributed closures and ExecutorService, and it also provides a lightweight MapReduce (or ForkJoin) implementation. This article focuses on the lightweight MapReduce; for the other features, refer to the relevant documentation.

3.1. MapReduce and ForkJoin

The ComputeTask interface is Ignite's abstraction for simplified in-memory MapReduce, and it is also very close to the ForkJoin paradigm. The interface allows fine-grained control over job-to-node mapping and custom failover strategies. If these are not needed, the simpler distributed closures can be used instead, and the code will be more concise.

3.1.1. ComputeTask

ComputeTask defines the jobs to be executed within the cluster and the mapping of those jobs to nodes, and it also defines how to process the jobs' return values (Reduce). All IgniteCompute.execute(...) methods execute the given task on the cluster. The application only needs to implement the map(...) and reduce(...) methods of the ComputeTask interface, where:

The map(...) method instantiates the jobs and maps them to worker nodes. This step can be simplified further with ComputeTaskSplitAdapter.

The result(...) method is called each time a job completes on a cluster node. It receives the result returned by that job together with the list of all job results received so far, and it returns a ComputeJobResultPolicy instance indicating what to do next.

The reduce(...) method is called in the Reduce phase, once all jobs have completed. It receives the list of all job results and returns the final result of the computation.
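The interplay of the three methods can be sketched without Ignite at all. The block below is a plain-JDK model, not Ignite's API: ResultPolicy, MiniTask, SumOfSquaresTask and TaskLifecycleSketch are hypothetical names invented for illustration, nodes are plain strings, and every job runs in the calling thread. The FAILOVER value is listed only to mirror the three outcomes; this small runner does not act on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Plain-JDK model of the map -> result -> reduce life cycle (hypothetical names, not Ignite classes).
enum ResultPolicy { WAIT, REDUCE, FAILOVER }

interface MiniTask<T, R> {
    // Create the jobs and assign each one to a node (a node is just a name here).
    Map<Supplier<Integer>, String> map(List<String> nodes, T arg);

    // Called after each job finishes; decides what happens next.
    ResultPolicy result(Integer latest, List<Integer> receivedSoFar);

    // Called once with the collected job results.
    R reduce(List<Integer> results);
}

class SumOfSquaresTask implements MiniTask<List<Integer>, Integer> {
    @Override
    public Map<Supplier<Integer>, String> map(List<String> nodes, List<Integer> numbers) {
        Map<Supplier<Integer>, String> jobs = new LinkedHashMap<>();
        for (int i = 0; i < numbers.size(); i++) {
            final int n = numbers.get(i);
            Supplier<Integer> job = new Supplier<Integer>() {
                @Override public Integer get() { return n * n; }
            };
            jobs.put(job, nodes.get(i % nodes.size())); // spread jobs over the nodes in turn
        }
        return jobs;
    }

    @Override
    public ResultPolicy result(Integer latest, List<Integer> receivedSoFar) {
        return ResultPolicy.WAIT; // keep waiting until every job has reported
    }

    @Override
    public Integer reduce(List<Integer> results) {
        int sum = 0;
        for (int r : results) sum += r;
        return sum;
    }
}

class TaskLifecycleSketch {
    // Runs each job locally; a real grid would ship the job to its assigned node.
    static <T, R> R execute(MiniTask<T, R> task, List<String> nodes, T arg) {
        List<Integer> received = new ArrayList<>();
        for (Supplier<Integer> job : task.map(nodes, arg).keySet()) {
            Integer value = job.get();
            received.add(value);
            if (task.result(value, received) == ResultPolicy.REDUCE) break; // reduce early
        }
        return task.reduce(received);
    }

    public static void main(String[] args) {
        int total = execute(new SumOfSquaresTask(),
                Arrays.asList("node-1", "node-2"), Arrays.asList(1, 2, 3, 4));
        System.out.println(total); // prints 30
    }
}
```

Returning REDUCE from result(...) in this model stops collecting and goes straight to reduce(...), which is one way a task could finish early once it has seen enough results.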

3.1.2. Simpler adapter ComputeTaskSplitAdapter

A computation does not have to implement all three ComputeTask methods every time; the adapters provided by Ignite simplify development further. This article focuses on ComputeTaskSplitAdapter, which adds the ability to assign jobs to nodes automatically. It hides the map(...) method and adds a new split(...) method, so the developer only needs to supply the collection of jobs to execute, which suits batch business processing well. The adapter is especially useful in homogeneous environments, where every node is equally suitable for executing jobs and the mapping phase can therefore be done implicitly.
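How an adapter can hide map(...) behind split(...) is easy to sketch in plain Java. The classes below (SplitAdapterSketch, WordLengthSplit, SplitAdapterDemo) are hypothetical and use only the JDK. The assignment rule, a simple rotation over node names, is an assumption made for illustration and is not a statement about how Ignite's adapter picks nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Plain-JDK sketch: an adapter that derives map() from split() (hypothetical names).
abstract class SplitAdapterSketch<T> {
    // Subclasses only say which jobs exist ...
    protected abstract List<Callable<Integer>> split(int nodeCount, T arg);

    // ... and the adapter assigns them to nodes, here in simple rotation.
    public final Map<Callable<Integer>, String> map(List<String> nodes, T arg) {
        Map<Callable<Integer>, String> assignment = new LinkedHashMap<>();
        int i = 0;
        for (Callable<Integer> job : split(nodes.size(), arg)) {
            assignment.put(job, nodes.get(i++ % nodes.size()));
        }
        return assignment;
    }
}

class WordLengthSplit extends SplitAdapterSketch<String> {
    @Override
    protected List<Callable<Integer>> split(int nodeCount, String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (final String word : phrase.split(" ")) {
            jobs.add(new Callable<Integer>() {
                @Override public Integer call() { return word.length(); }
            });
        }
        return jobs;
    }
}

class SplitAdapterDemo {
    // Returns the node chosen for each job, in job order.
    static List<String> assignedNodes(String phrase, List<String> nodes) {
        return new ArrayList<>(new WordLengthSplit().map(nodes, phrase).values());
    }

    public static void main(String[] args) {
        // prints [node-1, node-2, node-1, node-2]
        System.out.println(assignedNodes("Hello Grid Enabled World!", Arrays.asList("node-1", "node-2")));
    }
}
```

The subclass never mentions nodes, which is the point of the split-style adapter: in a homogeneous cluster the job list is the only thing the developer has to think about.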

3.1.3. ComputeJob

Every job spawned by a task implements the ComputeJob interface. Its execute() method defines the job logic and returns the job's result.

3.1.4. Simple example

The following code is a simple example that counts the total number of characters, excluding spaces, in a phrase:

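    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCompute;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.compute.ComputeJob;
    import org.apache.ignite.compute.ComputeJobAdapter;
    import org.apache.ignite.compute.ComputeJobResult;
    import org.apache.ignite.compute.ComputeTaskSplitAdapter;

    public class CharacterCountExample {
        public static void main(String[] args) {
            // Starting a node with the default configuration is an assumption;
            // the original snippet starts from an existing Ignite instance.
            try (Ignite ignite = Ignition.start()) {
                IgniteCompute compute = ignite.compute();
                // Execute the task on the cluster and wait for the result.
                int cnt = compute.execute(CharacterCountTask.class, "Hello Grid Enabled World!");
                System.out.println("Total characters: " + cnt);
            }
        }

        private static class CharacterCountTask extends ComputeTaskSplitAdapter<String, Integer> {
            // 1. Split the received string into words.
            // 2. Create a job for each word.
            // 3. Send each job to a worker node for processing.
            @Override
            protected Collection<? extends ComputeJob> split(int gridSize, String arg) {
                String[] words = arg.split(" ");
                List<ComputeJob> jobs = new ArrayList<>(words.length);
                for (final String word : words) {
                    jobs.add(new ComputeJobAdapter() {
                        @Override
                        public Object execute() {
                            // The job result is the number of characters in the word.
                            return word.length();
                        }
                    });
                }
                return jobs;
            }

            @Override
            public Integer reduce(List<ComputeJobResult> results) {
                int sum = 0;
                for (ComputeJobResult res : results)
                    sum += res.<Integer>getData();
                return sum;
            }
        }
    }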

Isn't it very simple?

3.2. Fault tolerance

Ignite supports automatic job failover. When a node fails, its jobs are transferred to other available nodes and executed again. Failover is implemented by FailoverSpi, which is responsible for selecting a new node for a failed job. It examines the failed job together with the list of all available grid nodes on which the job could be retried, and it ensures that the job is not mapped again to the node on which it failed. Failover is triggered when the ComputeTask.result(...) method returns the ComputeJobResultPolicy.FAILOVER policy. Ignite ships with built-in failover SPI implementations, and developers can also provide their own. In addition, Ignite guarantees that jobs are not lost as long as at least one node is alive.
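The selection rule described above (retry elsewhere, never on a node that already failed) can be illustrated with a small plain-JDK sketch. FailoverSketch and executeWithFailover are hypothetical names, the runOnNode function merely stands in for remote execution, and node names are assumed to be distinct. This is a model of the idea, not Ignite's FailoverSpi.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Plain-JDK sketch of job failover (hypothetical names, not Ignite APIs).
class FailoverSketch {
    // Runs the job on the first node that has not failed yet and moves on when a node throws.
    // `runOnNode` stands in for remote execution; `attempts` records the nodes tried, in order.
    static <R> R executeWithFailover(List<String> nodes, Function<String, R> runOnNode, List<String> attempts) {
        Set<String> failedNodes = new HashSet<>();
        while (true) {
            String target = null;
            for (String node : nodes) {
                if (!failedNodes.contains(node)) {
                    target = node;
                    break;
                }
            }
            if (target == null) throw new IllegalStateException("job failed on every node");
            attempts.add(target);
            try {
                return runOnNode.apply(target);
            } catch (RuntimeException e) {
                failedNodes.add(target); // never map the job to this node again
            }
        }
    }

    public static void main(String[] args) {
        List<String> attempts = new ArrayList<>();
        String outcome = executeWithFailover(
                Arrays.asList("node-1", "node-2", "node-3"),
                node -> {
                    if (!node.equals("node-3")) throw new IllegalStateException(node + " is down");
                    return "done on " + node;
                },
                attempts);
        // prints: done on node-3 after trying [node-1, node-2, node-3]
        System.out.println(outcome + " after trying " + attempts);
    }
}
```

As in the text, the job survives as long as one node can still run it; only when every node has failed does the sketch give up.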

3.3. Load balancing

Load balancing in Ignite is implemented by LoadBalancingSpi. It controls the load on all nodes and ensures that the load is balanced across the cluster. For homogeneous tasks in a homogeneous environment, load balancing uses a random or round-robin strategy. In many other scenarios, however, especially under uneven load, a more sophisticated adaptive strategy is needed. Ignite has several built-in implementations, such as round-robin load balancing (RoundRobinLoadBalancingSpi) and random or weighted load balancing (WeightedRandomLoadBalancingSpi), and custom implementations can be written to meet specific needs.
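To make the two named strategies concrete, here is a plain-JDK sketch of a round-robin picker and a weighted random picker. RoundRobinPicker and WeightedRandomPicker are hypothetical classes written for this article; they show the general technique only and are not the Ignite SPI implementations. Weights are assumed to be non-negative integers with a positive total.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-JDK sketch of two balancing strategies (hypothetical names, not Ignite's SPI classes).
class RoundRobinPicker {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPicker(List<String> nodes) {
        this.nodes = nodes;
    }

    // Hands out nodes in a fixed rotation, so equal jobs spread evenly.
    String pick() {
        return nodes.get(Math.floorMod(next.getAndIncrement(), nodes.size()));
    }
}

class WeightedRandomPicker {
    private final Map<String, Integer> weights; // node name -> weight
    private final Random random;
    private final int totalWeight;

    WeightedRandomPicker(Map<String, Integer> weights, Random random) {
        this.weights = weights;
        this.random = random;
        int total = 0;
        for (int w : weights.values()) total += w;
        this.totalWeight = total;
    }

    // A node with weight 3 is picked about three times as often as a node with weight 1.
    String pick() {
        int ticket = random.nextInt(totalWeight);
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            ticket -= entry.getValue();
            if (ticket < 0) return entry.getKey();
        }
        throw new IllegalStateException("weights must add up to a positive total");
    }
}

class LoadBalancingDemo {
    public static void main(String[] args) {
        RoundRobinPicker roundRobin = new RoundRobinPicker(Arrays.asList("node-1", "node-2", "node-3"));
        for (int i = 0; i < 4; i++) System.out.println(roundRobin.pick());

        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("big-node", 3);
        weights.put("small-node", 1);
        WeightedRandomPicker weighted = new WeightedRandomPicker(weights, new Random());
        System.out.println(weighted.pick());
    }
}
```

Round-robin suits the homogeneous case described above, while weights are one simple way to send more jobs to stronger machines when the cluster is uneven.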

3.4. Job scheduling

In Ignite, jobs are mapped to cluster nodes on the client side, either when a task is split or when a closure is executed. Once jobs arrive at their assigned nodes, however, they still need to be ordered for execution. By default, jobs are submitted to a thread pool and executed in random order. For fine-grained control over execution order, such as FIFO or priority-based ordering, CollisionSpi must be enabled.
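The difference between FIFO and priority ordering of the jobs waiting on one node can be shown with a few lines of plain Java. WaitingJob and JobOrderingSketch are hypothetical names, and the convention that a larger number means a more urgent job is an assumption of this sketch, not something taken from CollisionSpi.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-JDK sketch of ordering the jobs that wait on one node (hypothetical names, not Ignite APIs).
class WaitingJob {
    final String name;
    final int priority; // assumption: larger value = more urgent

    WaitingJob(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }
}

class JobOrderingSketch {
    // FIFO: run jobs in the order they arrived.
    static List<String> fifoOrder(List<WaitingJob> arrived) {
        return names(arrived);
    }

    // Priority: most urgent first; the stable sort keeps arrival order among equal priorities.
    static List<String> priorityOrder(List<WaitingJob> arrived) {
        List<WaitingJob> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparingInt((WaitingJob job) -> job.priority).reversed());
        return names(sorted);
    }

    private static List<String> names(List<WaitingJob> jobs) {
        List<String> result = new ArrayList<>();
        for (WaitingJob job : jobs) result.add(job.name);
        return result;
    }

    public static void main(String[] args) {
        List<WaitingJob> arrived = Arrays.asList(
                new WaitingJob("report", 1), new WaitingJob("billing", 5),
                new WaitingJob("cleanup", 1), new WaitingJob("settlement", 5));
        System.out.println(fifoOrder(arrived));     // [report, billing, cleanup, settlement]
        System.out.println(priorityOrder(arrived)); // [billing, settlement, report, cleanup]
    }
}
```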

3.5. Transactions

In enterprise batch business processing, the database is usually updated frequently. In a distributed computing environment, wrapping the whole task in a single transaction is clearly inappropriate. The best practice is to make each job its own transaction: if a job fails, only that job is rolled back, the other successful jobs commit normally, and the failover mechanism then re-executes the failed job until it commits successfully.
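The per-job commit and rollback pattern can be sketched in plain Java. In the block below an in-memory map stands in for a database table, and a staged copy stands in for a real transaction; PerJobTransactionSketch and its methods are hypothetical, single-threaded on purpose, and say nothing about how a real database or Ignite manages transactions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Plain-JDK sketch of "one transaction per job" (hypothetical names; the map stands in for a table).
class PerJobTransactionSketch {
    // Runs one job in its own transaction: changes are staged on a copy and
    // copied back only if the job finishes without throwing.
    static boolean runInTransaction(Map<String, Integer> table, Consumer<Map<String, Integer>> job) {
        Map<String, Integer> staged = new HashMap<>(table);
        try {
            job.accept(staged);
        } catch (RuntimeException e) {
            return false; // rollback: the staged copy is discarded, the table is untouched
        }
        table.clear();
        table.putAll(staged); // commit
        return true;
    }

    // Re-executes a failed job until it commits; returns the number of attempts used.
    static int runUntilCommitted(Map<String, Integer> table, Consumer<Map<String, Integer>> job, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (runInTransaction(table, job)) return attempt;
        }
        throw new IllegalStateException("job did not commit after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        Map<String, Integer> balances = new HashMap<>();
        balances.put("acct-1", 100);
        balances.put("acct-2", 100);

        final int[] calls = {0};
        Consumer<Map<String, Integer>> flakyJob = staged -> {
            staged.put("acct-2", 0); // partial write that must not survive a failure
            if (calls[0]++ == 0) throw new IllegalStateException("simulated failure");
            staged.put("acct-2", 105);
        };

        // The first job commits at once; the flaky job is rolled back once and then retried.
        runUntilCommitted(balances, staged -> staged.merge("acct-1", 10, Integer::sum), 3);
        int attempts = runUntilCommitted(balances, flakyJob, 3);
        System.out.println("acct-1=" + balances.get("acct-1")
                + " acct-2=" + balances.get("acct-2") + " attempts=" + attempts);
    }
}
```

The failed attempt leaves no trace in the table, and the already committed job is not affected by the retry, which is the property the paragraph above is after.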

3.6. Other features

Ignite's in-memory MapReduce implementation also supports sessions, a mechanism for sharing data between tasks and jobs. It supports node-local state sharing as well: a node-local variable that tasks can use to share state across different executions. Performance can be improved greatly by collocating computation with cached data. Checkpoints are also supported: a long-running job can save intermediate state, so that after a failed node is restarted the job can load the saved checkpoint and continue from the point of failure. There are further features, which are not covered one by one here.
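The checkpoint idea, saving intermediate state so a restarted job continues instead of starting over, can be sketched in plain Java. CheckpointSketch is a hypothetical class, the map stands in for a checkpoint store, and the crashAt parameter exists only to simulate a node failure; none of this is Ignite's checkpoint API.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-JDK sketch of checkpointing a long-running job (hypothetical names, not Ignite APIs).
class CheckpointSketch {
    // Sums the values, saving progress after every item. `crashAt` simulates a node failure
    // just before that index is processed (-1 means no failure).
    static long sumWithCheckpoints(int[] values, Map<String, long[]> store, String key, int crashAt) {
        long[] saved = store.get(key);
        int start = saved == null ? 0 : (int) saved[0]; // next index to process
        long sum = saved == null ? 0L : saved[1];        // running total so far
        for (int i = start; i < values.length; i++) {
            if (i == crashAt) throw new IllegalStateException("simulated failure at item " + i);
            sum += values[i];
            store.put(key, new long[] {i + 1, sum});
        }
        store.remove(key); // finished: the checkpoint is no longer needed
        return sum;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5};
        Map<String, long[]> store = new HashMap<>();
        try {
            sumWithCheckpoints(values, store, "job-1", 3);
        } catch (IllegalStateException e) {
            System.out.println("first run stopped: " + e.getMessage());
        }
        // The second run resumes at index 3 instead of starting over.
        System.out.println(sumWithCheckpoints(values, store, "job-1", -1)); // prints 15
    }
}
```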

4. Advantages of Ignite

In my previous article on Ignite cluster deployment, I briefly introduced Ignite's clustering features and recommended a hybrid cluster deployment scheme, as shown in the following figure:

[Figure: hybrid cluster deployment scheme]

In this architecture, implementing batch business processing as distributed computing inside the application cluster would be a very elegant solution. Fortunately, Ignite makes this possible. Overall, the solution has the following advantages:

Easy to develop: MapReduce is implemented with a few simple pieces of code, so the barrier to entry is very low. After a short period of study, you can focus on the complex business logic.

Easy to debug: Ignite can start a single-node cluster on one machine, and you can step through the code directly in an IDE without building any complex environment for development and debugging.

Simple deployment: once a few Ignite jar files are embedded in the application, Ignite's discovery mechanism can build the cluster automatically and enable distributed computing. Other distributed computing platforms generally require separately deployed compute servers, which complicates the deployment architecture and raises operation and maintenance costs.

High resource utilization: batch business processing usually runs at night, when the system is idle. Running the distributed computation on the application servers makes full use of their computing resources. A solution that requires separate servers leaves that equipment idle most of the time, which increases overall equipment cost and significantly lowers resource utilization.
