What are the core features of Apache Ignite

2025-04-05 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains the core features of Apache Ignite, one feature area at a time.

1. Data grid

The Ignite in-memory data grid is an in-memory key-value store that caches data in the memory of a distributed cluster. It reduces redundant data movement through strong data-locality semantics and affinity-based data routing, so a cluster can scale linearly up to hundreds of nodes. According to the project's own ongoing benchmarks, the Ignite data grid is one of the fastest implementations of transactional or atomic data in a distributed cluster.

Feature list

1.1. Key-value store

The Ignite data grid is an in-memory key-value store that can be viewed as a distributed, partitioned hash map. Each node in the cluster holds a portion of the data, so the more nodes the cluster has, the more data it can cache. Unlike other key-value stores, Ignite determines data location with a pluggable hashing algorithm: every client can work out which node a key belongs to by feeding the key into a hash function, without any special mapping service or name node.
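The idea that any client can compute a key's owner locally can be sketched with rendezvous (highest-random-weight) hashing. This is a minimal illustration in plain Python, not Ignite's API: the node names and the `weight` and `owner` helpers are hypothetical, and a real affinity function maps keys to partitions first and partitions to nodes second.

```python
import hashlib


def weight(node: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owner(key: str, nodes: list[str]) -> str:
    """Any caller that knows the node list computes the same owner."""
    return max(nodes, key=lambda node: weight(node, key))


# Hypothetical three-node cluster and a batch of keys.
nodes = ["node-a", "node-b", "node-c"]
keys = [f"user:{i}" for i in range(1000)]
before = {k: owner(k, nodes) for k in keys}

# Remove one node: only the keys that node owned are remapped.
survivors = [n for n in nodes if n != "node-c"]
after = {k: owner(k, survivors) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

No lookup table is shared between callers; agreement comes from everyone hashing the same inputs. The same property keeps data movement small when the topology changes.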

1.2. JCache (JSR 107)

Ignite is 100% compliant with the JCache (JSR 107) specification. JCache provides a simple but powerful API for data caching, which includes:

Basic cache operations

ConcurrentMap APIs

Parallel processing (EntryProcessor)

Events and metrics

Pluggable persistence

1.3. Partitioning and replication

Depending on configuration, Ignite can either partition or replicate the data in memory. In REPLICATED mode, the data is fully replicated to every node in the cluster. In PARTITIONED mode, the data is split evenly across the nodes of the cluster, which allows terabytes of data to be cached in memory. Ignite can also be configured to keep multiple backup copies so that data survives a failure. Whichever cache mode is used, Ignite keeps data consistent across all cluster nodes in any failure mode.
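The capacity trade-off between the two modes can be worked through with simple arithmetic. The sketch below is a back-of-the-envelope model under stated assumptions, not an Ignite sizing tool: the `usable_capacity_gb` helper is hypothetical, and it ignores index, metadata, and JVM overhead.

```python
def usable_capacity_gb(nodes: int, ram_per_node_gb: float,
                       mode: str, backups: int = 0) -> float:
    """Amount of distinct data the cluster can hold in memory."""
    if mode == "REPLICATED":
        # Every node holds a full copy, so capacity equals one node's RAM.
        return ram_per_node_gb
    if mode == "PARTITIONED":
        # Each entry is stored once as primary plus `backups` extra copies.
        return nodes * ram_per_node_gb / (1 + backups)
    raise ValueError(f"unknown cache mode: {mode}")


print(usable_capacity_gb(10, 64, "REPLICATED"))              # 64
print(usable_capacity_gb(10, 64, "PARTITIONED"))             # 640.0
print(usable_capacity_gb(10, 64, "PARTITIONED", backups=1))  # 320.0
```

Under this model, adding nodes grows capacity only in PARTITIONED mode, and each backup copy trades capacity for resilience.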

1.4. Self-healing cluster

An Ignite cluster heals itself: clients reconnect automatically after a failure, slow clients are kicked out automatically, and the data of a failed node is automatically propagated to other nodes in the grid.

1.5. Client near cache

For data accessed from remote clients, Ignite also supports a near cache on the client side. In transactional mode the near cache remains transactional: its entries are automatically updated or invalidated in a consistent way when a transaction commits.

1.6. ACID transactions

Ignite supports two modes of cache operation, transactional and atomic. In transactional mode, multiple cache operations can be grouped into a single transaction, while atomic mode supports multiple atomic operations executed one at a time. Atomic mode is more lightweight and usually performs better than transactional mode. In transactional mode, Ignite supports both optimistic and pessimistic transactions and uses a two-phase commit protocol, optimized to one-phase commit whenever possible.
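The two-phase commit idea can be sketched as a toy coordinator in plain Python. This is an illustration of the general protocol only, not Ignite's implementation: the `Participant` class and `two_phase_commit` function are hypothetical, and real protocols also deal with locks, timeouts, and coordinator failure.

```python
class Participant:
    """A toy cache node that stages writes until told to commit."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.data: dict[str, int] = {}
        self.staged: dict[str, int] = {}

    def prepare(self, writes: dict[str, int]) -> bool:
        if not self.healthy:
            return False              # vote "no"
        self.staged = dict(writes)
        return True                   # vote "yes"

    def commit(self) -> None:
        self.data.update(self.staged)
        self.staged = {}

    def rollback(self) -> None:
        self.staged = {}


def two_phase_commit(writes_by_node: dict) -> bool:
    # Phase 1: ask every participant to prepare and collect the votes.
    votes = [node.prepare(writes) for node, writes in writes_by_node.items()]
    if all(votes):
        # Phase 2: everyone voted yes, so apply the staged writes.
        for node in writes_by_node:
            node.commit()
        return True
    # At least one "no" vote: discard all staged writes.
    for node in writes_by_node:
        node.rollback()
    return False


a, b = Participant("a"), Participant("b")
ok = two_phase_commit({a: {"x": 1}, b: {"y": 2}})

c = Participant("c", healthy=False)
failed = two_phase_commit({a: {"x": 99}, c: {"z": 3}})
```

When only one participant is involved, the prepare and commit rounds can be merged, which is the intuition behind the one-phase optimization mentioned above.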

1.7. Queries and distributed joins

Ignite provides a very elegant query API that supports:

Predicate-based scan queries

SQL queries (ANSI-99)

Text queries

For SQL and text queries, Ignite maintains in-memory indexes, so data queries are very fast. If the data is cached in off-heap memory, the query indexes are kept off-heap as well. Ignite also lets users provide custom indexing through the pluggable IndexingSpi.

1.8. Continuous queries

Continuous queries are useful when you execute a query and then want to keep receiving notifications about data updates that match that query.
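The pattern of "initial result set plus ongoing notifications" can be sketched in a few lines. This is a single-process toy in plain Python, not the Ignite continuous query API: the `ObservableCache` class and its method names are hypothetical.

```python
class ObservableCache:
    """Toy cache that notifies listeners about matching updates."""

    def __init__(self):
        self._data = {}
        self._listeners = []   # list of (predicate, callback) pairs

    def put(self, key, value):
        self._data[key] = value
        for predicate, callback in self._listeners:
            if predicate(key, value):
                callback(key, value)

    def continuous_query(self, predicate, callback):
        # Register the listener first, then return the initial result set.
        self._listeners.append((predicate, callback))
        return {k: v for k, v in self._data.items() if predicate(k, v)}


cache = ObservableCache()
cache.put("a", 5)
cache.put("b", 50)

updates = []
initial = cache.continuous_query(
    lambda key, value: value > 10,
    lambda key, value: updates.append((key, value)),
)

cache.put("c", 70)   # matches the predicate, so the listener fires
cache.put("d", 1)    # does not match, so nothing is reported
```

Here `initial` holds the entries that matched when the query was issued, and `updates` accumulates only the later changes that match the same predicate.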

1.9. On-heap and off-heap memory

Ignite supports two modes of caching data in memory, on-heap and off-heap. Off-heap mode stores data outside the main Java heap, which avoids the long JVM garbage-collection pauses that a very large heap would cause, while the data still stays in memory. Whenever off-heap memory is configured, Ignite also stores indexes off-heap, so indexes do not take up any on-heap space.

1.10. Tiered storage

As data is accessed less frequently, Ignite selectively migrates it from on-heap memory to off-heap memory, and optionally from off-heap memory to swap (disk) storage. When that data is accessed again, it is immediately promoted back to the top tier, and other rarely accessed data is demoted to the next tier down.

1.11. JDBC driver

Ignite also ships a JDBC driver, so users can retrieve distributed data from the cache using standard SQL queries and the JDBC API. Any standard SQL tool can connect to Ignite and run SQL queries against the in-memory data cached there.

1.12. Web session clustering

The Ignite data grid can cache the web sessions of any application server that supports the Java Servlet 3.0 specification, including Apache Tomcat, Eclipse Jetty, Oracle WebLogic, and others. Caching web sessions is useful when running a cluster of application servers, because it improves the performance and scalability of the servlet container.

1.13. Hibernate second-level cache

Ignite can be used as a Hibernate second-level cache, which can significantly speed up the persistence layer of an application.

1.14. Spring caching

Ignite supports Spring annotation-based caching of Java methods, so that the result of a method call is stored in an Ignite cache. If the same method is later called with the same set of arguments, the result is fetched from the cache instead of executing the method again.
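Method-result caching is the same idea as memoization. As a loose, in-process analogy only (Spring caching with Ignite stores results in the distributed cache, which this does not model), Python's standard `functools.lru_cache` shows the behavior; the `load_user` function and the `calls` counter are hypothetical.

```python
from functools import lru_cache

calls = 0   # counts how often the real method body runs


@lru_cache(maxsize=None)
def load_user(user_id: int) -> str:
    global calls
    calls += 1                      # stands in for an expensive lookup
    return f"user-{user_id}"


load_user(7)   # executes the method and caches the result
load_user(7)   # same argument: served from the cache
load_user(8)   # new argument: executes the method again
```

The cache key is derived from the method arguments, which is why only calls with previously unseen arguments reach the method body.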

2. Compute grid

Distributed computing achieves higher performance, lower latency, and linear scalability through parallel processing. The Ignite compute grid provides a set of simple APIs that let users distribute computation and data processing across multiple machines in the cluster. It rests on the ability to run any computation on the nodes of the cluster and return the results.

Feature list

2.1. Distributed closure execution

The Ignite compute grid can broadcast or load-balance any closure across the cluster, including Java 8 lambdas as well as plain Java Runnables and Callables.

2.2. ForkJoin execution

ComputeTask is Ignite's abstraction of the in-memory ForkJoin paradigm, which is also a lightweight form of MapReduce. Pure MapReduce was not built for performance; it suits batch processing of offline data (as in Hadoop MapReduce). When computing over data that resides in memory, however, real-time low latency and high throughput usually take priority, and a simple API matters too. With these considerations in mind, Ignite provides the ComputeTask API, its ForkJoin (lightweight MapReduce) implementation.
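The split, execute, and reduce shape of a ForkJoin task can be sketched with the standard library. This is a local thread-pool analogy in plain Python, not the ComputeTask API: the `split`, `job`, `combine`, and `execute` names are hypothetical, and threads stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split(text: str) -> list[str]:
    """Fork: break the task into independent jobs (one per word)."""
    return text.split()


def job(word: str) -> int:
    """Work done by a single job, here just counting characters."""
    return len(word)


def combine(results) -> int:
    """Join: reduce the partial results into one answer."""
    return sum(results)


def execute(text: str, workers: int = 4) -> int:
    # The pool plays the role of the cluster; each job may run on any worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return combine(pool.map(job, split(text)))


total = execute("hello distributed compute grid")
print(total)   # 27
```

The caller only sees `execute`; how jobs are placed on workers is hidden behind the pool, which mirrors the division of labor described above.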

2.3. Clustered ExecutorService

Ignite provides a clustered implementation of the standard JDK ExecutorService that automatically runs all computations across the cluster in a load-balanced fashion. Computations are also fault-tolerant and are guaranteed to execute as long as at least one node remains. You can think of it as a clustered, distributed thread pool.

2.4. Collocation of computation and data

Collocating computation with data minimizes data serialization over the network and can significantly improve application performance and scalability. Whenever possible, run a computation on the cluster node that caches the data it needs to process. Ignite offers several ways to collocate computation and data, either automatically or manually, as needed.

2.5. Fault tolerance

Ignite supports automatic job failover. If a node crashes or another error occurs, the job is automatically transferred to another available node and re-executed. The pluggable FailoverSpi selects the node on which a failed job is retried. At-least-once guarantee: Ignite guarantees that as long as at least one node remains, no job is lost.

2.6. Load balancing

The load-balancing component balances the distribution of jobs across the nodes of the cluster. In Ignite, load balancing is provided by the pluggable LoadBalancingSpi, which controls the load on all nodes and ensures that every node in the cluster is evenly loaded. For homogeneous tasks in a homogeneous environment, load balancing uses random or round-robin policies. For many other scenarios, especially under uneven load, Ignite offers a number of more sophisticated adaptive load-balancing policies.

2.7. Task checkpoint

Checkpoints are provided by the pluggable CheckpointSpi, which lets a job save its intermediate state. Checkpoints are very useful for long-running jobs that need to store some intermediate state to protect against node failures. When a failed node restarts, a job can load its saved checkpoint and resume from the point of failure.
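The save-and-resume behavior can be sketched with a local file standing in for the checkpoint store. This is a minimal illustration in plain Python, not the CheckpointSpi API: the `sum_with_checkpoints` function, the `fail_at` parameter, and the file layout are all hypothetical.

```python
import json
import os
import tempfile


def sum_with_checkpoints(items, path, fail_at=None):
    """Sum `items`, saving progress to `path` after every step."""
    state = {"index": 0, "total": 0}
    # Resume from a saved checkpoint if a previous run left one behind.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    for i in range(state["index"], len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += items[i]
        state["index"] = i + 1
        with open(path, "w") as f:
            json.dump(state, f)       # save the intermediate state
    os.remove(path)                   # job finished, checkpoint not needed
    return state["total"]


path = os.path.join(tempfile.mkdtemp(), "job.ckpt")
items = list(range(10))

try:
    sum_with_checkpoints(items, path, fail_at=6)   # crashes part-way
except RuntimeError:
    pass

result = sum_with_checkpoints(items, path)         # resumes at index 6
```

The second run does not redo the first six steps; it picks up the saved total and continues, which is the saving a checkpoint buys for long-running jobs.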

2.8. Job scheduling

The pluggable CollisionSpi provides fine-grained control over how jobs are scheduled once they arrive at a node for execution. It offers several policies, including FIFO, priority-based scheduling, and even job stealing.

3. Streaming Computing and CEP

Ignite streaming processes continuous, never-ending streams of data in a scalable and fault-tolerant way. Even on a medium-sized cluster, the ingestion rate can be very high, easily reaching millions of events per second. How it works:

Clients inject streaming data into Ignite

Data is automatically partitioned across Ignite data nodes

Data is processed concurrently in sliding windows

Clients run concurrent SQL queries on the streaming data

Clients subscribe to continuous queries to be notified as data changes

Feature list

3.1. Data streamers

Data streamers are defined by the IgniteDataStreamer API and are built to inject large, continuous streams of data into Ignite stream caches. Data streamers are scalable and fault-tolerant, and they provide an at-least-once guarantee for all data streamed into Ignite.

3.2. Parallel processing

When you need to run your own business logic rather than just add new data, use the StreamReceiver API. A stream receiver processes the data stream in parallel directly on the nodes where the data will be cached, so you can modify the data or add any custom pre-processing logic before the data is put into the cache.

3.3. Sliding window

Ignite streaming lets you query within sliding windows of data. Sliding windows are configured as Ignite cache eviction policies and can be time-based, size-based, or batch-based. Each cache holds one sliding window, and you can define more than one cache over the same data if you need different sliding windows for it.
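Size-based and time-based windows differ only in the eviction rule. The sketch below shows both in plain Python; it is an illustration of the windowing idea, not Ignite's eviction-policy API, and the `TimeWindow` class and the sample events are hypothetical.

```python
from collections import deque


class TimeWindow:
    """Keeps only events from the last `span` seconds."""

    def __init__(self, span: float):
        self.span = span
        self.events = deque()   # (timestamp, value) pairs, oldest first

    def add(self, timestamp: float, value) -> None:
        self.events.append((timestamp, value))
        # Evict everything that has fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.span:
            self.events.popleft()

    def values(self) -> list:
        return [value for _, value in self.events]


# Size-based window: a bounded deque drops the oldest entry on overflow.
size_window = deque(maxlen=3)
for value in [1, 2, 3, 4, 5]:
    size_window.append(value)

# Time-based window covering the last 10 seconds.
time_window = TimeWindow(span=10.0)
for timestamp, value in [(0, "a"), (4, "b"), (9, "c"), (12, "d")]:
    time_window.add(timestamp, value)

print(list(size_window))      # [3, 4, 5]
print(time_window.values())   # ['b', 'c', 'd']
```

Any query over the window then only ever sees the entries that survived eviction, which is what makes windowed queries cheap.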

3.4. Sliding window query

To query streaming data, you can use all of Ignite's data-indexing capabilities together with Ignite SQL, text, and predicate-based cache queries.

3.5. Continuous queries

Continuous queries are useful when you execute a query and then want to keep receiving notifications about data updates that match that query.

4. Distributed data structures

Ignite supports most data structures from the java.util.concurrent framework in distributed form. For example, you can add an element to a java.util.concurrent.BlockingQueue on one node and take it from another node. There is also a distributed ID generator that guarantees ID uniqueness across all nodes. Supported data structures include:

Concurrent Map (Cache)

Distributed queues and collections

AtomicLong

AtomicReference

AtomicSequence (ID Generator)

CountDownLatch

Feature list

4.1. Collocated and non-collocated mode

Queues and sets can be deployed in collocated or non-collocated mode. In collocated mode, all elements of a collection reside on the same cluster node; this mode should be used for relatively small collections. In non-collocated mode, the elements of a collection are distributed evenly across the cluster, which allows very large collections to be kept in memory.

4.2. Bounded queue

Bounded queues allow users to hold a queue with a predefined maximum capacity, which will help control the capacity of the entire cache.

4.3. Reservation-based ID generator

The ID generator is implemented as an AtomicSequence. When you call incrementAndGet() (or any other atomic operation) on an atomic sequence, the data structure reserves a range of future values ahead of time, which guarantees uniqueness across the cluster for that sequence instance. Until all reserved values are used up, every increment happens locally on the client.
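Range reservation can be sketched with a shared counter that hands out blocks of IDs. This is a single-process model in plain Python, not the Ignite atomic sequence API: the `GlobalCounter` and `LocalSequence` classes, the `next_id` method, and the block size are hypothetical, and a lock stands in for a network round trip.

```python
import threading


class GlobalCounter:
    """Stands in for the cluster-wide counter; each call is one round trip."""

    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()
        self.round_trips = 0

    def reserve(self, size: int) -> range:
        with self._lock:
            start = self._next
            self._next += size
            self.round_trips += 1
            return range(start, start + size)


class LocalSequence:
    """Per-client view that serves IDs from a locally reserved block."""

    def __init__(self, counter: GlobalCounter, reserve_size: int = 1000):
        self._counter = counter
        self._size = reserve_size
        self._block = iter(())        # empty until the first reservation

    def next_id(self) -> int:
        try:
            return next(self._block)  # purely local, no round trip
        except StopIteration:
            # Block exhausted: reserve the next range from the cluster.
            self._block = iter(self._counter.reserve(self._size))
            return next(self._block)


counter = GlobalCounter()
seq_a = LocalSequence(counter, reserve_size=100)
seq_b = LocalSequence(counter, reserve_size=100)

ids = [seq_a.next_id() for _ in range(150)]
ids += [seq_b.next_id() for _ in range(150)]
```

The 300 IDs above cost only four reservations. The price of this design is that IDs are unique but not globally ordered, and values left in a reserved block are skipped if the client stops.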

5. Distributed messaging

Ignite provides high-performance cluster-wide messaging and supports data exchange through both publish-subscribe and direct point-to-point communication models. Messages can be exchanged in ordered or unordered fashion.

Feature list

5.1. Ordered and unordered

Ignite supports both ordered and unordered messages. Ordered messages are slightly slower, but when you use them, Ignite guarantees that messages are received in the same order in which they were sent.
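One common way to deliver messages in send order, and a reason ordering costs something, is to number messages at the sender and buffer early arrivals at the receiver. The sketch below shows that general technique in plain Python; it is not a description of how Ignite implements ordered messaging, and the `OrderedReceiver` class is hypothetical.

```python
class OrderedReceiver:
    """Delivers messages strictly in sequence-number order."""

    def __init__(self):
        self._expected = 0     # next sequence number to deliver
        self._pending = {}     # early arrivals, keyed by sequence number
        self.delivered = []

    def on_message(self, seq: int, payload: str) -> None:
        self._pending[seq] = payload
        # Deliver every buffered message that is now in sequence.
        while self._expected in self._pending:
            self.delivered.append(self._pending.pop(self._expected))
            self._expected += 1


receiver = OrderedReceiver()
# Messages arrive out of order over the network.
for seq, payload in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    receiver.on_message(seq, payload)

print(receiver.delivered)   # ['a', 'b', 'c', 'd']
```

An unordered receiver would hand each message to the application immediately, which avoids the buffering and waiting shown here.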

5.2. Topic-based and point-to-point

Ignite messaging supports topic-based subscriptions, and messages can be sent either to a group of nodes or to a single node.

6. Service grid

The service grid lets users deploy custom services on the cluster, such as custom counters, ID generators, hierarchical maps, and so on. Its main use case is deploying various kinds of singleton services in a cluster. If you need multiple instances of a service, however, Ignite also guarantees correct deployment and fault tolerance for all service instances.

Feature list

6.1. User-defined service

Users can define their own services and Ignite automatically distributes the services within the cluster. For example, you can create your own specific distributed counters, or custom data loading services, or any other logic, and then deploy it to the cluster.

6.2. Cluster singleton

Ignite allows any number of services to be deployed on any node of the grid. The most commonly used feature, however, is deploying singleton services in the cluster. Ignite maintains the singleton guarantee through both topology changes and node failures.

6.3. Fault tolerance

Ignite guarantees that services stay continuously available and deployed according to the specified configuration, through both topology changes and node failures.

6.4. Load balancing

In all cases, not just singleton service deployment, Ignite automatically ensures that approximately the same number of services are deployed on each node in the cluster. When the cluster topology changes, Ignite reevaluates the deployed services, and then may redeploy the deployed services on other nodes to ensure better load balancing.

7. Shared Spark RDDs

Apache Ignite provides an implementation of the Spark RDD abstraction that makes it easy to share state in memory across multiple Spark jobs, either within one application or between different Spark applications. IgniteRDD is a view over an Ignite distributed cache that can be deployed inside the Spark job's executing process, on a Spark worker, or in its own cluster. Depending on the configured deployment mode, shared state either lives only for the lifetime of a Spark application (embedded mode) or outlives it (standalone mode), in which case the state can be shared across multiple Spark applications.

Feature list

7.1. Shared Spark RDDs

IgniteRDD is an implementation of the native Spark RDD and DataFrame APIs. In addition to all standard RDD functionality, it shares RDD state across Spark jobs, applications, and workers.

7.2. Faster SQL

Spark does not support SQL indexes, but Ignite does. Thanks to its advanced in-memory indexing, IgniteRDD can run SQL queries up to a hundred times faster than native Spark RDDs or DataFrames.

8. In-memory file system

One of Ignite's distinctive technologies is the Ignite File System (IGFS), a distributed in-memory file system. IGFS provides functionality similar to Hadoop HDFS, but only in memory. In addition to its own API, IGFS implements the Hadoop file system API and can be plugged transparently into Hadoop or Spark deployments. IGFS splits the data of each file into separate blocks and stores them in a distributed in-memory cache. Unlike Hadoop HDFS, however, IGFS does not need a name node and locates file data automatically using a hash function. IGFS can be deployed standalone or on top of HDFS, in which case it becomes a transparent caching layer for the files stored in HDFS.
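Locating file blocks with a hash function instead of a name node can be sketched as follows. This is a simplified model in plain Python, not IGFS's actual block-placement logic: the `split_blocks` and `block_node` helpers, the modulo placement, the node names, and the file path are all hypothetical.

```python
import hashlib


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut file contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_node(path: str, index: int, nodes: list[str]) -> str:
    """Any client derives a block's node from (path, block index) alone."""
    digest = hashlib.sha256(f"{path}#{index}".encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]
data = b"x" * 2500
blocks = split_blocks(data, block_size=1024)

# Placement is computed, not looked up in a central directory.
placement = [block_node("/logs/app.log", i, nodes) for i in range(len(blocks))]
```

Because placement is a pure function of the path and block index, a reader can go straight to the node holding a block without first asking a metadata server.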

Feature list

8.1. On-heap and off-heap

IGFS can store files both on-heap and off-heap. For larger storage sizes, using off-heap memory is the key to avoiding long JVM garbage-collection pauses.

8.2. IGFS as a Hadoop file system

IGFS implements the Hadoop FileSystem API and can be deployed as a native Hadoop file system, just like HDFS. This lets IGFS be deployed in Hadoop or Spark environments in a plug-and-play manner.

8.3. Hadoop file system cache

IGFS can also be deployed as a caching layer over another Hadoop file system. In this case, whenever a file in IGFS changes, the update is automatically written through to HDFS. Likewise, if a file is read and is not in IGFS at that moment, Ignite automatically loads it from HDFS into IGFS.

8.4. Hadoop distributions

IGFS integrates with stock Apache Hadoop and also supports Cloudera CDH and Hortonworks HDP.

9. In-memory MapReduce

Apache Ignite ships an in-memory implementation of the Hadoop MapReduce API that performs significantly better than the native Hadoop MapReduce implementation. Ignite MapReduce outperforms Hadoop because of push-based resource allocation and in-process collocation of computation with data. In addition, because IGFS does not require a name node, Ignite MapReduce jobs running on IGFS go directly to the IGFS data nodes in a single hop.

Feature list

9.1. Native Hadoop MapReduce

Ignite MapReduce is an implementation of the Hadoop MapReduce API that plugs natively into an existing Hadoop environment and delivers a substantial performance improvement.

9.2. Hadoop accelerator

Ignite provides a Hadoop accelerator distribution that bundles IGFS and Ignite MapReduce in one package that can easily be added to an existing Hadoop environment.

10. Client protocols

Ignite provides several protocols for clients to connect to an Ignite cluster, including the Ignite native client, REST/HTTP, SSL/TLS, Memcached, Node.js (under development), and so on.

Feature list

10.1. Ignite native client

For clients connecting to Ignite remotely, the native client offers full functionality. It exposes the complete Ignite API, including near caching, transactions, compute, streaming, services, and so on.

10.2. Memcached

Ignite is compatible with Memcached and allows users to save and retrieve distributed data in the Ignite cache using any Memcached-compliant client, including Java, PHP, Python, Ruby, and other clients.

10.3. REST/HTTP

Ignite provides an HTTP REST client that communicates over HTTP or HTTPS in a RESTful manner. The REST API supports many operations, such as reading from and writing to caches, executing tasks, and retrieving various metrics.

10.4. SSL/TLS

Ignite supports SSL/TLS-secured socket communication between all Ignite client and server nodes.

10.5. Node.js (under development)

Ignite plans to provide a Node.js client that can perform all cache operations and run SQL queries over JSON data stored in Ignite.

11. Deployment environments

Apache Ignite can run standalone, in a cluster, inside Docker containers, and on Apache Mesos and Hadoop YARN. It runs on physical servers as well as virtual machines.

Feature list

11.1. Standalone cluster

Ignite nodes discover each other automatically, which lets the cluster scale without a restart: simply start new nodes and they join the cluster on their own.

11.2. Docker containers

Docker packages Ignite and all of its dependencies into a standard image. Once Docker has downloaded the Ignite release image, it deploys the user's application into Ignite, configures the node, and automatically starts the fully configured Ignite node.

11.3. Public clouds

For public cloud environments, Ignite integrates natively with Amazon AWS and Google Compute Engine (GCE). For other clouds, Ignite integrates with Apache jclouds, which supports most existing cloud providers.

11.4. Apache Mesos

Ignite provides native support for Apache Mesos, which makes it easy to deploy Ignite into a Mesos data center alongside, for example, Hadoop and Spark environments.

11.5. Hadoop YARN

Ignite provides native support for Hadoop YARN, so it can be deployed easily into Hadoop and Spark environments.

Thank you for reading. These are the core features of Apache Ignite. How each of them behaves for a specific workload still needs to be verified in practice.
