Shulou (shulou.com), SLTechnology News & Howtos — 05/31 report, updated 2025-01-16
This article explains the main features of Apache Ignite. The content is simple, clear, and easy to learn.
1. Transactions and analytics
1.1. Data grid
The Ignite in-memory data grid is an in-memory key-value store that caches data in the memory of a distributed cluster. With strong data-locality semantics and affinity-based data routing it avoids redundant data movement over the network, so the cluster can scale linearly to hundreds of nodes. The Ignite data grid is fast: official benchmarks show it to be one of the fastest current implementations of transactional and atomic data access in a distributed cluster.
Feature list
1.1.1. Key value storage
The Ignite data grid is an in-memory key-value store that can be viewed as a distributed, partitioned hash map. Each node in the cluster holds a portion of the data, so the more nodes in the cluster, the more data can be cached. Unlike other key-value stores, Ignite determines data location with a pluggable hashing algorithm: every client can plug in a custom hash function to determine which node a key belongs to, with no need for any special mapping service or name node.
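Ignite's default affinity function is based on rendezvous (highest-random-weight) hashing. The standalone sketch below is not Ignite's actual implementation (the class name and hash mixing are illustrative), but it shows why no mapping service is needed: any client can compute a key's owner node independently and deterministically.

```java
import java.util.List;
import java.util.Objects;

// Minimal sketch of rendezvous (highest-random-weight) hashing, the idea
// behind Ignite's default affinity function: every client computes a key's
// owner node on its own, with no name node or mapping service.
public class RendezvousHash {

    // Score a (key, node) pair; the node with the highest score owns the key.
    static long score(Object key, String node) {
        long h = Objects.hash(key, node);
        h *= 0x9E3779B97F4A7C15L; // spread the 32-bit hash across 64 bits
        h ^= (h >>> 32);
        return h;
    }

    public static String nodeFor(Object key, List<String> nodes) {
        String best = null;
        long bestScore = Long.MIN_VALUE;
        for (String node : nodes) {
            long s = score(key, node);
            if (s > bestScore) { bestScore = s; best = node; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-A", "node-B", "node-C");
        // Every caller computes the same owner for the same key.
        System.out.println("key-42 -> " + nodeFor("key-42", nodes));
    }
}
```

A nice property of this scheme is that when a node joins or leaves, only the keys whose highest-scoring node changed move, which limits rebalancing traffic.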
1.1.2. JCache (JSR 107)
Ignite is 100% compliant with the JCache (JSR 107) specification. JCache provides a very simple yet powerful API for data caching. The JCache API includes:
Basic caching operation
ConcurrentMap APIs
Collocated processing (EntryProcessor)
Events and metrics
Pluggable persistence
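As a sketch of what EntryProcessor-style collocated processing buys you, the toy cache below (plain JDK, hypothetical names, standing in for a real JCache `Cache.invoke()`) applies a caller-supplied function atomically at the entry instead of performing a racy get-modify-put round trip.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Illustrative sketch of the JCache EntryProcessor idea: instead of
// get-modify-put (two round trips plus a race window), the caller ships a
// small function that the cache applies atomically at the entry. This
// stand-in uses a local ConcurrentHashMap; names are hypothetical.
public class MiniCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    public void put(K key, V value) { map.put(key, value); }
    public V get(K key) { return map.get(key); }

    // "invoke" applies the processor atomically, like Cache.invoke(key, proc).
    public V invoke(K key, BiFunction<K, V, V> processor) {
        return map.compute(key, processor);
    }

    public static void main(String[] args) {
        MiniCache<String, Integer> cache = new MiniCache<>();
        cache.put("counter", 0);
        // Increment in place: no separate get() followed by put().
        cache.invoke("counter", (k, v) -> (v == null ? 0 : v) + 1);
        System.out.println(cache.get("counter")); // prints 1
    }
}
```

In a distributed cache the same pattern also moves the logic to the node that owns the entry, so only the small function crosses the network, not the value.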
1.1.3. Partitioning and replication
Depending on configuration, Ignite can either partition or replicate data in memory. In REPLICATED mode the data is fully replicated to every node in the cluster, while in PARTITIONED mode the data is split evenly across the nodes, allowing terabytes of data to be cached in memory. Ignite can also be configured to keep multiple backup copies to ensure data resilience in case of failure.
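A minimal Spring XML fragment for such a cache might look like the following sketch (the cache name "myCache" is a placeholder; `backups` sets the number of backup copies kept for a PARTITIONED cache):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="myCache"/>
    <property name="cacheMode" value="PARTITIONED"/>
    <property name="backups" value="1"/>
</bean>
```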
1.1.4. Collocation processing
Ignite can execute native Java, C++, and .NET/C# code on the server side, collocated with the data it operates on.
1.1.5. Self-repairing cluster
An Ignite cluster can repair itself: clients reconnect automatically after a failure, slow clients are automatically kicked out, and data from failed nodes is automatically propagated to other nodes in the grid.
1.1.6. Client near cache
A near cache is a cache on the local client side that stores the most recently and most frequently accessed data.
1.1.7. On-heap and off-heap memory
Ignite supports two modes of in-memory data caching, on-heap and off-heap. When the cached data set is large and exceeds the Java heap, off-heap storage avoids the long pauses caused by JVM garbage collection (GC) while keeping the data in memory.
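A minimal standalone illustration of the off-heap idea (plain JDK, not Ignite's actual page-memory implementation): bytes stored in a direct ByteBuffer live in native memory that the GC never scans or moves, yet remain in RAM.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Standalone illustration of off-heap storage: the value's bytes live in
// native memory (a direct ByteBuffer), invisible to the garbage collector,
// while still being in RAM. Ignite's real off-heap memory is far more
// elaborate (paged, indexed, eviction-aware).
public class OffHeapDemo {

    public static String roundTrip(String s) {
        byte[] value = s.getBytes(StandardCharsets.UTF_8);

        // Allocate native (off-heap) memory and copy the serialized value in.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(value.length);
        offHeap.put(value).flip();

        // Read it back: the data never lived on the Java heap.
        byte[] read = new byte[offHeap.remaining()];
        offHeap.get(read);
        return new String(read, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("cached-value")); // prints cached-value
    }
}
```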
1.1.8. Off-heap index
When configured for off-heap storage, Ignite also stores query indexes off-heap, which means indexes consume no on-heap memory.
1.1.9. Hierarchical storage
If data goes cold (is not accessed), Ignite selectively moves it from on-heap to off-heap memory, and then from off-heap memory to swap space (disk). When cold data is accessed again it is immediately promoted to the top tier, while other cold data is demoted to lower storage tiers.
1.1.10. Binary protocol
Starting with version 1.5, Ignite introduced a new concept of storing cached data, called binary objects, which can:
Read the properties of a serialized object without deserializing the entire object
Dynamically change the structure of an object
Dynamically create the structure of an object.
1.1.11.ACID transaction
Ignite provides fully ACID-compliant distributed transactions to ensure consistency. It supports optimistic and pessimistic concurrency modes and the READ_COMMITTED, REPEATABLE_READ, and SERIALIZABLE isolation levels. Ignite transactions use a two-phase commit protocol, with many one-phase-commit optimizations applied where appropriate.
1.1.12. Deadlock-free transaction
Ignite supports deadlock-free optimistic transactions: they acquire no locks, so users need not worry about lock ordering, and performance is better as a result.
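The deadlock-freedom follows from validating instead of locking. The toy single-value cell below (illustrative only, not Ignite's implementation) sketches the optimistic pattern: read a version, compute without holding any lock, and commit only if the version is unchanged.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of optimistic concurrency: a transaction reads a version, computes,
// and validates the version at commit time instead of holding locks. Since no
// locks are held during the transaction, deadlock is impossible by
// construction; conflicting transactions simply fail validation and retry.
public class OptimisticCell {
    private volatile long value;
    private final AtomicLong version = new AtomicLong();

    public long read() { return value; }
    public long readVersion() { return version.get(); }

    // Commit succeeds only if nobody else committed since we read.
    public synchronized boolean commit(long expectedVersion, long newValue) {
        if (version.get() != expectedVersion) return false; // conflict: retry
        value = newValue;
        version.incrementAndGet();
        return true;
    }

    public static void main(String[] args) {
        OptimisticCell cell = new OptimisticCell();
        long v = cell.readVersion();
        long computed = cell.read() + 10;        // optimistic work, no lock held
        System.out.println(cell.commit(v, computed)); // true: no conflict
        System.out.println(cell.commit(v, 99));       // false: stale version
    }
}
```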
1.1.13. Transactional EntryProcessor
Ignite's transactional EntryProcessor can execute collocated logic on the server side as part of a transaction.
1.1.14. Cross-partition transaction
Ignite can perform transactions in all partitions of a cache throughout the cluster.
1.1.15. Lock
Ignite allows developers to define explicit locks to enforce mutual exclusion on cached objects.
1.1.16.SQL query
Ignite supports querying the cache with standard (ANSI-99) SQL syntax, and you can use any SQL function, including aggregation and grouping.
1.1.17. Distributed joins
Ignite supports distributed SQL joins, including cross-cache joins.
1.1.18. Continuous queries
Continuous queries are useful when you execute a query and then want to be continuously notified of updates to its result set.
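The pattern can be sketched in plain Java (illustrative names; a local observable map stands in for the distributed cache): the query returns an initial result set and registers a listener that fires for later updates matching the same filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Sketch of the continuous-query idea: run an initial query, then keep
// receiving notifications about later updates that match the same filter.
// A toy observable map stands in for the distributed cache.
public class ContinuousQueryDemo {
    private final ConcurrentHashMap<String, Integer> data = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, Integer>> listeners = new ArrayList<>();

    public void put(String key, Integer value) {
        data.put(key, value);
        listeners.forEach(l -> l.accept(key, value)); // push the update
    }

    // Returns the initial result set and registers for future matches.
    public List<String> continuousQuery(Predicate<Integer> filter,
                                        BiConsumer<String, Integer> onUpdate) {
        List<String> initial = new ArrayList<>();
        data.forEach((k, v) -> { if (filter.test(v)) initial.add(k); });
        listeners.add((k, v) -> { if (filter.test(v)) onUpdate.accept(k, v); });
        return initial;
    }

    public static void main(String[] args) {
        ContinuousQueryDemo cache = new ContinuousQueryDemo();
        cache.put("a", 5);
        List<String> updates = new ArrayList<>();
        List<String> initial = cache.continuousQuery(v -> v > 3,
                (k, v) -> updates.add(k + "=" + v));
        cache.put("b", 7); // matches the filter: listener fires
        cache.put("c", 1); // does not match
        System.out.println(initial + " then " + updates); // [a] then [b=7]
    }
}
```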
1.1.19. Query index
For SQL queries, Ignite supports in-memory indexes, so all data retrieval is very fast.
1.1.20. Query consistency
Ignite supports complete query consistency, and updates after the query starts execution will not affect the query results.
1.1.21. Query fault tolerance
The query of Ignite is fault-tolerant, that is, the query results are always consistent and will not be affected by changes in the cluster topology.
1.1.22.JDBC driver
Ignite provides a JDBC driver, and you can use standard SQL queries and JDBC API to obtain distributed data in the cache.
1.1.23.ODBC driver
Ignite's ODBC driver can use standard SQL queries and ODBC API to get data from the cache.
1.1.24. Write-through
Write-through mode propagates cache updates to the underlying database.
1.1.25. Read through
Read-through mode loads data from the underlying database when it is not found in the cache.
1.1.26. Write-behind caching
Ignite provides an option to perform database updates asynchronously via write-behind caching.
1.1.27. Automatic persistence
Ignite can automatically connect to an underlying database and generate the XML object-relational mapping configuration and Java domain-model POJOs.
1.1.28. Database integration
Ignite can automatically integrate with external databases, including RDBMS, NoSQL, and HDFS.
1.1.29.Web Session clustering
The Ignite data grid can cache the web sessions of any application server that supports the Java Servlet 3.0 specification, including Apache Tomcat, Eclipse Jetty, Oracle WebLogic, and others. Caching web sessions across a cluster of application servers is useful for improving the performance and scalability of the servlet container.
1.1.30.Hibernate second-level cache
Ignite can serve as a Hibernate second-level cache (L2 cache), which can significantly speed up the persistence layer of an application.
1.1.31.Spring caching
Ignite supports Spring-annotation-based caching of Java methods, so that the result of a method execution is stored in an Ignite cache. If the same method is later called with the same set of parameters, the result is fetched from the cache instead of actually executing the method.
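Under the hood this is memoization keyed by the method's arguments. The hand-rolled sketch below (plain JDK; Spring's @Cacheable wires up the same idea declaratively, with Ignite as the backing cache) shows the mechanism.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// What annotation-driven method caching boils down to: memoize a method's
// result keyed by its argument, so repeat calls skip the real computation.
public class MethodCacheDemo {
    static int realCalls = 0; // counts actual executions, for demonstration

    // The "expensive" method whose results we want to cache.
    static String expensiveLookup(String id) {
        realCalls++;
        return "result-for-" + id;
    }

    // Wrap any one-argument method in an argument-keyed result cache.
    public static <A, R> Function<A, R> memoize(Function<A, R> method) {
        Map<A, R> cache = new ConcurrentHashMap<>();
        return arg -> cache.computeIfAbsent(arg, method);
    }

    public static void main(String[] args) {
        Function<String, String> cachedLookup =
                memoize(MethodCacheDemo::expensiveLookup);
        cachedLookup.apply("42"); // executes the method
        cachedLookup.apply("42"); // served from the cache
        System.out.println("real executions: " + realCalls);
    }
}
```

With Spring and Ignite, the `Map` above is replaced by a distributed Ignite cache, so the memoized results are shared by every node.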
1.1.32. C#/.NET
Ignite.NET is built on top of Ignite and can perform almost all in-memory data grid operations, including ACID transactions, SQL queries, distributed joins, messaging, and events.
1.1.33. C++
Ignite C++ is built on top of Ignite and can perform almost all in-memory data grid operations, including SQL queries, distributed joins, and so on.
1.1.34.XA/JTA
Ignite can be configured as a Java Transaction API (JTA) transaction manager lookup class.
1.1.35.OSGi support
Ignite provides support for OSGi.
1.2. Computing grid
Distributed computing achieves higher performance, lower latency, and linear scalability through parallel processing. The Ignite compute grid provides a simple set of APIs for performing distributed computation and data processing across multiple computers in the cluster. Distributed parallel computing rests on the ability to execute any computation on any cluster node and return the result.
Feature list
1.2.1. Distributed closure execution
The Ignite compute grid can broadcast and load-balance any closure within the cluster, including Java 8 lambdas as well as plain Java Runnables and Callables.
1.2.2.ForkJoin processing
ComputeTask is Ignite's abstraction of the in-memory fork/join paradigm and a lightweight form of MapReduce. Pure MapReduce was never built for performance; it suits batch processing of offline data (as in Hadoop MapReduce). When computing over data that resides in memory, however, real-time response, low latency, and high throughput usually take priority, and a simple API matters as well. With these considerations in mind, Ignite provides the ComputeTask API, its fork/join (lightweight MapReduce) implementation.
1.2.3. Clustered ExecutorService
Ignite provides a cluster-enabled implementation of the standard JDK ExecutorService that automatically executes all submitted computations in load-balanced fashion within the cluster. Computation is also fault-tolerant and will execute as long as at least one node remains; you can think of it as a clustered, distributed thread pool.
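Because the interface is the standard java.util.concurrent.ExecutorService, code like the following plain JDK example carries over unchanged; here a local thread pool runs the tasks, but a cluster-backed executor would be a drop-in replacement.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain JDK ExecutorService usage: Ignite's compute grid implements this
// same interface, so each Callable below could transparently run on any
// node of the cluster instead of a local worker thread.
public class ExecutorDemo {

    public static int sumSquares(List<Integer> inputs) {
        ExecutorService exec = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = inputs.stream()
                    .map(n -> (Callable<Integer>) () -> n * n)
                    .toList();
            int total = 0;
            for (Future<Integer> f : exec.invokeAll(tasks)) total += f.get();
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumSquares(List.of(1, 2, 3))); // prints 14
    }
}
```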
1.2.4. Collocation of computing and data
Collocating computing with data minimizes data serialization over the network and can significantly improve application performance and scalability. Whenever possible, computations should be sent to the cluster nodes that cache the data they process; Ignite provides several ways to collocate computing and data, either automatically or manually.
1.2.5. Fault tolerance
Ignite supports automatic job failover. If a node crashes or another error occurs, jobs are automatically transferred to another available node for re-execution. The pluggable FailoverSpi is responsible for selecting a new node when a job fails. At-least-once guarantee: Ignite guarantees that as long as one node remains, no job is lost.
1.2.6. Load balancing
The load-balancing component distributes jobs across the nodes of the cluster. Load balancing in Ignite is achieved through the pluggable LoadBalancingSpi, which tracks the load of every node and ensures it stays balanced across the cluster. For homogeneous tasks in a homogeneous environment, load balancing uses random or round-robin policies; for many other scenarios, especially uneven loads, Ignite provides a number of more sophisticated adaptive load-balancing policies.
1.2.7. Job checkpoint
Checkpointing, implemented through the pluggable CheckpointSpi, provides the ability to save a job's intermediate state. Checkpoints are very useful for long-running jobs that need to guard intermediate state against node failure. When a failed node restarts, a job can load the saved checkpoint and resume execution from where it failed.
1.2.8. Job scheduling
The pluggable CollisionSpi provides fine-grained control over how jobs are scheduled when they arrive at a node. It offers many policies, including FIFO, priority-based, and even job-stealing.
1.3. Streaming Computing and CEP
Ignite streaming allows continuous, never-ending streams of data to be processed in a scalable and fault-tolerant manner. Even on a moderately sized cluster, the rate at which data can be injected into Ignite is very high, easily reaching millions of events per second. How it works:
The client injects streaming data into Ignite
Data is automatically partitioned in Ignite data nodes
Concurrent processing of data in sliding window
The client executes concurrent SQL queries against the streaming data
The client subscribes to a continuous query for changes in data.
Feature list
1.3.1. Data flow processor
The data streamer, defined by the IgniteDataStreamer API, is built to inject large amounts of streaming data into Ignite caches. It provides an at-least-once guarantee for all streamed data and injects it into Ignite in a scalable and fault-tolerant way.
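Much of a streamer's throughput comes from batching. The toy streamer below (plain JDK, illustrative names; a local map stands in for the distributed cache) sketches the buffering pattern: callers add entries at will, and the buffer is flushed to the cache in batches, amortizing per-entry overhead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the buffering idea behind a data streamer: addData() accumulates
// entries in a buffer, and the buffer is flushed to the cache in batches.
public class MiniStreamer<K, V> {
    private final Map<K, V> targetCache;
    private final int batchSize;
    private final List<Map.Entry<K, V>> buffer = new ArrayList<>();
    int flushes = 0; // number of batch flushes, for demonstration

    public MiniStreamer(Map<K, V> targetCache, int batchSize) {
        this.targetCache = targetCache;
        this.batchSize = batchSize;
    }

    public void addData(K key, V value) {
        buffer.add(Map.entry(key, value));
        if (buffer.size() >= batchSize) flush(); // auto-flush full batches
    }

    public void flush() {
        if (buffer.isEmpty()) return;
        buffer.forEach(e -> targetCache.put(e.getKey(), e.getValue()));
        buffer.clear();
        flushes++;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<>();
        MiniStreamer<Integer, String> streamer = new MiniStreamer<>(cache, 100);
        for (int i = 0; i < 250; i++) streamer.addData(i, "v" + i);
        streamer.flush(); // closing a real streamer flushes the remainder
        System.out.println(cache.size() + " entries in "
                + streamer.flushes + " flushes"); // 250 entries in 3 flushes
    }
}
```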
1.3.2. Collocation processing
When you need to execute custom business logic rather than just add new data, use the StreamReceiver API. A stream receiver lets data be processed in parallel directly on the nodes that cache it, modifying the data or applying any custom preprocessing logic before it enters the cache.
1.3.3. Sliding window
Ignite streaming also allows querying within sliding windows of data. Sliding windows are configured as Ignite cache eviction policies and can be time-based, size-based, or batch-based; a cache is configured to hold one data window. If you need different sliding windows over the same data, you can simply define more than one cache over it.
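A size-based sliding window, the simplest of the three variants mentioned above, can be sketched in a few lines (illustrative only, not Ignite's eviction implementation): only the most recent N entries stay queryable, and older entries are evicted as new data streams in.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Sketch of a size-based sliding window: the window keeps only the newest
// `capacity` items; each arrival beyond that evicts the oldest item.
public class SlidingWindow<T> {
    private final int capacity;
    private final Deque<T> window = new ArrayDeque<>();

    public SlidingWindow(int capacity) { this.capacity = capacity; }

    public void add(T item) {
        if (window.size() == capacity) window.removeFirst(); // evict oldest
        window.addLast(item);
    }

    // Queries run against this snapshot of the current window.
    public List<T> snapshot() { return List.copyOf(window); }

    public static void main(String[] args) {
        SlidingWindow<Integer> w = new SlidingWindow<>(3);
        for (int i = 1; i <= 5; i++) w.add(i);
        System.out.println(w.snapshot()); // prints [3, 4, 5]
    }
}
```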
1.3.4. Sliding window query
You can use all the Ignite data indexing capabilities, plus Ignite SQL, TEXT, and predicate-based cached queries to query in the data stream.
1.3.5. Continuous queries
Continuous queries are useful when you execute a query and then want to be continuously notified of updates to its result set.
1.3.6.JMS data flow processor
Ignite's JMS data streamer consumes messages from a JMS broker and inserts them into an Ignite cache.
1.3.7.Apache Flume Sink
IgniteSink is a Flume sink that extracts events from an associated Flume channel and injects them into an Ignite cache.
1.3.8.MQTT stream processor
Ignite's MQTT streamer consumes messages from an MQTT topic and feeds transformed key-value pairs into an Ignite cache.
1.3.9.Twitter stream processor
Ignite's Twitter stream processor consumes messages from a Twitter stream API and then injects them into the Ignite cache.
1.3.10.Apache Kafka stream processor
Ignite's Kafka data streamer consumes messages from a given Kafka topic on a Kafka broker and inserts them into an Ignite cache.
1.3.11.Apache Camel stream processor
Ignite's Camel streamer consumes messages from an Apache Camel consumer endpoint and then injects them into the Ignite cache.
1.3.12.Apache Storm stream processor
Ignite's Storm streamer consumes messages from an Apache Storm consumer endpoint and then injects them into the Ignite cache.
1.4. Distributed data structure
Ignite supports most of the java.util.concurrent data structures in distributed form. For example, you can add an item to a java.util.concurrent.BlockingQueue on one node and take it on another node. Or use the distributed ID generator, which guarantees ID uniqueness across all nodes.
Supported data structures include:
Concurrent Map (Cache)
Distributed queues and collections
AtomicLong
AtomicReference
AtomicSequence (ID generator)
CountDownLatch
ExecutorService
Feature list
1.4.1. Queues and collections
Ignite provides a fast implementation of distributed blocking queues and distributed collections.
1.4.2. Collocated and non-collocated
Queues and collections can be deployed collocated or non-collocated. In collocated mode all elements of the collection reside on the same cluster node; relatively small collections should use this mode. In non-collocated mode the elements are distributed evenly across the cluster, which allows large collections to be kept in memory.
1.4.3. Bounded queue
Bounded queues allow users to hold a queue with a predefined maximum capacity, which will help control the capacity of the entire cache.
1.4.4. Atomization type
Ignite supports distributed AtomicLong and AtomicReference.
1.4.5.CountDownLatch
Ignite's CountDownLatch can synchronize jobs on all Ignite nodes.
1.4.6. Reservation-based ID generator
The ID generator is implemented through AtomicSequence. When incrementAndGet() (or any other atomic operation) is performed on an atomic sequence, the data structure reserves a range of future values locally, which guarantees uniqueness across all instances of that sequence in the cluster.
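The reservation idea can be sketched as follows (plain JDK; an AtomicLong stands in for the cluster-wide counter, and all names are illustrative): each node reserves a whole range at once, then issues IDs locally from that range, so most calls involve no cluster coordination at all.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a reservation-based sequence: instead of coordinating on every
// ID, each node reserves a whole range [next, limit) from the shared counter
// and then hands out IDs from that range locally. Ranges never overlap, so
// IDs stay unique cluster-wide.
public class ReservedSequence {
    private final AtomicLong shared;  // stands in for the cluster-wide counter
    private final int reserveSize;
    private long next = 0, limit = 0; // locally reserved [next, limit)

    public ReservedSequence(AtomicLong shared, int reserveSize) {
        this.shared = shared;
        this.reserveSize = reserveSize;
    }

    public synchronized long nextId() {
        if (next == limit) {                      // local range exhausted:
            next = shared.getAndAdd(reserveSize); // reserve the next range
            limit = next + reserveSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong cluster = new AtomicLong();
        ReservedSequence nodeA = new ReservedSequence(cluster, 100);
        ReservedSequence nodeB = new ReservedSequence(cluster, 100);
        // nodeA owns [0,100), nodeB owns [100,200): no ID is issued twice.
        System.out.println(nodeA.nextId()); // 0
        System.out.println(nodeB.nextId()); // 100
        System.out.println(nodeA.nextId()); // 1
    }
}
```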
1.4.7.Semaphore
The implementation and behavior of Ignite's distributed semaphore are similar to java.util.concurrent.Semaphore.
1.5. Distributed messages and events
Ignite provides cluster-wide, high-performance messaging that supports data exchange based on both publish-subscribe and direct point-to-point communication models. Messages can be exchanged ordered or unordered; ordered messaging is slightly slower, but when used, Ignite guarantees that messages are received in the order they were sent. Ignite's distributed events let applications be notified when various events occur across the distributed grid. You can automatically receive notifications of task executions, reads, writes, and query operations on local and remote nodes in the cluster, and event notifications can be grouped and sent in batches or at intervals.
Feature list
1.5.1. Topic-based messaging
Ignite's distributed messaging enables cluster-wide, topic-based communication between all nodes.
1.5.2. Point-to-point messaging
Ignite messages can be sent to a group of nodes or to a single node.
1.5.3. Ordered and unordered
Ignite supports ordered and unordered messaging. Ordered messages are slightly slower, but when used, Ignite guarantees that messages are received in the same order they were sent.
1.5.4. Event notification
When various events occur in the cluster, the distributed event function of Ignite enables the application to be notified.
1.5.5. Local and remote events
Applications can receive notifications about task execution, read and write, and query operations on local and remote nodes in the cluster.
1.5.6. Automatic batch processing
Event notifications can be grouped together and sent in batches or periodically.
1.6. Service grid
The service grid allows arbitrary user-defined services to be deployed in the cluster, such as custom counters, ID generators, or hierarchical maps. Its main use case is deploying singleton services in a cluster; however, if multiple instances of a service are required, Ignite also guarantees correct deployment and fault tolerance of all instances.
Feature list
1.6.1. User-defined service
Users can define their own services and Ignite automatically distributes the services within the cluster. For example, you can create your own specific distributed counters, or custom data loading services, or any other logic, and then deploy it to the cluster.
1.6.2. Cluster singleton
Ignite allows any number of services to be deployed to any grid node, but the most common usage is deploying a cluster singleton. Ignite maintains the singleton regardless of topology changes or node failures.
1.6.3. Fault tolerance
Ignite ensures that services stay continuously available and deployed according to the specified configuration, regardless of topology changes or node failures.
1.6.4. Load balancing
In all cases, not just singleton service deployment, Ignite automatically ensures that approximately the same number of services are deployed on each node in the cluster. When the cluster topology changes, Ignite reevaluates the deployed services, and then may redeploy the deployed services on other nodes to ensure better load balancing.
1.7. Automated RDBMS integration
Ignite supports integration with various persistent stores: it can connect to a database, import the schema, configure indexed types, and automatically generate all the necessary XML O/R mapping configuration and Java domain-model POJOs, which can be downloaded and copied into your own project. Ignite can integrate with any relational database that provides a JDBC driver, including Oracle, PostgreSQL, MS SQL Server, and MySQL.
1.7.1.RDBMS Integration Wizard
RDBMS integration can be automated with the Ignite Web Console, an interactive configuration wizard and management and monitoring tool that can:
Create and download various configurations for Ignite clusters
Automatically build SQL metadata for Ignite from any RDBMS schema
Execute SQL queries in the in-memory cache
View query execution plans, the in-memory schema, and streaming charts.
The Ignite Web Console is an innovative tool offering a wealth of features for managing Ignite clusters, not limited to the functions above.
2. Hadoop and Spark
2.1. Spark shared RDDs
Apache Ignite provides an implementation of the Spark RDD abstraction that allows state to be easily shared in memory across multiple Spark jobs, whether within the same application or between different Spark applications. IgniteRDD, a view over an Ignite distributed cache, can be deployed inside the Spark job execution process, inside Spark workers, or in its own cluster. Depending on the chosen deployment model, the shared state can exist either within the lifecycle of a Spark application (embedded mode) or outside of it (standalone mode), in which case state can be shared among multiple Spark applications.
Feature list
2.1.1. Shared Spark RDDs
IgniteRDD is an implementation of the native Spark RDD and DataFrame APIs that, in addition to all the standard RDD functionality, shares RDD state across other Spark jobs, applications, and workers.
2.1.2. Faster SQL
Spark does not support SQL indexes, but Ignite does. Thanks to advanced in-memory indexing, IgniteRDD can deliver hundredfold performance improvements over native Spark RDDs or DataFrames when executing SQL queries.
2.2. Memory file system
One of Ignite's unique technologies is its distributed in-memory file system, the Ignite File System (IGFS). IGFS provides functionality similar to Hadoop HDFS, but in memory only. In fact, besides its own API, IGFS implements the Hadoop FileSystem API and can be transparently plugged into Hadoop or Spark applications. IGFS splits the data of each file into separate blocks and stores them in a distributed in-memory cache. Unlike Hadoop HDFS, however, IGFS does not need a name node; it locates file data automatically using a hash function. IGFS can be deployed standalone or on top of HDFS, in which case it becomes a transparent caching layer for files stored in HDFS.
Tachyon replacement: IGFS can transparently replace the Tachyon file system in a Spark environment, and since IGFS is based on the proven Ignite data grid technology, it offers better read and write performance and more stability than Tachyon.
Hadoop file system: if you plan to use IGFS as your Hadoop file system, refer to the Hadoop integration documentation; there IGFS is used no differently from HDFS.
Feature list
2.2.1. On-heap and off-heap
IGFS can store file data both on-heap and off-heap. For larger data sets, off-heap storage is key to avoiding the long pauses caused by JVM garbage collection.
2.2.2.IGFS as a Hadoop file system
IGFS implements Hadoop's FileSystem API and can be deployed as a native Hadoop file system, just like HDFS, so that IGFS can be natively deployed in Hadoop or Spark environments in a plug-and-play manner.
2.2.3.Hadoop file system cache
IGFS can also be deployed as a caching layer over another Hadoop file system. In that case, if a file in IGFS changes, the update is automatically written through to HDFS; and if a file is read that is not yet in IGFS, Ignite automatically loads it from HDFS into IGFS.
2.2.4. Any Hadoop distribution
IGFS integrates with the vanilla Apache Hadoop distribution and also supports Cloudera CDH and Hortonworks HDP.
2.3. Memory MapReduce
Apache Ignite provides an in-memory implementation of the Hadoop MapReduce API that delivers significant performance improvements over the native Hadoop MapReduce implementation. Ignite MapReduce performs better than Hadoop because of push-based resource allocation and in-process collocation of computation with data. In addition, because IGFS does not require a name node, Ignite MapReduce jobs using IGFS reach the IGFS data nodes in a single hop.
Feature list
2.3.1. Native Hadoop MapReduce
Ignite MapReduce is an implementation of the Hadoop MapReduce API that plugs natively into an existing Hadoop environment with greatly improved performance.
2.3.2.Hadoop accelerator
Ignite provides a Hadoop Accelerator distribution, including IGFS and Ignite MapReduce, which can easily be added to an existing Hadoop environment.
3. Runs everywhere
3.1. Client protocols
Ignite provides several protocols for clients to connect to Ignite clusters, including the Ignite native client, REST/HTTP, SSL/TLS, Memcached, Node.js (under development), and so on.
Feature list
3.1.1.Ignite native client
For clients connecting remotely to Ignite, the native client provides full functionality, allowing use of the complete Ignite API, including near caching, transactions, computing, streaming, services, and so on.
3.1.2.Memcached
Ignite is compatible with Memcached and allows users to save and retrieve distributed data in the Ignite cache using any Memcached-compliant client, including Java, PHP, Python, Ruby, and other Memcached clients.
3.1.3.REST/HTTP
Ignite provides an HTTP REST client that communicates over HTTP or HTTPS in a RESTful manner. The REST API can perform various operations, such as reading from and writing to the cache, executing tasks, obtaining various metrics, and so on.
3.1.4.SSL/TLS
Ignite allows SSL-secured socket communication between all Ignite clients and server nodes.
3.1.5.Node.js (under development)
Ignite will provide a Node.js client in the future, which can perform all caching operations and execute SQL queries in JSON data stored in Ignite.
3.2. Deployment option
Apache Ignite can run standalone, in a cluster, in a Docker container, and in Apache Mesos and Hadoop YARN environments, on physical or virtual machines.
Public cloud: Ignite integrates natively with Amazon AWS and Google Compute Engine; for other cloud environments, it integrates with Apache jclouds, which supports most cloud providers.
Containers: Ignite runs easily in container environments and integrates with Docker to automatically build and deploy user code into Ignite before the server starts.
Resource management: Ignite integrates natively with Hadoop YARN and Apache Mesos, making it easy to deploy Ignite seamlessly into Hadoop and Spark environments.
Feature list
3.2.1. Zero deployment
Ignite nodes automatically become aware of custom classes, with no need to deploy them explicitly.
3.2.2. Dynamic mode change
Ignite stores objects in a binary manner without the need to deploy classes on server nodes.
3.2.3. Independent cluster
Ignite nodes automatically discover each other, which makes it easy to expand the cluster when necessary without restarting it: simply start new nodes and they join the cluster automatically.
3.2.4.Docker container
Docker can package Ignite and all of its dependencies into a standard image. After Docker downloads the Ignite image, it can deploy the user's application into Ignite, configure the nodes, and automatically start the fully configured Ignite node.
3.2.5. Public cloud
For public cloud environments, Ignite integrates natively with Amazon AWS and Google Compute Engine; for other cloud environments, it integrates with Apache jclouds, which supports most cloud providers.
3.2.6.Apache Mesos
Ignite provides native support for Apache Mesos, making it easy to deploy Ignite into Mesos-managed data centers, such as those running Hadoop and Spark.
3.2.7.Hadoop Yarn
Ignite provides native support for Hadoop Yarn, and Ignite can be easily deployed into Hadoop and Spark environments.
Thank you for reading. This concludes "what are the characteristics of Apache Ignite"; after studying this article you should have a deeper understanding of Apache Ignite's features, though specific usage should be verified in practice.