Core Configuration and Concepts of a Cassandra Cluster

2025-07-19 Update From: SLTechnology News&Howtos shulou

Shulou (Shulou.com) 06/03 Report --

Cassandra is a distributed structured data store (a NoSQL database). Its data model is richer than that of a key-value database such as Redis, but more limited than that of a document database such as MongoDB. It suits applications such as data analysis or data warehousing, which need fast lookups over large volumes of data.

A Cassandra cluster is rich in features, and there are many scenarios to consider. To make good use of a cluster, you need to understand a number of its concepts. Here is a brief introduction to them.

Relation to relational database concepts: keyspace -> table -> column in Cassandra corresponds to database -> table -> column in a relational database.

Main cluster configuration:

- cluster_name: the cluster name. All nodes in the same cluster must use the same name.
- seeds: the seed nodes, given as a comma-separated list of IP addresses of machines in the cluster.
- storage_port: the port used for server-to-server connections. It generally does not need to be changed, but make sure no firewall blocks it.
- listen_address: the address the servers in the cluster use to communicate with each other. If left blank, the server's hostname is used by default.
- native_transport_port: the CQL native transport port, on which CQL clients interact with the server.

By default Cassandra uses port 7000 for cluster communication (port 7001 if SSL is enabled), port 9042 for client connections over the native protocol, port 7199 for JMX, and port 9160 for the obsolete Thrift interface.

Main directories in the cluster configuration file:

- data_file_directories: where data files are stored; one or more directories.
- commitlog_directory: where the commit log files are stored.
- saved_caches_directory: where saved caches are stored.

Cluster-related concepts:

Data Center and Rack

A data center is a logical collection of racks, for example machines connected to each other in the same building. A rack is a logical collection of nodes that are close to each other, for example all the physical machines on one rack.

Gossip and Failure Detection

Gossip is a P2P protocol that runs once per second and lets each node track the status of the other nodes; failure detection builds on it. Failure detection uses Phi Accrual Failure Detection, which computes a suspicion level indicating how likely it is that a node has failed. This is more flexible than a traditional heartbeat, whose plain up-or-down answer is unreliable.
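As a rough illustration of how an accrual detector turns heartbeat arrival times into a suspicion level, the sketch below assumes that heartbeat intervals are exponentially distributed, so that phi = elapsed / (mean_interval * ln 10). The class name PhiAccrualDetector, its methods and the sample timings are hypothetical; this is a simplified model, not Cassandra's actual implementation.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Simplified phi accrual detector (hypothetical sketch).

    Assumes exponentially distributed heartbeat intervals, so the chance
    that a heartbeat is still to come after `elapsed` seconds is
    exp(-elapsed / mean) and phi = -log10 of that probability.
    """

    def __init__(self, window_size=1000):
        # Sliding window of recent inter-arrival times, in seconds.
        self.intervals = deque(maxlen=window_size)
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Return the suspicion level at time `now`; higher means more suspect."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in range(5):               # heartbeats at t = 0, 1, 2, 3, 4 seconds
    detector.heartbeat(float(t))

print(detector.phi(5.0))         # one interval late: low suspicion (about 0.43)
print(detector.phi(24.0))        # twenty intervals late: high suspicion (about 8.7)
```

The suspicion level rises gradually with silence instead of flipping at a fixed timeout, so the caller can choose the threshold at which a node is treated as down.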
A missed heartbeat may only mean temporary network congestion, especially in the public cloud.

Snitches

A snitch defines how close each node in the cluster is to the other nodes, which determines which nodes to read from and write to. It is generally configured manually: when cassandra.yaml sets endpoint_snitch: GossipingPropertyFileSnitch, both the dc and the rack of the current node are configured in cassandra-rackdc.properties.

Rings and Tokens

Cassandra represents the data managed by the cluster as a ring. Each node in the ring is assigned one or more data ranges, described by tokens, which determine its position in the ring. A token is a 64-bit integer ID used to identify each partition, in the range -2^63 to 2^63-1. A hash algorithm computes the hash value of the partition key, and that value determines which node stores the data.

Virtual Nodes

With virtual nodes (vnodes), the original token range is split into a number of smaller token ranges, and each node holds several of them. By default each node gets 256 token ranges, that is, 256 vnodes (adjustable with num_tokens). Vnodes are on by default since 2.0. On nodes with weaker performance, num_tokens can be reduced accordingly.

Partitioners

A partitioner determines which vnode the data is stored on. It is a hash function that computes the hash value of each row's partition key. The code is in the org.apache.cassandra.dht package (DHT stands for distributed hash table). Murmur3Partitioner is the one mainly used at present.

Replication Strategies

The first replica is stored on the vnode that owns the token. The location of the other replicas is determined by the replication strategy (also called the replica placement strategy). There are two main strategies:

- SimpleStrategy places replicas on consecutive nodes around the ring, starting with the node indicated by the partitioner.
- NetworkTopologyStrategy lets you specify a different replication factor for each data center.
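The token ring and SimpleStrategy placement described above can be sketched in a few lines. This is a minimal model under stated assumptions: token_for uses a truncated MD5 digest as a stand-in for Murmur3 (only to get a signed 64-bit value), and the function names, the three-node ring and its token values are all hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Map a partition key to a signed 64-bit token.

    Stand-in hash: Cassandra's default partitioner uses Murmur3; a
    truncated MD5 digest is used here only to keep the sketch stdlib-only.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def replicas_simple(ring, token, replication_factor):
    """Pick replicas the way SimpleStrategy is described above.

    `ring` is a list of (token, node) pairs sorted by token. The first
    replica is the node owning the first ring token >= `token` (wrapping
    around); the rest are the next distinct nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    chosen = []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node not in chosen:       # skip further vnodes of the same node
            chosen.append(node)
        if len(chosen) == replication_factor:
            break
    return chosen


# Hypothetical ring with one token per node.
ring = [(-100, "node-a"), (0, "node-b"), (100, "node-c")]
print(replicas_simple(ring, 50, 2))    # ['node-c', 'node-a']
print(replicas_simple(ring, 150, 2))   # wraps around: ['node-a', 'node-b']
print(replicas_simple(ring, token_for("user:42"), 3))
```

With vnodes the same function still applies: the ring simply contains many (token, node) pairs per node, and the distinct-node check keeps two replicas from landing on the same machine.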
Within each data center, NetworkTopologyStrategy places replicas on different racks to maximize availability.

Consistency Levels

According to the CAP theorem, consistency, availability and partition tolerance cannot all be achieved at the same time. Cassandra offers tunable consistency by setting the minimum number of replicas that must respond to a read or a write. The available consistency levels include ANY, ONE, TWO, THREE, QUORUM and ALL, of which QUORUM and ALL give strong consistency. The condition for strong consistency is R + W > N, where R is the number of replicas read, W is the number of replicas written, and N is the replication factor. For example, with N = 3, reading and writing at QUORUM gives R = W = 2, and 2 + 2 > 3.

Queries and Coordinator Nodes

A client can connect to any node to perform read and write operations. The connected node is called the coordinator node, and it is responsible for read and write consistency, for example writing to multiple nodes and reading from multiple nodes.

Write Operations

When a write is performed, the data is first written to the commit log file, and the dirty flag in the commit log is set to 1. The data is then written to an in-memory memtable; each memtable corresponds to one table. When a memtable reaches a size limit, it is flushed to an SSTable on disk, and the dirty flag in the commit log is set back to 0.

Caching

There are three caches:

- key cache: caches the mapping from partition keys to row index entries; held on the JVM heap.
- row cache: caches frequently used rows; held off heap.
- counter cache: improves counter performance.

Hinted Handoff

Hinted handoff is a feature for write high availability. When a write request is sent to the coordinator, a replica node may be unavailable for various reasons (network, hardware and so on). The coordinator then temporarily saves the write request as a hint and replays it once the replica node comes back online. Hints are only kept for a limited window, three hours by default (max_hint_window_in_ms).

Tombstones

SSTable files are immutable, so deleted data is treated as an update that writes a tombstone. The tombstone suppresses the original value until compaction runs. The relevant setting is gc_grace_seconds (GCGraceSeconds), which defaults to 864,000 seconds, or 10 days.
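To make the tombstone behaviour concrete, here is a minimal sketch that models each SSTable as a dictionary of key -> (timestamp, value). It is a toy model under stated assumptions: the names TOMBSTONE, merge_read and compact are hypothetical, timestamps are plain seconds, and real compaction applies further checks before purging.

```python
GC_GRACE_SECONDS = 864_000   # default gc_grace_seconds: 10 days

TOMBSTONE = object()         # marker written in place of a deleted value


def merge_read(sstables, key):
    """Return the newest value for `key`, or None if missing or deleted."""
    newest = None
    for table in sstables:
        cell = table.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    if newest is None or newest[1] is TOMBSTONE:
        return None          # the tombstone suppresses older values
    return newest[1]


def compact(sstables, now):
    """Merge SSTables into one, keeping only the newest cell per key.

    Tombstones older than GC_GRACE_SECONDS are dropped; younger ones are
    kept so the delete can still reach replicas that missed it.
    """
    merged = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    return {
        key: cell
        for key, cell in merged.items()
        if not (cell[1] is TOMBSTONE and now - cell[0] > GC_GRACE_SECONDS)
    }


old = {"k": (100, "v1")}          # original write at t = 100
new = {"k": (200, TOMBSTONE)}     # delete recorded at t = 200

print(merge_read([old, new], "k"))                             # None
print(len(compact([old, new], now=300)))                       # 1: tombstone kept
print(len(compact([old, new], now=200 + GC_GRACE_SECONDS + 1)))  # 0: purged
```

The delete never modifies the older SSTable; the stale value only disappears physically when compaction rewrites the files.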
Tombstones older than gc_grace_seconds are cleaned up. A node that stays unavailable for longer than this period should be replaced.

Bloom Filters

A Bloom filter is a fast, probabilistic structure for testing whether an element is in a set. It can return a false positive but never a false negative. It works by mapping the data set onto a bit array and acts as a special kind of cache, which reduces unnecessary disk reads.

Compaction

SSTables are immutable, so compaction merges them into new SSTable files that no longer contain unwanted data, such as deleted data. There are three strategies:

- SizeTieredCompactionStrategy (STCS): the default strategy, suited to write-intensive workloads.
- LeveledCompactionStrategy (LCS): suited to read-intensive workloads.
- DateTieredCompactionStrategy (DTCS): for time- or date-based data.

Anti-Entropy and Repair

Cassandra uses an anti-entropy protocol, a kind of gossip protocol, to repair replicated data. There are two cases:

- Read repair: when a read finds data that is not up to date, the data is repaired at that point.
- Anti-entropy repair: run manually with nodetool repair.

Merkle Trees

Merkle trees, named after Ralph Merkle and also known as hash trees, are a kind of binary tree in which each parent node is the hash of its direct child nodes. They are used to reduce network I/O.

Staged Event-Driven Architecture (SEDA)

Cassandra adopts a staged event-driven architecture, as described in "SEDA: An Architecture for Well-Conditioned, Scalable Internet Services". A stage consists of an event queue, an event handler and a thread pool; a controller decides the scheduling and thread allocation of the stage. The main code is in org.apache.cassandra.concurrent.StageManager. The following operations are executed as stages:

- Read (local reads)
- Mutation (local writes)
- Gossip
- Request/response (interactions with other nodes)
- Anti-entropy (nodetool repair)
- Read repair
- Migration (making schema changes)
- Hinted handoff

System Keyspaces

The system keyspaces include system_traces and system_schema. In system_schema, the keyspaces, tables and columns tables store the definitions of keyspaces, tables and columns.
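Returning to the Bloom filter described earlier in this section, the idea of mapping a data set onto a bit array can be sketched as follows. This is a minimal illustration, not Cassandra's implementation: the class name BloomFilter, the default sizes and the use of salted SHA-256 as the hash family are all assumptions chosen to keep the example dependency-free.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (hypothetical sizes and hash family)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the code simple; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        """Yield the bit positions for `key`, one per salted hash."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(key))


sstable_filter = BloomFilter()
for partition_key in ("user:1", "user:2", "user:3"):
    sstable_filter.add(partition_key)

print(sstable_filter.might_contain("user:2"))   # True: the key was added
print(sstable_filter.might_contain("user:99"))  # usually False; a True here would be a false positive
```

A read can skip any SSTable whose filter answers False, which is where the saving in disk reads comes from; a True answer still has to be confirmed by reading the file.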
The other tables in system_schema are:

- materialized_views: stores the available views
- functions: user-defined functions
- types: user-defined types
- triggers: the trigger configuration of each table
- aggregates: aggregate definitions

The remaining system keyspaces are system_auth and system. The system keyspace contains:

- local, peers: store node information
- available_ranges, range_xfers: store token ranges
- materialized_views_builds_in_progress, built_materialized_views: track view builds
- paxos: stores Paxos state
- batchlog: stores the state of atomic batch operations
- size_estimates: stores size estimates for each table, used for Hadoop integration

Reference documentation:

https://www.2cto.com/database/201802/717564.html

https://blog.csdn.net/zhuwinmin/article/details/76066642

https://segmentfault.com/a/1190000015610357
