How to analyze the Cassandra model and architecture

2025-04-27 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

Many newcomers are unsure how to approach an analysis of the Cassandra data model and architecture. This article summarizes the main points and should help you get started.

The NoSQL world is large and varied. Apart from in-memory databases such as Redis, the most widely recognized systems in the NoSQL field are Cassandra, HBase, and MongoDB.

MongoDB is already covered by other posts in this blog series. The corresponding HBase material has not been written yet and still needs to be added.

1.1: high reliability

1: Cassandra uses gossip as the communication protocol between the nodes of a cluster. All nodes have equal status; there is no master and no slave.

As a result, a problem on any single node does not bring down the whole cluster.

2: Both Cassandra and HBase borrow ideas from Google Bigtable in their underlying architecture. Cassandra's innovation is to bring P2P (peer-to-peer), originally used in file-sharing architectures, into NoSQL.

One of the major characteristics of P2P is decentralization: all nodes in the cluster have equal status, which largely avoids the situation where the exit of a single node leaves the whole cluster unable to work.

In contrast, HBase adopts a Master/Slave design, which introduces the possibility of a single point of failure.
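The peer-to-peer idea can be illustrated with a toy simulation. The following is only a minimal sketch of gossip-style membership exchange, not Cassandra's actual gossip implementation: the function names, the node names, and the "merge views with one random peer per round" rule are all assumptions made for illustration.

```python
import random

def gossip_round(views, alive, rng):
    """One round: every live node merges its view with one random live peer."""
    for node in sorted(alive):
        peers = sorted(alive - {node})
        if not peers:
            continue
        peer = rng.choice(peers)
        merged = views[node] | views[peer]
        views[node] = set(merged)
        views[peer] = set(merged)

def rounds_until_converged(nodes, dead=(), seed=0, max_rounds=100):
    """Return how many rounds it takes until every live node knows all live nodes."""
    rng = random.Random(seed)
    alive = set(nodes) - set(dead)
    # Each node starts out knowing only about itself; there is no coordinator.
    views = {n: {n} for n in alive}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, alive, rng)
        if all(views[n] == alive for n in alive):
            return round_no
    return None  # did not converge within max_rounds

nodes = ["n1", "n2", "n3", "n4", "n5"]
all_up = rounds_until_converged(nodes)
one_down = rounds_until_converged(nodes, dead=["n3"])
```

Because no node plays a special role, removing `n3` only shrinks the set of peers; the remaining nodes still learn about each other.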

1.2: high scalability

Over time, the original size of a cluster becomes too small to store newly added data. NoSQL stores such as Cassandra support horizontal scaling: adding new nodes to an existing cluster is easy and operationally simple.
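One reason adding a node can be cheap is that keys are placed on a hash ring, so a new node takes over only part of one neighbour's range. The sketch below illustrates that idea with a simple consistent-hashing ring. It is an assumption-laden toy: the `Ring` class, the node names, one token per node, and MD5 as the hash are all chosen for illustration and do not describe Cassandra's real partitioner.

```python
import bisect
import hashlib

def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    """A toy consistent-hashing ring with one token per node."""

    def __init__(self, nodes):
        self._tokens = sorted((token(n), n) for n in nodes)

    def add_node(self, node):
        bisect.insort(self._tokens, (token(node), node))

    def owner(self, key):
        """The first node clockwise from the key's token owns the key."""
        idx = bisect.bisect_left(self._tokens, (token(key), ""))
        return self._tokens[idx % len(self._tokens)][1]

ring = Ring(["node-a", "node-b", "node-c"])
keys = [f"user{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")  # scale out by one node
after = {k: ring.owner(k) for k in keys}

# Keys whose owner changed; every one of them now belongs to the new node.
moved = [k for k in keys if before[k] != after[k]]
```

Keys that did not move stay on their original node, so existing nodes do not have to reshuffle data among themselves.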

1.3: eventual consistency

It is worth stressing eventual consistency here, because it is the model Cassandra adopts.

Eventual consistency means that the multiple replicas of a data object in a distributed system may be inconsistent for a short period of time, but after a while the replicas converge to the same value.

In terms of the CAP theorem, Cassandra gives priority to AP: availability and partition tolerance.
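To make "inconsistent for a short time, consistent in the end" concrete, here is a minimal sketch in which each replica is a plain dictionary and conflicts are resolved by keeping the value with the newest timestamp. The helper names `write` and `sync`, the key, and the example addresses are assumptions for illustration; this is not Cassandra's repair machinery.

```python
def write(replica, key, value, timestamp):
    """Apply a write to one replica, keeping the value with the newest timestamp."""
    current = replica.get(key)
    if current is None or timestamp > current[1]:
        replica[key] = (value, timestamp)

def sync(a, b):
    """Exchange data so that both replicas hold the newest version of every key."""
    for key in set(a) | set(b):
        for source, target in ((a, b), (b, a)):
            if key in source:
                write(target, key, *source[key])

replicas = [{}, {}, {}]
for replica in replicas:
    write(replica, "email", "old@example.com", 1)

# A later update reaches only the first replica (for example, the others were unreachable).
write(replicas[0], "email", "new@example.com", 2)
inconsistent = len({r["email"] for r in replicas}) > 1

# Background synchronization spreads the newest value to the other replicas.
sync(replicas[0], replicas[1])
sync(replicas[1], replicas[2])
converged = len({r["email"] for r in replicas}) == 1
```

Between the update and the two `sync` calls, a read could return either value depending on which replica answers; afterwards every replica returns the new one.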

Generally speaking, most NoSQL key-value stores are very efficient at writes, since they are designed for inserting large amounts of data.

Read performance, however, depends on the access pattern.

1: Query by key. If a single specified key is read, the result is returned very quickly.

2: Range query. Because the target data may be stored on multiple nodes, several nodes have to be queried, so the speed is relatively slow.

3: Full table scan. Scanning the data of an entire table is, without doubt, very inefficient.
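The three read patterns above can be compared with a small model. This is a simplified sketch, assuming keys are spread over nodes by a hash of the key (plain modulo placement here, chosen for brevity); the node names and helper functions are hypothetical.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def node_for(key):
    """Pick the node that stores a key from a hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

def build_cluster(rows):
    cluster = {n: {} for n in NODES}
    for key, value in rows.items():
        cluster[node_for(key)][key] = value
    return cluster

def get(cluster, key):
    """Query by key: exactly one node is contacted."""
    return cluster[node_for(key)].get(key), 1

def range_query(cluster, start, end):
    """Range query: hashing scatters neighbouring keys, so every node is asked."""
    hits = {}
    for node_rows in cluster.values():
        hits.update({k: v for k, v in node_rows.items() if start <= k <= end})
    return hits, len(cluster)

def full_scan(cluster):
    """Full table scan: every row on every node is read."""
    rows = {}
    for node_rows in cluster.values():
        rows.update(node_rows)
    return rows, len(rows)

cluster = build_cluster({f"user{i:03d}": i for i in range(100)})
value, nodes_for_get = get(cluster, "user042")
hits, nodes_for_range = range_query(cluster, "user010", "user019")
all_rows, rows_read = full_scan(cluster)
```

The key lookup touches one node, the range query touches all three nodes to return ten rows, and the full scan reads all one hundred rows.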

Data model

Because Cassandra uses a data model similar to that of Bigtable, Cassandra and HBase share many of the same concepts.

Viewed abstractly, Cassandra as a whole is a five-dimensional space: keyspace, column family, row key, super column, and column.

1:column

Column: the smallest data unit in Cassandra. It is a triple of name, value, and timestamp:

    { // this is a column
      name: "Beautiful New World",
      value: "gpcuster@gmali.com",
      timestamp: 123456789
    }

SuperColumn: super column

Think of a SuperColumn as an array of Columns: it has a name and contains a series of Columns.

In JSON form, a SuperColumn looks like this:

    { // this is a SuperColumn
      name: "Beautiful New World",
      // contains a series of Columns
      value: {
        street: {name: "street", value: "1234 x street", timestamp: 123456789},
        city: {name: "city", value: "san francisco", timestamp: 123456789},
        zip: {name: "zip", value: "94107", timestamp: 123456789},
      }
    }

Both Columns and SuperColumns are name/value pairs. The biggest difference is that a Column's value is a "String", while a SuperColumn's value is a map of Columns.
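The difference between the two can be written down as two small types. This is only a sketch of the structures described above, not a Cassandra client API; the class layout, the `add` helper, and the super column name `homeAddress` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Column:
    """The smallest unit: a (name, value, timestamp) triple."""
    name: str
    value: str
    timestamp: int

@dataclass
class SuperColumn:
    """A named container whose value is a map of Columns keyed by column name."""
    name: str
    value: Dict[str, Column] = field(default_factory=dict)

    def add(self, column: Column) -> None:
        self.value[column.name] = column

address = SuperColumn(name="homeAddress")  # hypothetical name
address.add(Column("street", "1234 x street", 123456789))
address.add(Column("city", "san francisco", 123456789))
address.add(Column("zip", "94107", 123456789))

city = address.value["city"].value
```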

2:Column family

A column family is a structure that contains many rows. You can think of it as a table in an RDBMS. Each row consists of a key supplied by the client and a series of Columns associated with that key. The structure looks like this:

    UserProfile = { // this is a ColumnFamily
      phatduckk: { // this is the key of a row in the ColumnFamily
        // these are the Columns under this key
        username: "gpcuster",
        email: "gpcuster@gmail.com",
        phone: "6666"
      }, // the first row ends
      ieure: { // this is another key in the ColumnFamily
        // these are the Columns under this key
        username: "pengguo",
        email: "pengguo@live.com",
        phone: "888",
        age: "66"
      },
    }

A ColumnFamily can be of type Standard or of type Super. The example we just saw is a ColumnFamily of type Standard. In the other form, a row does not hold plain Columns directly; each entry in the row is a SuperColumn, so the row itself looks like a small column family.

As follows:

    AddressBook = { // this is a Super type ColumnFamily
      phatduckk: { // key
        friend1: {street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA"},
        John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
        Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
        Tod: {street: "Jerry street", zip: "54556", city: "Cartoon", state: "CO"},
        Bob: {street: "Q Blvd", zip: "24252", city: "Nowhere", state: "MN"},
        ...
      }, // row ends
      ieure: { // key
        joey: {street: "An ave", zip: "55485", city: "Hell", state: "NV"},
        William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
      }
    }
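Both kinds of column family map naturally onto nested dictionaries. The sketch below uses a subset of the sample data from this article; the `family_type` helper is a hypothetical function written for this illustration, and timestamps are left out for brevity.

```python
# Standard column family: row key -> {column name: value}
user_profile = {
    "phatduckk": {"username": "gpcuster", "email": "gpcuster@gmail.com", "phone": "6666"},
    "ieure": {"username": "pengguo", "email": "pengguo@live.com", "phone": "888", "age": "66"},
}

# Super column family: row key -> {super column name: {column name: value}}
address_book = {
    "phatduckk": {
        "friend1": {"street": "8th street", "zip": "90210", "city": "Beverley Hills", "state": "CA"},
        "John": {"street": "Howard street", "zip": "94404", "city": "FC", "state": "CA"},
    },
    "ieure": {
        "joey": {"street": "An ave", "zip": "55485", "city": "Hell", "state": "NV"},
    },
}

def family_type(column_family):
    """Return "Super" if row entries are maps of columns, otherwise "Standard"."""
    for row in column_family.values():
        for entry in row.values():
            return "Super" if isinstance(entry, dict) else "Standard"
    return "Standard"  # an empty family is treated as Standard in this sketch

# Rows in the same family need not have the same columns.
extra_columns = set(user_profile["ieure"]) - set(user_profile["phatduckk"])
```

Note that the row `ieure` carries an `age` column that `phatduckk` does not have; unlike an RDBMS table, a column family does not force every row to share one fixed set of columns.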

Note that HBase also has the concept of a column family (CF).

3:keySpace

A keyspace is the outermost data structure. Usually an application has only one keyspace. Put simply, think of a keyspace as the database in an RDBMS.

Compare this with the three-level structure of a relational database:

Database -> table -> column

The structure of Cassandra is as follows:

Keyspace -> column family -> [column | super column]
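The path from keyspace down to a single value can be sketched as a walk through nested dictionaries. This is an illustration only: the `lookup` function and the keyspace name `MyApp` are assumptions, and the data is taken from the examples in this article.

```python
def lookup(keyspaces, keyspace, column_family, row_key, *names):
    """Walk keyspace -> column family -> row -> [super column ->] column."""
    if len(names) not in (1, 2):
        raise ValueError("pass a column name, or a super column name plus a column name")
    node = keyspaces[keyspace][column_family][row_key]
    for name in names:
        node = node[name]
    return node

keyspaces = {
    "MyApp": {  # hypothetical keyspace name
        "UserProfile": {  # Standard column family
            "ieure": {"username": "pengguo", "email": "pengguo@live.com"},
        },
        "AddressBook": {  # Super column family
            "ieure": {"joey": {"street": "An ave", "city": "Hell", "state": "NV"}},
        },
    }
}

# Four coordinates for a Standard column family ...
email = lookup(keyspaces, "MyApp", "UserProfile", "ieure", "email")
# ... and five when the column sits inside a super column.
city = lookup(keyspaces, "MyApp", "AddressBook", "ieure", "joey", "city")
```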
