2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article looks at the storage types used by NoSQL databases: key-value, column, and document storage. It then covers data partitioning and placement strategies.
The role of storage
If consistency and related theory are what tie big data components and NoSQL together, then storage plays the role of the data structure. A given data structure calls for corresponding algorithms, and the overall processing model is built on a specific storage layout and replica strategy.
General introduction to storage
Traditional relational databases usually store and process data in two-dimensional tables. The main reason is that the relational model is relatively simple, easy to implement, and logically clear.
Even when the data scale is relatively large, this table structure can meet business needs. Under today's massive data volumes, however, the two-dimensional table is no longer efficient or flexible enough in some application scenarios. So, from the point of view of the database model, NoSQL proposes a new model.
A brand-new design
NoSQL usually does not require a fixed table structure based on the relational model; that is, no schema has to be defined in advance.
Storage types of NoSQL
1: key-value type
1.1.1: data model
The key-value storage model comes from the hash table: each entry consists of a specific key and the value associated with it.
key -> value
A pure key-value structure weakens the structure of the data. Usually only simple operations such as get and set are needed. In massive data processing, the main strengths of the key-value design are that the model is simple, easy to implement, and well suited to querying and modifying data by key.
However, for batch reads and batch updates, key-value storage is at an obvious efficiency disadvantage, and the model does not support particularly complex data operations.
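The get/set model described above can be sketched in a few lines. This is a toy in-memory store, not the API of Redis or Voldemort; the class name `KeyValueStore` and the method `find_by_value` are made up for illustration.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # The store does not interpret the value; it is an opaque blob.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup by key is a single hash-table access.
        return self._data.get(key, default)

    def find_by_value(self, predicate):
        # There is no index on values, so filtering by value
        # has to scan every entry.
        return [k for k, v in self._data.items() if predicate(v)]


store = KeyValueStore()
store.set("user:1", b"alice")
store.set("user:2", b"bob")
```

Looking up `store.get("user:1")` touches one entry, while `find_by_value` walks the whole store, which is the kind of operation where the model is at a disadvantage.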
1.1.2: examples
Redis, Voldemort
1.1.3: application scenario
In-memory caching, mainly to handle high access loads on data; also used in some logging systems.
1.1.4: advantages and disadvantages
Advantage: fast lookups. Disadvantage: the data is unstructured and is usually treated as string or binary data.
2: column type
2.1.1: data model
To put it simply, a traditional database stores data by "row", while a column database stores data by "column".
With column storage, data from the same column is stored on the same disk "page" as far as possible. Most such databases abstract this into the concept of a "column family", which merges multiple columns into a group. From a macro point of view, a column family is somewhat analogous to the value in a key-value pair.
This data model suits applications such as data analysis and data warehousing, which need fast lookups over large amounts of data.
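The difference between the two layouts can be illustrated with plain Python structures. This is only a sketch of the idea, with made-up sample data; real column stores such as Cassandra and HBase organize data on disk in far more elaborate ways.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Beijing", "amount": 30},
    {"id": 2, "city": "Shanghai", "amount": 45},
    {"id": 3, "city": "Beijing", "amount": 25},
]


def to_columns(records):
    """Regroup row-oriented records so each column's values sit together."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column reads only that column's values,
# not every field of every record.
total_amount = sum(columns["amount"])
```

An analytical query that touches one or two columns out of many benefits most from this layout.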
2.1.2: examples
Cassandra and HBase
2.1.3: application scenario
Distributed file systems; data from the same column is stored together.
2.1.4: advantages and disadvantages
Fast lookups and strong scalability; it is easier to scale out in a distributed way.
3: document type
3.1.1: data model
Data is stored mainly as JSON or JSON-like documents, which are self-describing.
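A minimal sketch of the document model, using Python dicts and the standard `json` module. The `find` helper and the sample documents are hypothetical; this is not MongoDB's query API.

```python
import json

# Documents in one collection do not need to share the same fields.
collection = [
    {"_id": 1, "name": "alice", "tags": ["admin"]},
    {"_id": 2, "name": "bob", "address": {"city": "Shanghai"}},
]


def find(docs, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Each document serializes to a self-describing JSON string.
serialized = json.dumps(collection[1])
matches = find(collection, name="bob")
```

Because the field names travel with each document, no structure has to be declared before inserting data.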
3.1.2: examples
MongoDB
3.1.3: application scenario
Web applications
3.1.4: advantages and disadvantages
Advantages: the requirements on data structure are not strict, and there is no need to define the structure in advance. The value is itself structured data, and real-time queries are possible.
Disadvantage: there is no unified query syntax.
Data partitioning and placement strategy
On top of the storage models above, we also need to consider how data is partitioned and placed. The main considerations are as follows:
1: the purpose of partitioning
Every technique has its premise, its conditions, and a reason to exist. Partitioning, to put it simply, is a divide-and-conquer technique that makes very large tables easier to handle. By dividing large tables and indexes into small, manageable pieces, it avoids treating each table as one large, monolithic object, which makes it possible to scale to large amounts of data.
The benefits of partitioning are as follows:
1.1.1: improve manageability
The most essential feature of partitioning is that each partition is a smaller unit of granularity. In keeping with the "divide and conquer" approach of earlier relational database systems, partitioning lets maintenance operations focus on a specific part of a table. For example, you can back up only some of the partitions instead of all the data.
1.1.2: improve performance
1.1.3: improve availability
The performance benefit deserves more detail. Once a database is partitioned, each partition can be processed and optimized on its own, which makes several further techniques available:
1.1.2.1: partition-wise joins
When two tables are joined and both are partitioned on the join key, a partition-wise join can split one large join into smaller joins between matching partitions.
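The idea can be sketched as follows, assuming both tables are partitioned on the join key with the same function. The helper names and sample tables are made up for illustration; this is not how any particular database implements it.

```python
NUM_PARTITIONS = 4


def partition_by_key(rows, key, num_partitions=NUM_PARTITIONS):
    """Place each row in a partition chosen by hashing its join key."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions


def partition_wise_join(left, right, key):
    """Join two tables by joining only partitions with the same index."""
    joined = []
    left_parts = partition_by_key(left, key)
    right_parts = partition_by_key(right, key)
    for left_part, right_part in zip(left_parts, right_parts):
        # Build a small lookup table for this partition only.
        index = {}
        for row in right_part:
            index.setdefault(row[key], []).append(row)
        for row in left_part:
            for match in index.get(row[key], []):
                joined.append({**row, **match})
    return joined


orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 2, "item": "pen"},
    {"user_id": 1, "item": "lamp"},
]
users = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
result = partition_wise_join(orders, users, "user_id")
```

Rows with equal keys always land in partitions with the same index, so no pair of partitions other than the matching ones needs to be compared.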
1.1.2.2: partition pruning
Partition pruning is the most effective way to improve performance through partitioning. For example, when querying in Hive or Impala, you should filter on the partition columns so that unnecessary partitions are not scanned.
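A small sketch of pruning, assuming data partitioned by month. The dictionary layout and the function name `query_amount` are illustrative only and do not reflect Hive or Impala internals.

```python
# Each partition holds the rows for one month (the partition key).
monthly_partitions = {
    "2024-01": [{"day": "2024-01-03", "amount": 10},
                {"day": "2024-01-20", "amount": 5}],
    "2024-02": [{"day": "2024-02-11", "amount": 7}],
    "2024-03": [{"day": "2024-03-02", "amount": 4}],
}


def query_amount(partitions, wanted_months):
    """Sum amounts, scanning only partitions that match the filter."""
    scanned = [month for month in partitions if month in wanted_months]
    total = sum(row["amount"] for month in scanned for row in partitions[month])
    return total, scanned


total, scanned = query_amount(monthly_partitions, {"2024-02"})
```

Only the February partition is read; the rows in the other two partitions are never touched.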
1.1.2.3: parallel updates and deletes
Update, delete, and merge statements can be executed in parallel across partitions. Queries and inserts can also run in parallel when they access database objects in different partitions.
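As a rough illustration, each partition can be handed to its own worker because no two workers touch the same rows. The sketch below uses Python's `concurrent.futures`; the data and the function `raise_prices` are made up.

```python
from concurrent.futures import ThreadPoolExecutor

price_partitions = [
    [{"id": 1, "price": 10}, {"id": 2, "price": 20}],
    [{"id": 3, "price": 30}],
    [{"id": 4, "price": 40}, {"id": 5, "price": 50}],
]


def raise_prices(partition):
    """Update every row of one partition and report how many rows changed."""
    for row in partition:
        row["price"] *= 2
    return len(partition)


# One task per partition; the partitions share no rows,
# so the updates need no coordination with one another.
with ThreadPoolExecutor(max_workers=3) as pool:
    updated_counts = list(pool.map(raise_prices, price_partitions))
```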
In general, tables are partitioned by a "partition key", which is one or more columns that determine the partition a row belongs to. The common schemes are "range partitions", "list partitions", "hash partitions", and "combined partitions".
2: range partition
Each partition is defined by a range of partition key values. The most widely used form is partitioning by time range.
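Range partitioning by time can be sketched with a sorted list of boundaries. The quarterly boundaries and partition names below are assumptions chosen for the example.

```python
from bisect import bisect_right
from datetime import date

# Each boundary is the first day of the next partition.
BOUNDARIES = [date(2024, 4, 1), date(2024, 7, 1), date(2024, 10, 1)]
PARTITION_NAMES = ["q1", "q2", "q3", "q4"]


def range_partition(day):
    """Return the partition whose date range contains the given day."""
    # bisect_right counts how many boundaries are <= day, which is
    # exactly the index of the partition the day falls into.
    return PARTITION_NAMES[bisect_right(BOUNDARIES, day)]
```

In this sketch the first and last partitions are open-ended, so any date before April 2024 maps to `q1` and any date from October 2024 onward maps to `q4`.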
3: list partition
Each partition is defined by an explicit list of partition key values. List partitioning suits columns that take discrete values.
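A minimal sketch of list partitioning, assuming a city column as the partition key. The region names, city lists, and the fallback partition `"default"` are all made up for illustration.

```python
# Each partition is defined by the discrete key values it accepts.
LIST_PARTITIONS = {
    "north": {"Beijing", "Tianjin"},
    "east": {"Shanghai", "Hangzhou"},
}


def list_partition(city):
    """Return the partition whose value list contains the city."""
    for name, values in LIST_PARTITIONS.items():
        if city in values:
            return name
    # Values not listed anywhere go to a catch-all partition.
    return "default"
```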
4: hash partition
A hash function is applied to the partition key value to choose the partition. Hash partitioning suits cases where data should be distributed roughly evenly across partitions.
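Hash partitioning can be sketched with a stable hash from the standard library. SHA-256 is used here only because it gives the same result in every process; the partition count and key format are assumptions for the example.

```python
import hashlib

NUM_PARTITIONS = 4


def hash_partition(key, num_partitions=NUM_PARTITIONS):
    """Map a partition key to a partition number with a stable hash."""
    # Python's built-in hash() of a str varies between processes,
    # so a digest is used to keep the placement reproducible.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Count how many of 10,000 sample keys land in each partition.
counts = [0] * NUM_PARTITIONS
for i in range(10000):
    counts[hash_partition(f"user-{i}")] += 1
```

With a well-mixed hash, the counts come out close to one quarter of the keys each, which is the even spread that hash partitioning is chosen for.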
5: range-hash combined partition
This combines range and hash partitioning: the table is first partitioned by range, and then each range is further divided by hash partitioning.
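The two-level scheme can be sketched by applying a range lookup first and a hash second. The half-year boundary, the bucket count, and the naming of partitions such as `h1_3` are assumptions for illustration.

```python
import hashlib
from bisect import bisect_right
from datetime import date

RANGE_BOUNDS = [date(2024, 7, 1)]   # first day of the second range
RANGE_NAMES = ["h1", "h2"]
HASH_BUCKETS = 4


def composite_partition(day, user_id):
    """Pick a range partition by date, then a hash sub-partition by user."""
    range_name = RANGE_NAMES[bisect_right(RANGE_BOUNDS, day)]
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % HASH_BUCKETS
    return f"{range_name}_{bucket}"
```

A query filtered by date can skip whole ranges, while rows inside one range are still spread across the hash buckets.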
Operations on partitions can run in parallel, and partitioning also improves the availability of data: when some of the data is unavailable because of a failure or for other reasons, the other partitions are not affected.
Note that partitions are transparent to the application.