2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article introduces the main characteristics of C-Store, based on the C-Store paper.
Background knowledge
When C-Store was proposed, row storage was the mainstream in databases because it suits OLTP workloads; such systems are called write-optimized. Systems built for OLAP workloads, such as data warehouses, are called read-optimized.
CPU speed is growing much faster than disk bandwidth, so it makes sense to spend some CPU cycles in exchange for disk bandwidth.
There are two ways to do this: (1) coding, which represents data elements in a more compact form; (2) dense packing (densepack), which stores values compactly and which I understand to be compression.
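As a rough illustration of trading CPU for disk bandwidth, the sketch below run-length encodes a sorted column. Run-length encoding is used here only as one familiar example of a compact coding; the function names are hypothetical and this is not claimed to be the exact scheme C-Store uses.

```python
from itertools import groupby


def rle_encode(column):
    """Encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def count_equal(runs, target):
    """Count matching rows by reading the encoded runs, without decoding."""
    return sum(count for value, count in runs if value == target)


# A sorted column has long runs, so it encodes into few pairs.
ages = [25, 25, 25, 30, 30, 41, 41, 41, 41]
encoded = rle_encode(ages)
```

Fewer pairs than values means fewer bytes to read from disk, at the cost of a little extra CPU work to encode and decode.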
At that time, relational databases could not support query-intensive OLAP scenarios well, so the authors propose a new column-oriented database, C-Store. The paper covers a lot of ground and combines many ideas, including several new features: (1) a hybrid architecture with a write-optimized part and a read-optimized part; (2) a storage model in which redundant data is stored in different sort orders to support fast retrieval; (3) efficient compression, with queries processed directly on compressed data; (4) a column-oriented query optimizer; (5) data recovery; (6) snapshot isolation to avoid 2PC.
This article covers (1), (2), and (5).
(1) Hybrid architecture
Optimizing for writes and optimizing for queries pull in opposite directions. For example, storing data in arrival order, like appending to a log, makes writes fast, but it is unfriendly to queries, which may run faster over data kept in a different order.
It is difficult for one model to suit both scenarios, so the paper's architecture uses two modules. One module handles fast writes: the Writeable Store (WS), the upper layer. The other provides efficient queries: the Read-optimized Store (RS), the lower layer. A connector, the Tuple Mover, moves data from WS to RS.
The authors expect WS to be much smaller than RS, small enough to be held in memory. This architecture is in fact similar to an LSM tree.
For simplicity, C-Store uses the same column-store engine to manage both WS and RS, but WS stores additional index information so that data can be located quickly.
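The division of labour between WS, RS, and the Tuple Mover can be sketched as a toy in-memory structure. This is a minimal sketch under simplifying assumptions: the class and method names are hypothetical, each row is a single (key, value) pair, and the move is done in one batch rather than incrementally.

```python
import bisect


class HybridStore:
    """Toy model: WS takes appends, RS stays sorted, tuple_mover merges."""

    def __init__(self):
        self.ws = []       # write side: (key, value) pairs in arrival order
        self.rs_keys = []  # read side: keys kept in sorted order
        self.rs_vals = []  # read side: values aligned with rs_keys

    def insert(self, key, value):
        # Writes are cheap: append only, no reordering.
        self.ws.append((key, value))

    def tuple_mover(self):
        # Merge everything in WS into RS, then empty WS.
        merged = sorted(
            list(zip(self.rs_keys, self.rs_vals)) + self.ws,
            key=lambda kv: kv[0],
        )
        self.rs_keys = [k for k, _ in merged]
        self.rs_vals = [v for _, v in merged]
        self.ws = []

    def range_query(self, low, high):
        # RS answers with a binary search; the small WS is scanned linearly.
        lo = bisect.bisect_left(self.rs_keys, low)
        hi = bisect.bisect_right(self.rs_keys, high)
        from_rs = list(zip(self.rs_keys[lo:hi], self.rs_vals[lo:hi]))
        from_ws = [(k, v) for k, v in self.ws if low <= k <= high]
        return from_rs + from_ws


store = HybridStore()
for k, v in [(30, "c"), (10, "a"), (20, "b")]:
    store.insert(k, v)
store.tuple_mover()    # RS now holds keys 10, 20, 30 in order
store.insert(15, "d")  # a newer write that still lives only in WS
result = store.range_query(10, 20)
```

A query has to consult both stores, which is why keeping WS small matters: the linear scan over WS stays cheap.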
(2) Storage model
Projection:
Each table can be bound to multiple projections. A projection is a group of some of the table's columns, and it is the projection that is actually stored on disk. Each projection can be stored in a different sort order, and every column of a table must appear in at least one projection. A projection bound to one table may also include columns from other tables (equivalent to repartitioning the table).
For example, a user table (name, age, salary) can be bound to two projections: P1 (name, age), ordered by age, and P2 (name, salary), ordered by salary.
In this way, the two queries that look up names by age and by salary can be routed to P1 and P2 respectively, and each runs quickly.
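The P1/P2 example can be sketched as follows. The sample rows and helper names are made up for illustration; each projection is modelled as a dictionary of column arrays stored in the projection's sort order, so a range lookup on the sort column becomes a binary search.

```python
import bisect

SCHEMA = ("name", "age", "salary")
# Hypothetical sample data for the user table.
users = [("Ann", 34, 5200), ("Bob", 25, 4100), ("Cid", 41, 4800), ("Dee", 29, 6300)]


def build_projection(rows, columns, sort_col):
    """Store the chosen columns as arrays, ordered by sort_col."""
    key = SCHEMA.index(sort_col)
    ordered = sorted(rows, key=lambda r: r[key])
    return {c: [r[SCHEMA.index(c)] for r in ordered] for c in columns}


def names_in_range(projection, sort_col, low, high):
    """Find names whose sort_col value lies in [low, high] by binary search."""
    keys = projection[sort_col]
    lo = bisect.bisect_left(keys, low)
    hi = bisect.bisect_right(keys, high)
    return projection["name"][lo:hi]


p1 = build_projection(users, ("name", "age"), "age")        # P1 order by age
p2 = build_projection(users, ("name", "salary"), "salary")  # P2 order by salary

young = names_in_range(p1, "age", 25, 30)
well_paid = names_in_range(p2, "salary", 5000, 7000)
```

The name column is stored twice, once per projection; that redundancy is the price paid for having each query hit data that is already in the right order.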
Because the columns of a table are scattered across projections, a complete row has to be reassembled when it is needed. Three concepts are involved here.
SID: segment ID. Each projection is horizontally partitioned into multiple segments, and the SID is the segment number.
SK: storage key, an auto-incrementing key assigned to each row within a segment. Values in different columns of the same segment that share a storage key belong to the same logical row, so the SK is in effect the row number, or array index. Together with the SID it is used to align rows across different projections.
The following figure is an example:
Join index: to reconstruct a complete row, records stored in different orders must be mapped to the same order; this is the role of the join index. For example, the mapping from projection2 to projection1 is a one-to-one mapping.
Join indexes can also form a path. Because the mappings are transitive, a projection3-to-projection2 mapping can be chained with the projection2-to-projection1 mapping. In this way, rows can be reassembled by following join indexes.
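A join index can be pictured as an array, stored in the order of one projection, whose entries are the (SID, SK) of the matching row in another projection. The sketch below assumes a single segment per projection (so every SID is 0), uses made-up sample data, and uses hypothetical helper names.

```python
# Hypothetical sample data: (name, age, salary).
users = [("Ann", 34, 5200), ("Bob", 25, 4100), ("Cid", 41, 4800), ("Dee", 29, 6300)]


def sort_order(rows, col):
    """Return original row ids in the order a projection sorted on col stores them."""
    return sorted(range(len(rows)), key=lambda i: rows[i][col])


p1_order = sort_order(users, 1)  # P1 is ordered by age
p2_order = sort_order(users, 2)  # P2 is ordered by salary

p1 = {"name": [users[i][0] for i in p1_order], "age": [users[i][1] for i in p1_order]}
p2 = {"salary": [users[i][2] for i in p2_order]}


def build_join_index(src_order, dst_order):
    """For each SK in the source projection, give (SID, SK) in the destination."""
    dst_pos = {row_id: sk for sk, row_id in enumerate(dst_order)}
    return [(0, dst_pos[row_id]) for row_id in src_order]


def compose(a_to_b, b_to_c):
    """Chain two join indexes into one (single-segment assumption)."""
    return [b_to_c[sk] for _sid, sk in a_to_b]


p1_to_p2 = build_join_index(p1_order, p2_order)
p2_to_p1 = build_join_index(p2_order, p1_order)

# Reassemble full rows in P1's order by following the join index into P2.
rows = [
    (p1["name"][sk1], p1["age"][sk1], p2["salary"][sk2])
    for sk1, (_sid2, sk2) in enumerate(p1_to_p2)
]
round_trip = compose(p1_to_p2, p2_to_p1)
```

Chaining P1 to P2 and back to P1 lands every row on its own storage key, which is a quick sanity check that the two mappings are one-to-one and transitive.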
When scanning data, the traditional iterator interface that returns one tuple per call is replaced by an iterator that returns data in batches of 64 KB each, which avoids too many method calls.
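The difference between the two iterator styles can be sketched with generators. As a simplifying assumption, the block size here is a number of values rather than 64 KB of data, and the function names are hypothetical.

```python
def scan_tuple_at_a_time(column):
    """Traditional style: one value handed over per iterator step."""
    for value in column:
        yield value


def scan_in_blocks(column, block_size):
    """Batched style: one slice of up to block_size values per iterator step."""
    for start in range(0, len(column), block_size):
        yield column[start:start + block_size]


column = list(range(10))
single_steps = sum(1 for _ in scan_tuple_at_a_time(column))
blocks = list(scan_in_blocks(column, 4))
total = sum(sum(block) for block in blocks)
```

Ten values take ten iterator steps one at a time but only three steps in blocks of four; the consumer then works through each block in a tight loop.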
(5) Data recovery
When a node fails but its data is not lost, you can simply restart the machine and replay the operations that were queued for it on other machines.
When both RS and WS of a node are lost, that node's data has to be reconstructed from the projections and join indexes on other nodes.
When only WS is lost, recovery can proceed quickly from the intact RS. This involves snapshot isolation, which is not covered in detail here.
Limitations
The paper does not explain specifically how projections are chosen or how load balancing is done.
Maintaining join indexes is troublesome, especially under updates. Join indexes are also needed to recover data, yet the paper reports no performance figures for failure recovery.
When the paper was completed, the system had not been fully developed: its functionality was incomplete and it was still a single-node system.