Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to understand the row storage and column storage of database

2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

How to understand the row storage and column storage of the database, many novices are not very clear about this, in order to help you solve this problem, the following small series will explain in detail for everyone, people who have this need can learn, I hope you can gain something.

Traditional relational databases, such as Oracle, DB2, MySQL, SQL SERVER, etc., use row-based storage. In the database based on row-based storage, data is stored according to row data as the basic logical storage unit, and the data in a row exists in the storage medium in a continuous storage form.

We know that today's data processing can be roughly divided into two categories, on-line transaction processing (OLTP) and on-line analytical processing (OLAP). OLTP is the main application of traditional relational database, which is used to execute some basic and daily transaction processing, such as adding, deleting, modifying and checking database records; OLAP is the main application of distributed database, which has no high requirement for real-time performance, but processes a large amount of data, which is usually applied to complex dynamic report systems.

The main differences between OLTP and OLAP

OLTP and OLAP, why is there a significant difference in database application categories? In fact, this is caused by different database storage modes.

Row storage and column storage

Traditional relational databases, such as Oracle, DB2, MySQL, SQL SERVER, etc., use row-based storage. In the database based on row-based storage, data is stored according to row data as the basic logical storage unit, and the data in a row exists in the storage medium in a continuous storage form.

Column-based storage is relative to row-based storage, and new distributed databases such as Hbase, HP Vertica, EMC Greenplum, etc. all adopt column-based storage. In column-based databases, data is stored as a basic logical storage unit, and data in a column exists in a continuous storage form in the storage medium.

Applicable scenarios for row storage include:

1. Suitable for random addition, deletion and correction operations;

2. Query operations that need to select all attributes in the row;

Operations that require frequent inserts or updates are more dependent on index and row size.

In practice, we will find that there is an inherent "defect" in reading data from a row database. For example, even if the selected query targets only involve a few attributes, because these target data are buried in each row data unit, and the row unit is often very large, the application program must read every complete row record, thus greatly reducing the reading efficiency. In this regard, the optimization scheme given by row database is to add "index." In OLTP type applications, the query operation steps can be simplified and the query efficiency can be improved by means of index mechanism or table partitioning.

But for OLAP applications with massive data background (such as distributed database, data warehouse, etc.), the database stored in rows is somewhat "inadequate." It takes a lot of time and resources to establish indexes and materialized views in row-based databases, so it is not worth the loss. It cannot fundamentally solve the problems of query performance and maintenance cost, and it is not suitable for application scenarios such as data warehouses. Therefore, databases based on column-based storage appear later.

For data warehouse and distributed database, in most cases, it will summarize data from various data sources, and then analyze and feed back. Most of its operations are carried out around the data of the same column attribute. When querying the data records of a certain attribute, the column database only needs to return the values related to the column attribute. In the scenario of large data volume query, the column database can efficiently assemble the values of each column in memory and finally form a relational record set. Therefore, it can significantly reduce IO consumption. And reduce query response time, very suitable for data warehouse and distributed applications.

Examples of column storage engine applications include:

1. During the query process, operations for each column can be executed concurrently (SMP), *** aggregating complete record sets in memory, *** may reduce query response time;

2, can efficiently find data in the data column, without maintaining the index (any column can be used as an index), the query process can minimize irrelevant IO, avoid full table scanning;

Because each column is stored independently and the data type is known, the compression algorithm can be dynamically selected for factors such as the data type and data size of the column to improve the utilization of physical storage; if there is no data in a column of a row, then when the column is stored, the value of the column can not be stored, which will save more space than row storage.

Of course, like row databases, columnar storage has its own less applicable scenarios.

It mainly includes:

HarmonyOS Technology Community

Transaction scenarios where data needs to be updated frequently

A small database scenario with few column attributes in the table

Not suitable for real-time operations involving deletions and updates

With the development of column-based database, traditional row-based database has added column-based storage support, forming a database system with two storage methods. For example, with the introduction of Oracle 12c in memory component, Oracle database has a dual-mode data storage mode, which can support mixed-type applications. Of course, column-based databases also have support for row-based storage such as HP Vertica.

Did reading the above help you? If you still want to have further understanding of related knowledge or read more related articles, please pay attention to the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report