What's the difference between OLTP and OLAP? 05/09 Update SLTechnology News&Howtos

What's the difference between OLTP and OLAP?

2025-05-09 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/01 Report--

This article introduces the relevant knowledge of "what is the difference between OLTP and OLAP". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!

The difference between OLTP and OLAP

OLTP (on-line transaction processing) translates to online transaction processing, or online transaction processing system

OLAP (On-Line Analytical Processing) is translated into online analytical processing, or online analytical system.

Literally, OLTP does transaction processing and OLAP does analytical processing. From the point of view of the operation of the database, OLTP mainly adds, deletes and modifies the data, and OLAP is the query of the data.

Difference:

OLTP is mainly used to record the occurrence of certain business events, such as purchase behavior. When the behavior occurs, the system will record who did what, when and where. Such a row (or multiple rows) of data will update the data in the database in the way of additions, deletions and changes, requiring high real-time performance, strong stability, and ensuring the success of timely data updates. Common business systems such as ERP,CRM,OA belong to OLTP.

When the data accumulates to a certain extent and we need to make a summary and analysis of what happened in the past, we need to take out the data generated in the past period of time for statistical analysis, so as to obtain the information we want and provide support for the company's decision-making. at this time, we are doing OLAP.

Because the business data generated by OLTP is scattered in different business systems, and OLAP often needs to centralize different business data together for unified and comprehensive analysis, it is necessary to clean the corresponding data according to business analysis requirements and store them in the data warehouse, and then the data warehouse will provide OLAP analysis. So we often say that OLTP is the application of database and OLAP is the application of data warehouse. Here is a brief comparison with a picture.

Common operations in multidimensional analysis:

Here are the five most common operations in a data cube: slicing, slicing, rotating, rolling up, and drilling down.

Drill-down: the change between the different levels of dimension, from the upper to the next, or split the summary data into more detailed data, such as drilling through the total sales data of the second quarter of 2010 to view the monthly consumption data of April, May, and June of 2010, as shown above; of course, you can also drill into Zhejiang Province to see Hangzhou, Ningbo, Wenzhou. Sales figures for these cities.

Volume 1 (Roll-up): the reverse operation of drilling, that is, the aggregation from fine-grained data to high-level data, such as summarizing the sales data of Jiangsu, Shanghai and Zhejiang provinces to view the sales data of Jiangsu, Zhejiang and Shanghai, as shown in the figure above.

Slice: select specific values in the dimension for analysis, such as selecting only sales data for electronic products, or data for the second quarter of 2010.

Dice: select a specific range of data in the dimension or a batch of specific values for analysis, such as sales data from the first quarter of 2010 to the second quarter of 2010, or sales data for electronics and commodities.

Pivot: the interchange of the position of a dimension is like the conversion of rows and rows of a two-dimensional table, such as the interchange of product and region dimensions through rotation in the figure.

After investigating the mainstream open source OLAP engines in the market, it is found that there is not a system that can meet the query needs of various scenarios. The essential reason is that no system can be perfect in terms of data volume, performance, and flexibility at the same time, and each system needs to make a choice among these three in its design.

The system of MPP architecture (Presto/Impala/SparkSQL/Drill, etc.) has a good amount of data and flexibility support, but there is no guarantee for response time. As the amount of data and computational complexity increases, the response time slows down, from seconds to minutes, or even hours.

MPP is massively parallel processing (Massively Parallel Processor). In the database non-shared cluster, each node has an independent disk storage system and memory system, and the business data is divided into each node according to the database model and application characteristics. each data node connects to each other through a private network or a commercial general network, collaborates with each other and provides data library services as a whole. Non-shared database cluster has the advantages of complete scalability, high availability, high performance, excellent performance-to-price ratio, resource sharing and so on.

Disadvantages: unstable performance

Compared with the MPP system, the search engine architecture system (Elasticsearch, etc.) converts the data into an inverted index when entering the database, uses the Scatter-Gather computing model, sacrifices flexibility for good performance, and can achieve subsecond response on search queries. However, for the query based on scan aggregation, with the increase of the amount of data processed, the response time will be reduced to the level of minutes.

Disadvantages: unstable performance

The precomputing system (Druid/Kylin, etc.) pre-aggregates the data when it is stored in the database, further sacrificing flexibility for performance, in order to achieve second-order response to super-large data sets.

Disadvantages: inflexible

MPP and search engine systems cannot meet the performance requirements of very large data sets, so it is natural to consider precomputing systems. Druid is mainly for real-time Timeseries data, although we have similar scenarios, but the mainstream analysis is still for structured tables produced on a daily basis in data warehouses, so Kylin's MOLAP Cube scheme is the most suitable engine for large amounts of data.

This is the end of the content of "what's the difference between OLTP and OLAP". Thank you for reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.