Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Introduction and comparison of OLAP and OLTP

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

To: https://www.cnblogs.com/hhandbibi/p/7118740.html

Introduction to OLTP and OLAP

Data processing can be broadly divided into two categories: online transaction processing (OLTP) and online analytical processing (OLAP). OLTP is the primary application of traditional relational databases for basic, day-to-day transaction processing, such as banking transactions. OLAP is the main application of data warehouse system, supporting complex analysis operations, emphasizing decision support, and providing intuitive query results.

OLTP system emphasizes the efficiency of database memory, the command rate of memory indexes, binding variables and concurrent operations.

OLAP system emphasizes data analysis, SQL execution market, disk I/O, partition, etc.

Comparison between OLTP and OLAP:

OLTP, also known as Online Transaction Processing, represents a highly transactional system, generally a highly available online system, dominated by small transactions and small queries. When evaluating its system, it is generally based on the number of Transactions and Execute SQL executed per second. In such a system, a single database often processes more than a few hundred transactions per second, or thousands, and the Select statement executes thousands or even tens of thousands per second. Typical OLTP systems include e-commerce systems, banks, securities, etc., such as eBay's business database in the United States, which is a typical OLTP database.

The most common bottleneck in OLTP systems is the CPU and disk subsystems.

(1) CPU bottlenecks are often manifested in the total number of logical reads and computational functions or procedures. The total number of logical reads is equal to the logical reads of a single statement multiplied by the number of executions. If the execution speed of a single statement is very fast, but the number of executions is very large, then it may also lead to a large total number of logical reads. The design approach and optimization approach is to reduce the logical read of individual statements, or to reduce the number of times they are executed. In addition, some computational functions, such as custom functions, decode, etc., frequently used, will also consume a lot of CPU time, resulting in increased load on the system, the correct design method or optimization method, you need to avoid the calculation process as much as possible, such as saving the calculation results to the statistical table is a good method.

(2) Disk subsystem In OLTP environment, its carrying capacity generally depends on its IOPS processing capacity. Because in OLTP environments, disk physical reads are generally db file sequential reads, that is, single block reads, but the number of reads is very frequent. Big performance issues arise when it is so frequent that no disk subsystem can handle its IOPS.

OLTP is more commonly used design and optimization methods for Cache technology and B-tree index technology, Cache determines that many statements do not need to obtain data from the disk subsystem, so Web cache and Oracle data buffer are very important to OLTP system. In addition, in terms of index use, the simpler the statement, the better, so that the execution plan is stable, and must use binding variables, reduce statement parsing, minimize table association, minimize distributed transactions, basically do not use partition technology, MV technology, parallel technology and bitmap index. Because concurrency is high, batch updates should be submitted quickly in batches to avoid blocking.

OLTP system is a system where data blocks change very frequently and SQL statements are submitted very frequently. For data blocks, data blocks should be stored in memory as much as possible. For SQL, variable binding technology should be used as much as possible to achieve SQL reuse, reduce physical I/O and duplicate SQL parsing, thus greatly improving database performance.

In addition to binding variables, there may be hot blocks that affect performance. When a block is read by multiple users simultaneously, Oracle uses Latch to serialize user actions in order to maintain data consistency. When one user gets a latch, other users have to wait, and the more users get the data block, the more obvious the wait. That's the problem with fast heat. This hot block may be a data block or a rollback end block. For data blocks, it is usually caused by uneven data distribution in the database. If it is an indexed data block, you can consider creating a reverse index to achieve the purpose of redistributing data. For rollback segment data blocks, you can appropriately add several rollback segments to avoid such contention.

OLAP, also known as online analytical processing (Online Analytical Processing) system, sometimes also known as DSS decision support system, is what we call a data warehouse. In such systems, the amount of statement execution is not the criterion, because the execution time of a statement can be very long and the data read is very large. Therefore, in such a system, the evaluation criteria are often the throughput (bandwidth) of the disk subsystem, such as how many MB/s of traffic can be achieved.

The throughput of the disk subsystem often depends on the number of disks. At this time, Cache basically has no effect. The read and write types of the database are basically db file scattered read and direct path read/write. Try to use a large number of disks and a relatively large bandwidth, such as a 4Gb fiber optic interface.

Partition technology and parallel technology are often used in OLAP system.

The importance of partitioning technology in OLAP system is mainly reflected in database management, such as database loading, which can be realized by partition exchange, backup can be realized by backup partition table space, deletion data can be deleted by partition, and as for the impact of partition on performance, it can make the scanning of some large tables very fast (only scanning a single partition). In addition, if the partition is combined with parallelism, it can also make the scanning of the entire table faster. In short, the main function of partitioning is administrative convenience, it does not absolutely guarantee the improvement of query performance, sometimes partitioning will bring performance improvement, sometimes it will be reduced.

Parallel technology in addition to the partition technology, in Oracle 10g, combined with RAC to achieve multi-node simultaneous scanning, the effect is also very good, can be a task, such as select full table scan, evenly distributed to multiple RAC nodes.

In OLAP systems, the BIND variable is not required because the overall system execution is small, analysis time can be ignored for execution time, and incorrect execution plans can be avoided. But OLAP can use bitmap index and materialized view in large quantities. For large transactions, try to optimize the speed. There is no need to submit quickly like OLTP, or even deliberately slow down the execution speed.

The real use of binding variables is in OLTP systems, which are often characterized by high user concurrency, dense user requests, and most of the SQL requested can be reused.

For OLAP systems, most of the time reporting jobs are running on the database, performing SQL operations that are basically aggregations, such as group by, and setting the optimizer mode to all_rows is appropriate. For some website-like databases with a lot of paging operations, it is better to set first_rows. But sometimes for OLAP systems, where we have pagination, we can consider using hints in each SQL. For example:

Select a.* from table a;

Separate design and optimization

Special care should be taken in design, such as in a high-availability OLTP environment, not to blindly apply OLAP technology.

For example, partition technology, assuming that instead of using partition keywords extensively, other fields are used as where conditions, then if it is a local index, multiple indexes will have to be scanned, and performance will become even lower. If it is a global index, it loses the meaning of partitioning.

Parallel technology is also the case, generally used to complete large-scale tasks, such as in real life, translation of a book, you can arrange multiple people, each person translated different chapters, so that you can improve the speed of translation. If you translate only one page of a book, assign different people to translate different lines, and then combine them, there is no need, because in the time allocated to work, one person may have already finished translating.

Bitmap indexes are the same, if used in an OLTP environment, they can easily cause blocking and deadlock. However, in an OLAP environment, OLAP queries may be speeded up because of its unique characteristics. MV is also basically the same, including triggers, etc., in the OLTP system with frequent DML, it is easy to become a bottleneck, even Library Cache waiting, and in the OLAP environment, it may improve the query speed because of proper use.

For OLAP systems, there is little room for optimization in memory. Increasing CPU processing speed and disk I/O speed is the most direct way to improve database performance. Of course, this also means increasing system cost.

For example, we have to aggregate hundreds of millions or billions of pieces of data. This kind of massive data is difficult to operate in memory. At the same time, it is unnecessary, because these data are rarely reused, and there is no practical significance in caching them. Moreover, it will also cause considerable physical I/O. So the bottleneck of this system is often disk I/O above.

For OLAP systems, SQL optimization is very important because it has a large amount of data, and doing full table scans and indexes makes a big difference in performance.

other

The templates available during the pre-Oracle 10g library build process are:

Data Warehouse

General Purpose (General Purpose)

New Database

Transaction Processing

The following templates are available for Oracle 11g version building:

General purpose transactions

custom database

data warehouse

An individual's understanding of these templates is:

On-line analytical processing (OLAP), large data volume, less DML. Using Data Warehouse Templates

OLTP (On-line Transaction Processing), data volume is small, DML is frequent, parallel transaction processing is many, but generally very short. Use general purpose or transaction templates.

Decision support system (DDS), typical operations are full table scan, long query, long transaction, but the number of general transactions is very small, often a transaction exclusive system.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report