
What is the difference between a data warehouse and a database

2025-04-02 Update From: SLTechnology News&Howtos shulou


Shulou(Shulou.com)06/01 Report--

This article explains in detail the differences between a data warehouse and a database. The editor found it very practical and is sharing it here for reference; hopefully you will take something away from it.

The differences between a data warehouse and a database:

1. A database stores the original data without any processing, whereas a data warehouse is designed to meet the needs of data analysis: the source data goes through an ETL process, which consists of extraction, cleaning, transformation and loading.
2. The amount of data in a data warehouse is much larger than that in a database.
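The ETL steps named above (extraction, cleaning, transformation, loading) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the RAW_ORDERS rows, the field names, the "cents"/"dollars" unit column and the dw_orders table are all hypothetical and do not come from the article, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows as they might sit in an operational database.
RAW_ORDERS = [
    {"order_id": 1, "customer": " alice ", "amount": "12.50", "unit": "dollars"},
    {"order_id": 2, "customer": "bob", "amount": "1000", "unit": "cents"},
    {"order_id": 2, "customer": "bob", "amount": "1000", "unit": "cents"},   # duplicate
    {"order_id": 3, "customer": "", "amount": "7.00", "unit": "dollars"},    # no customer
]


def extract():
    """Extraction: read the source rows (here, simply copy the list)."""
    return list(RAW_ORDERS)


def clean(rows):
    """Cleaning: trim names, drop rows without a customer, drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["customer"].strip()
        if not name or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({**row, "customer": name})
    return cleaned


def transform(rows):
    """Transformation: unify units (everything in dollars) and name casing."""
    result = []
    for row in rows:
        amount = float(row["amount"])
        if row["unit"] == "cents":
            amount /= 100
        result.append((row["order_id"], row["customer"].title(), round(amount, 2)))
    return result


def load(conn, rows):
    """Loading: write the prepared rows into the (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO dw_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    load(conn, transform(clean(extract())))
    return conn.execute(
        "SELECT order_id, customer, amount_usd FROM dw_orders ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Only two of the four raw rows reach the warehouse table: the duplicate and the row without a customer are dropped during cleaning, and the amount recorded in cents is converted during transformation.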

What is a data warehouse?

A data warehouse, abbreviated DW or DWH, is a strategic collection of data of all types that supports decision-making at every level of an enterprise. It is created for analytical reporting and decision support. For enterprises that need business intelligence, it provides a basis for improving business processes and for monitoring time, cost, quality and control.

What can a data warehouse do? (a few examples)

Set annual sales targets. Targets should be based on historical reports rather than set arbitrarily.

Optimize business processes. For example, for a particular brand of mobile phone on an e-commerce platform: what age group made up the main buyers over the past five years, and in which season did most purchases occur? With these characteristics, the company can identify the target population and its main needs, adjust production dynamically, and plan warehouse inventory.
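The two questions in the phone example (main age group of buyers, busiest season) are typical aggregate queries. The sketch below is a hypothetical illustration: the PURCHASES data, the ten-year age bands and the northern-hemisphere season mapping are assumptions made for the example, not details from the article.

```python
import sqlite3

# Hypothetical purchase history: (buyer_age, purchase_month).
PURCHASES = [(23, 11), (27, 11), (25, 12), (34, 6), (22, 11), (41, 3), (26, 12), (29, 11)]


def season(month):
    """Map a month to a season (assumes northern-hemisphere seasons)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"


def age_band(age):
    """Group ages into ten-year bands such as '20-29'."""
    low = age // 10 * 10
    return f"{low}-{low + 9}"


def build_conn():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (age_band TEXT, season TEXT)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?)",
        [(age_band(age), season(month)) for age, month in PURCHASES],
    )
    return conn


def top_group(conn, column):
    """Return the most frequent value of a column together with its count."""
    if column not in ("age_band", "season"):  # fixed whitelist, never user input
        raise ValueError(f"unsupported column: {column}")
    return conn.execute(
        f"SELECT {column}, COUNT(*) AS n FROM purchases "
        f"GROUP BY {column} ORDER BY n DESC, {column} LIMIT 1"
    ).fetchone()


if __name__ == "__main__":
    conn = build_conn()
    print(top_group(conn, "age_band"))
    print(top_group(conn, "season"))
```

With this sample data the largest buyer group is the 20-29 band and the busiest season is autumn, which is the kind of result that would feed production and inventory planning.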

Characteristics of a data warehouse

The data warehouse is subject-oriented.

Unlike traditional databases, data warehouses are organized around subjects (topics). So what is a subject? A subject is a higher-level abstraction: the object around which data in enterprise information systems is consolidated, classified and analyzed. Logically, it corresponds to an object of analysis in some area of the enterprise's macro-level analysis. (In plain terms: a subject is a key aspect that users care about when making decisions with the data warehouse.) A subject usually relates to multiple operational information systems, whereas data in operational databases is organized around transaction-processing tasks, and those tasks are isolated from one another.

The data warehouse is integrated.

The data in a data warehouse is extracted from existing, scattered databases (MySQL and other relational databases). Operational databases are very different from the analytical databases used by a DSS (decision support system). First, the source data for each subject of the data warehouse is spread across the original databases with many duplicates and inconsistencies, and data from different online systems is tied to different application logic. Second, the summarized data in the data warehouse cannot be obtained directly from the original database systems. Therefore, data must be unified and consolidated before it enters the data warehouse. This is the most critical and most complex step in building a data warehouse. The work to be done is as follows:

Unify all inconsistencies in the source data, such as fields with the same name but different meanings, fields with different names but the same meaning, inconsistent units, inconsistent field lengths, and so on.

Summarize and compute the data. Summarization can be performed while the data is being extracted from the original databases, but much of it is performed inside the data warehouse, that is, after the data has been loaded.

The data in the data warehouse changes over time.

The data in the data warehouse cannot be updated by applications; that is, the analysis and processing performed by data warehouse users does not update the data. This does not mean, however, that the data never changes over its whole life cycle, from being integrated into the data warehouse to being finally deleted. Changing over time is one of the characteristics of a data warehouse, and it shows up in the following three ways:

The data warehouse keeps adding new data over time. The data warehouse system must continually capture changed data in the OLTP databases and append it to the data warehouse; in other words, it continually generates snapshots of the OLTP databases and adds them to the data warehouse after unified integration. Existing snapshots are never changed: when new changes are captured, a new snapshot is generated and added, and the original snapshots are left unmodified.

The data warehouse keeps deleting old data over time. Data in the data warehouse also has a retention period, and once that period has passed, the expired data is deleted. The retention period in a data warehouse is simply much longer than in an operational environment. Generally speaking, an operational environment keeps only 60-90 days of data, whereas a data warehouse needs to keep data for much longer (for example, 5-10 years) to support trend analysis by a DSS.

The data warehouse contains a large amount of summarized data, much of which is time-related: for example, data is often aggregated by time period or sampled at regular intervals. This data has to be re-aggregated as time passes. For this reason, data in the data warehouse always carries a time field that indicates the historical period it belongs to.

The data in the data warehouse is immutable.
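The behaviour described above (new snapshots are appended, existing snapshots are never modified, and snapshots past the retention period are dropped) can be pictured with a small sketch. The Snapshot and SnapshotStore classes and the five-year retention value are hypothetical names and numbers chosen for illustration; they are not taken from any particular warehouse product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)  # frozen: a stored snapshot cannot be modified afterwards
class Snapshot:
    taken_on: date  # the time field that marks the historical period
    rows: tuple     # immutable copy of the captured rows


class SnapshotStore:
    """Hypothetical append-only store with a retention period."""

    def __init__(self, retention_days):
        self.retention = timedelta(days=retention_days)
        self._snapshots = []

    def append(self, taken_on, rows):
        # New data is always added as a new snapshot; old ones stay untouched.
        self._snapshots.append(Snapshot(taken_on, tuple(rows)))

    def expire(self, today):
        # Drop snapshots older than the retention period; return how many went.
        cutoff = today - self.retention
        kept = [s for s in self._snapshots if s.taken_on >= cutoff]
        removed = len(self._snapshots) - len(kept)
        self._snapshots = kept
        return removed

    def dates(self):
        return [s.taken_on for s in self._snapshots]


if __name__ == "__main__":
    store = SnapshotStore(retention_days=5 * 365)  # roughly five years
    store.append(date(2015, 1, 1), [("account-1", 10)])
    store.append(date(2022, 1, 1), [("account-1", 20)])
    store.append(date(2024, 1, 1), [("account-1", 25)])
    print(store.expire(date(2025, 1, 1)), store.dates())
```

The only ways data changes in this sketch are appending a new snapshot and expiring an old one; no code path edits a snapshot in place.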

The data in the data warehouse is mainly used for enterprise decision analysis. The operations involved are mainly queries, and there are generally no modifications. The data reflects historical content over a long period: it is a collection of database snapshots taken at different points in time, together with data derived from those snapshots through statistics, summarization and reorganization, rather than data from online processing. Data processed online in the operational databases reaches the data warehouse only after integration. Once data has exceeded the data warehouse's retention period, it is deleted from the current data warehouse. Because the data warehouse performs only query operations, its management system is much simpler than a database management system. Many technical difficulties of a database management system, such as integrity protection and concurrency control, can almost be omitted in data warehouse management. However, because the amount of data queried is often very large, a data warehouse places higher demands on query performance and requires a variety of sophisticated indexing techniques. In addition, because the data warehouse serves the senior management of commercial enterprises, they expect more of the query interface's friendliness and of the way data is presented.

2. The difference between a data warehouse and a database

Before looking at the difference, we need to understand three concepts: what are database software, a database, and a data warehouse?

Database software: a kind of software (not the graphical client used to connect to a database). It implements the logical behavior of a database and belongs to the physical layer.

Database: a logical concept, a repository for storing data, implemented by database software. A database consists of many tables. Each table is two-dimensional and has many fields; the fields are laid out side by side, and data is written to the table row by row. The strength of database tables lies in their ability to express multi-dimensional relationships in two dimensions. Examples include Oracle, DB2, MySQL, Sybase and MS SQL Server.

Data warehouse: an upgrade of the database concept. Logically, there is no difference between a database and a data warehouse: both are places where data is stored by means of database software. In terms of data volume, however, a data warehouse is much larger than a database. A data warehouse is mainly used for data mining and data analysis, to help leaders make decisions.

In any IT architecture a database must exist, because there has to be a place to store data. Take today's online shopping and other e-commerce: the stock level of each item, the price of each item, the balance of each user's account, and so on, are all stored in a back-end database. Or, to take the simplest case, consider our WeChat, Weibo and QQ accounts and passwords. The back-end database must contain a user table with at least two fields, the user name and the password, and each user's data is stored in that table as a row. When we log in, we enter the user name and password, and the data is sent to the back end and matched against the table. If the match fails, an error is reported. This is a database: it works in the production environment, and everything tied to business applications uses a database.

A data warehouse is one of the technologies under BI (business intelligence). Because a database is tied to business applications, a single database cannot hold all of a company's data. Database tables are usually designed for a particular application. In the login function just described, for example, the user table has only those two fields and no others. That table suits the application and works without any problem, but it does not suit analysis. Suppose I want to know in which time period the number of users is largest, or which user shops the most in a year. For indicators like these, the table structure has to be redesigned. The concept of the data warehouse was introduced for data analysis and data mining: the table structure of a data warehouse is designed according to the analysis requirements, analysis dimensions and analysis indicators.

The difference between a database and a data warehouse is really the difference between OLTP and OLAP.
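To make the table-design point concrete, the sketch below contrasts an application-oriented user table (a single-row lookup at login) with an analysis-oriented table that can answer the two questions posed above. All table names, column names and sample rows are hypothetical, and the password_hash column is an assumption: the article only says the table holds a user name and a password.

```python
import sqlite3


def build():
    conn = sqlite3.connect(":memory:")
    # Application-oriented table: just enough to log a user in.
    conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, password_hash TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "h1"), ("bob", "h2")])
    # Analysis-oriented table (hypothetical): one row per order, with the
    # time dimensions and the measure needed for the analysis questions.
    conn.execute(
        "CREATE TABLE fact_orders "
        "(username TEXT, order_hour INTEGER, order_year INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
        [
            ("alice", 20, 2024, 30.0),
            ("alice", 21, 2024, 45.0),
            ("bob", 20, 2024, 100.0),
            ("bob", 9, 2023, 500.0),
            ("alice", 20, 2024, 10.0),
        ],
    )
    return conn


def login_ok(conn, username, password_hash):
    """Transactional-style access: match one row in the user table."""
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? AND password_hash = ?",
        (username, password_hash),
    ).fetchone()
    return row is not None


def busiest_hour(conn):
    """Analytical question: in which hour are the most distinct users active?"""
    return conn.execute(
        "SELECT order_hour, COUNT(DISTINCT username) AS n FROM fact_orders "
        "GROUP BY order_hour ORDER BY n DESC, order_hour LIMIT 1"
    ).fetchone()


def top_spender(conn, year):
    """Analytical question: which user spent the most in a given year?"""
    return conn.execute(
        "SELECT username, SUM(amount) AS total FROM fact_orders "
        "WHERE order_year = ? GROUP BY username ORDER BY total DESC LIMIT 1",
        (year,),
    ).fetchone()


if __name__ == "__main__":
    conn = build()
    print(login_ok(conn, "alice", "h1"), busiest_hour(conn), top_spender(conn, 2024))
```

The login check touches a single row, while the two analysis functions scan and aggregate many rows; the user table alone could answer neither analysis question because it has no time or amount columns.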

Operational processing, known as online transaction processing (OLTP, On-Line Transaction Processing), can also be called a transaction-oriented processing system. It handles the day-to-day online operations of specific business in the database, usually querying and modifying a small number of records. Users care most about response time, data security, data integrity and the number of concurrent users supported. Traditional database systems, the main means of data management, are used mainly for operational processing.

Analytical processing, known as online analytical processing (OLAP, On-Line Analytical Processing), generally analyzes historical data on particular subjects and supports management decisions.

Operational processing | Analytical processing
Detailed data | Summarized or refined data
Entity-relationship (E-R) model | Star or snowflake model
Stores current (instantaneous) data | Stores historical data; does not contain the latest data
Updatable | Read-only; append-only
Operates on one unit at a time | Operates on one set at a time
High performance requirements, short response time | Relaxed performance requirements
Transaction-oriented | Analysis-oriented
Small amount of data per operation | Large amount of data per operation
Supports day-to-day operations | Supports decision-making
Small data volume | Large data volume
Customer orders, inventory levels, bank account queries | Customer revenue analysis, market segmentation

3. Closing remarks

1. If there are any mistakes, please point them out and I will correct them promptly. If anything is unclear, feel free to leave a comment so we can discuss it.

2. You may think this is nothing special, but I take it seriously and treat it as my notes and experience, so that I can keep improving.

That is all for the differences between a data warehouse and a database. I hope the content above is of some help and that you learned something from it. If you found the article useful, feel free to share it so more people can see it.




© 2024 shulou.com SLNews company. All rights reserved.
