2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
As an enterprise grows, it accumulates more and more data internally. Keeping business data across the whole enterprise consistent, accurate, and quick to serve is a problem every enterprise faces once its data reaches a certain volume. What follows is how our company approaches data governance; it is offered for reference only.
At present, the cumulative data is at the petabyte level, new data arrives at the terabyte level every day, and the data is mainly structured and semi-structured. Hive is used to build the data warehouse and process the data. The warehouse is built in strict accordance with warehouse construction standards, so that the data hierarchy is clear, data is fully consistent between layers, and the data produced by Hive's daily batch runs is accurate. The resulting data is then provided to the applications that consume it. We divide the data warehouse into four layers: the source data layer, the operational data layer, the common data warehouse layer, and the data mart layer.
1. Data sources include real-time data and offline data. Real-time data is collected through Kafka + Storm, and offline data is pulled from the various business systems through Sqoop. After collection, data goes into either stage or ods depending on its characteristics. Data in the stage area is also stored in ods after processing, and all downstream services are provided from ods (which data goes into the stage area is discussed later).
2. The data of the common data warehouse layer comes from ods. It consists of the detail layer (DWD) and the summary layer (DWS). The summary layer is further divided into lightly summarized data (DWB) and heavily summarized data (DWS).
3. The data mart layer (DM) mainly serves business requirements, including the data needed by application products, requested reports, and indicators. Dedicated databases and data exploration databases can also be created in this layer for business departments.
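The layer dependencies described above can be sketched as a small lookup table. This is only an illustration: the name "source" stands in for the Kafka/Storm and Sqoop feeds, and the inputs listed for dwb and dm are assumptions, since the text does not spell them out.

```python
# Allowed upstream layers for each warehouse layer.
UPSTREAM = {
    "stage": {"source"},            # "source" is an illustrative name for the raw feeds
    "ods": {"source", "stage"},     # data lands in ods directly or via stage
    "dwd": {"ods"},
    "dwb": {"dwd"},                 # assumption: light summaries read from dwd
    "dws": {"dwd", "dwb"},
    "dm": {"dwd", "dwb", "dws"},    # assumption: marts read from the common layer
}


def is_allowed(target, source):
    """Return True if `target` may read directly from `source`."""
    return source in UPSTREAM.get(target, set())


def build_order(upstream=UPSTREAM):
    """Order the layers so that each one is built after all of its upstreams."""
    done = ["source"]
    pending = dict(upstream)
    while pending:
        ready = sorted(l for l, ups in pending.items() if ups <= set(done))
        if not ready:
            raise ValueError("cycle in layer dependencies")
        done.extend(ready)
        for layer in ready:
            del pending[layer]
    return done[1:]  # drop the pseudo-layer "source"


print(build_order())
```

A check like `is_allowed("dwd", "stage")` could be run against job definitions to flag a job that skips a layer.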
The ods, dwd, dws, and DM layers of the warehouse are described separately below:
Ods layer construction:
Log data collected from both the web side and the app side is sent uniformly to Kafka, lightly processed by Storm, and stored in the stage database. For relatively large tables in the business database (such as financial payment tables, which receive hundreds of millions of records every day and whose transaction records from the last 60 days may still be updated), we first load them into stage by incremental extraction so that the data can enter the warehouse quickly every day, and then load them into the ods database after processing them according to business needs.
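One way to picture the stage-to-ods step for such tables is an upsert by primary key: the incremental batch in stage overwrites older versions of the same record in ods. The sketch below is a minimal in-memory illustration, not the Hive job itself; the field names txn_id, status, and updated_at are hypothetical.

```python
def merge_increment(ods_rows, stage_rows, key="txn_id", version="updated_at"):
    """Upsert stage rows into an ods snapshot, keeping the newest version per key."""
    merged = {row[key]: row for row in ods_rows}
    for row in stage_rows:
        current = merged.get(row[key])
        # Insert new keys; replace existing ones only if the stage row is not older.
        if current is None or row[version] >= current[version]:
            merged[row[key]] = row
    return sorted(merged.values(), key=lambda r: r[key])


ods = [
    {"txn_id": 1, "status": "pending", "updated_at": "2024-01-01"},
    {"txn_id": 2, "status": "paid", "updated_at": "2024-01-01"},
]
stage = [
    {"txn_id": 1, "status": "paid", "updated_at": "2024-01-03"},     # update to an old record
    {"txn_id": 3, "status": "pending", "updated_at": "2024-01-03"},  # new record
]
new_ods = merge_increment(ods, stage)
```

Here ISO-format date strings are compared as text, which orders them correctly; a real job would restrict the comparison to the window in which records can still change (60 days in the payment example).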
Dwd layer construction:
The dwd layer is a very important layer of the data warehouse. It is the main source of data for the upper layers and an important entry point for data governance, so it places many requirements on data processing. We build it mainly along the following lines:
(1) Fix the data granularity. The granularity of this layer is the same as that of the ods layer; both hold detail-level data. The difference is that test data is removed in this layer.
(2) Unify data standards. This layer must ensure that data with the same meaning follows one standard. For a name, for example, no matter what the field is called in the upstream business database, it is uniformly named name in every table in this layer. For gender, some business systems record F/M and others record male/female; every table in this layer stores male/female.
(3) Desensitize sensitive data. If the original data contains sensitive fields such as phone number or × × number, they must be desensitized at this layer, so that the tables here hold only desensitized data. A dedicated sensitive database is set up to back up and store the original sensitive values.
(4) Integrate multi-source data. When data of the same type is stored in different databases and tables in the business systems because of business needs, it must be unified into a single table in this layer.
(5) Follow a data model. Since this is data warehouse construction, it should follow a data modeling approach. The data model of this layer is built uniformly around business processes, using Kimball dimensional modeling for fast-changing business areas and Inmon normalized modeling for highly stable subject areas.
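Points (1) to (4) above can be sketched as a single cleaning pass over ods rows. This is a minimal illustration under stated assumptions, not our production Hive logic: the source field names (user_name, uname, sex, mobile), the is_test flag, the user_id key, and the masking rule (keep the first 3 and last 4 digits) are all hypothetical.

```python
# Hypothetical source field names mapped to the single dwd standard name.
FIELD_ALIASES = {"user_name": "name", "uname": "name", "sex": "gender", "mobile": "phone"}
GENDER_MAP = {"f": "female", "m": "male", "female": "female", "male": "male"}


def mask_phone(phone):
    """Keep the first 3 and last 4 characters and star out the rest."""
    if len(phone) < 8:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


def to_dwd(ods_rows):
    """Return (dwd_rows, sensitive_rows) built from detail-level ods rows."""
    dwd_rows, sensitive_rows = [], []
    for raw in ods_rows:
        # (2) unify field names
        row = {FIELD_ALIASES.get(k, k): v for k, v in raw.items()}
        # (1) same granularity as ods, minus test data
        if row.pop("is_test", False):
            continue
        # (2) unify values
        if "gender" in row:
            row["gender"] = GENDER_MAP.get(str(row["gender"]).strip().lower(), "unknown")
        # (3) back up the original value, then store only the masked one
        if row.get("phone"):
            sensitive_rows.append({"user_id": row.get("user_id"), "phone": row["phone"]})
            row["phone"] = mask_phone(row["phone"])
        dwd_rows.append(row)
    return dwd_rows, sensitive_rows


# (4) rows of the same type from two hypothetical business systems end up in one table
crm = [{"user_id": 1, "user_name": "Ann", "sex": "F", "mobile": "13812345678"}]
app = [
    {"user_id": 2, "name": "Bob", "gender": "male", "phone": "13900001111", "is_test": True},
    {"user_id": 3, "uname": "Cy", "gender": "M", "phone": "13700002222"},
]
dwd, sensitive = to_dwd(crm + app)
```

The rows in `sensitive` would go to the dedicated sensitive database, while `dwd` holds only masked values.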
Dws layer construction:
The dws layer contains two kinds of data: lightly summarized data (dwb) and heavily summarized data (dws). Lightly summarized data mainly stores simple statistics by user, product, protocol, and other business lines. Heavily summarized data is sourced from both dwd and dwb; it mainly records statistics that span multiple businesses or long time ranges, and most dws data is stored in wide tables.
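The difference between the two kinds of summary can be illustrated with a small sketch: a light summary per user per day, and a heavy summary that rolls those rows up into one wide row per user over the whole time span. The order fields (user_id, day, amount) and the output column names are hypothetical, and the heavy summary here covers only the long-time-span case, not statistics across multiple businesses.

```python
from collections import defaultdict


def light_summary(dwd_orders):
    """dwb-style: one row per (user_id, day) with order count and total amount."""
    acc = defaultdict(lambda: {"orders": 0, "amount": 0})
    for order in dwd_orders:
        stats = acc[(order["user_id"], order["day"])]
        stats["orders"] += 1
        stats["amount"] += order["amount"]
    return [{"user_id": u, "day": d, **s} for (u, d), s in sorted(acc.items())]


def heavy_summary(dwb_rows):
    """dws-style: one wide row per user, aggregated over all days."""
    wide = {}
    for r in dwb_rows:
        w = wide.setdefault(r["user_id"], {
            "user_id": r["user_id"], "active_days": 0, "total_orders": 0,
            "total_amount": 0, "first_day": r["day"], "last_day": r["day"],
        })
        w["active_days"] += 1
        w["total_orders"] += r["orders"]
        w["total_amount"] += r["amount"]
        w["first_day"] = min(w["first_day"], r["day"])
        w["last_day"] = max(w["last_day"], r["day"])
    return [wide[u] for u in sorted(wide)]


# Amounts are integers (for example, cents) to avoid floating-point rounding.
orders = [
    {"user_id": 1, "day": "2024-01-01", "amount": 100},
    {"user_id": 1, "day": "2024-01-01", "amount": 250},
    {"user_id": 1, "day": "2024-01-02", "amount": 50},
    {"user_id": 2, "day": "2024-01-01", "amount": 30},
]
dwb = light_summary(orders)
dws = heavy_summary(dwb)
```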
Dm layer construction:
The data in the data mart layer is mainly used to serve data directly. Its consumers include analysis reports, application product data, BI, Tableau, exploratory data, and so on. Multiple databases can be created in this layer as needed to meet business needs.