2025-01-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article describes one way to monitor table data fluctuation and code value distribution fluctuation in big data development, covering the monitoring rules, the alarm design, and the configuration and log tables behind them.
Design Summary:
Task execution, monitoring, and alarming can be fully separated in the design. With that separation, task execution can stay focused on execution alone. Monitoring computes data statistics and data distributions according to a variety of monitoring rules. Alarming focuses on raising customized, flexible alarms based on the monitoring results. Monitoring is the core of the design, and task execution and alarming can be tailored around it to meet the needs of all parties. For now, the monitoring rules cover four aspects: whether a data partition was generated, the data volume of the partition, the fluctuation of that data volume, and the fluctuation of the code value distribution of table fields. Monitoring is responsible for running the batch and producing the data that alarming needs. Alarming then uses the monitoring output and the monitoring configuration to generate done or undone files and to raise alarms.
The done directory mirrors the original table directory: table/version/partition or date/done/a.done (or b.done, c.done). The path is built from the configured root directory plus the second half of the table path.
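The path rule above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `done_file_path`, the example roots, and the choice to write an `.undone` marker in the same `done` directory are all assumptions.

```python
from pathlib import PurePosixPath


def done_file_path(table_path: str, table_root: str, done_root: str,
                   rule_name: str, ok: bool = True) -> str:
    """Mirror a partition path under the done root and append done/<rule>.done.

    Hypothetical helper: the article only fixes the layout
    <done root>/<table>/<version>/<partition or date>/done/<x>.done.
    """
    # Keep the second half of the table path (everything below the table root).
    relative = PurePosixPath(table_path).relative_to(table_root)
    # Assumption: an undone marker lives next to the done markers.
    suffix = "done" if ok else "undone"
    return str(PurePosixPath(done_root) / relative / "done" / f"{rule_name}.{suffix}")


# Example with made-up roots and a made-up table path.
example = done_file_path(
    "/warehouse/db/t/20201205/dt=20201207", "/warehouse", "/monitor", "a"
)
```

Here `example` is `/monitor/db/t/20201205/dt=20201207/done/a.done`.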
1. Table data monitoring
What is monitored, and what is the goal?
Monitoring has two main purposes: alarming and interception. Interception stops downstream processing when a problem occurs, so any rule configured to intercept must also alarm. An alarm does not have to intercept; a delay alarm is one example.
1.1 Input
The scheduling time comes from the scheduling platform, at day granularity, and ends up in the partition field of the batch log, which makes backtracking possible. The other configuration needed for monitoring lives in the main table of the table data monitoring. Some special configuration is kept in auxiliary tables, such as the distribution sub-table.
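As a rough sketch of how a day-level scheduling date could drive a check, the snippet below fills the `#YYYYMMdd#` placeholder that the `where_condition` column of the configuration table uses, and lists the dates for a backtracking run. The function names `render_where` and `backfill_dates` are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def render_where(where_condition: str, run_date: date) -> str:
    """Replace the #YYYYMMdd# placeholder with the scheduling date."""
    return where_condition.replace("#YYYYMMdd#", run_date.strftime("%Y%m%d"))


def backfill_dates(start: date, end: date) -> List[date]:
    """List every scheduling date from start to end inclusive, for backtracking."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


clause = render_where("version=20201201 and dt=#YYYYMMdd#", date(2020, 12, 7))
```

With the example condition from the configuration table, `clause` becomes `version=20201201 and dt=20201207`.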
1.2 Calculation model
1.2.1 Whether the partition has been generated: show partitions | grep xx
1.2.2 Whether the partition's row count is greater than a threshold (default 0): select count(*) from db_table where {db_table_date_column} = f($input_date) and version=20201205
1.2.3 Fluctuation of the partition's row count: (current count - average count of the previous days) / average count of the previous days
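Rule 1.2.3 can be sketched in a few lines. The function name `count_fluctuation` and the convention of returning `None` when there is no usable history are assumptions made for this illustration.

```python
from typing import Optional, Sequence


def count_fluctuation(today_count: int,
                      previous_counts: Sequence[int]) -> Optional[float]:
    """Relative change of today's row count against the mean of previous days.

    Returns None when there is no history or the historical mean is zero,
    because the ratio is undefined in both cases (an assumed convention).
    """
    if not previous_counts:
        return None
    average = sum(previous_counts) / len(previous_counts)
    if average == 0:
        return None
    return (today_count - average) / average
```

For instance, 120 rows today against three previous days of 100 rows each gives a fluctuation of 0.2, which can then be compared with a configured threshold.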
1.2.4 Monitoring of fluctuation of data code value distribution
How to measure fluctuation of data distribution
Suppose the code value and data distribution of a metric are as follows:
| Code value | 2020-12-05 | 2020-12-06 | 2020-12-07 |
| --- | --- | --- | --- |
| a | 10% | 9% | 1% |
| b | 50% | 51% | 90% |
| c | 40% | 40% | 9% |
On 2020-12-07 the distribution shifts sharply, so an early warning is needed. The question is how to measure this fluctuation and where to set the warning threshold.
Treat the shares of a, b, and c as a vector. First take the average vector of the last week (excluding the current day):
$$(a_1, b_1, c_1)$$
Then take the current day's vector $(a_0, b_0, c_0)$ and compute the element-wise relative change:
$$\left(\frac{a_0 - a_1}{a_1},\ \frac{b_0 - b_1}{b_1},\ \frac{c_0 - c_1}{c_1}\right) = (a_3, b_3, c_3)$$
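Calculation model of data distribution fluctuation
Let the current day's vector be
$$a = (x, y, z, \ldots)$$
and the historical average vector be
$$b = (x_1, y_1, z_1, \ldots)$$
Then the fluctuation vector, with subtraction and division taken element-wise, is
$$c = (a - b) / b$$
The final result is
$$c = (x_2, y_2, z_2, \ldots)$$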
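The following is a minimal sketch of this calculation applied to the example table, dividing by the historical average in the same way as the row count rule in 1.2.3. The function names, the dictionary representation of a distribution, and the choice to return `None` for a code value whose historical share is zero are assumptions.

```python
from typing import Dict, List, Optional


def mean_distribution(history: List[Dict[str, float]]) -> Dict[str, float]:
    """Average share per code value over the historical days."""
    keys = {key for day in history for key in day}
    return {
        key: sum(day.get(key, 0.0) for day in history) / len(history)
        for key in sorted(keys)
    }


def distribution_fluctuation(current: Dict[str, float],
                             average: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Element-wise (current - average) / average; None where average is 0."""
    result: Dict[str, Optional[float]] = {}
    for key in sorted(set(current) | set(average)):
        base = average.get(key, 0.0)
        now = current.get(key, 0.0)
        result[key] = None if base == 0 else (now - base) / base
    return result


# Shares from the example table: 2020-12-05 and 2020-12-06 as history,
# 2020-12-07 as the current day.
history = [{"a": 0.10, "b": 0.50, "c": 0.40}, {"a": 0.09, "b": 0.51, "c": 0.40}]
current = {"a": 0.01, "b": 0.90, "c": 0.09}
fluct = distribution_fluctuation(current, mean_distribution(history))
```

For 2020-12-07 this gives roughly -0.895 for a, +0.782 for b, and -0.775 for c. All three changes are far larger than the day-to-day movement between 2020-12-05 and 2020-12-06, which is what a threshold on the absolute value would catch.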
2. Alarm design
Each run of the alarm task depends on the same day's batch partition in the partition data monitoring log. In other words, the alarm task can start only after at least one monitoring batch has been logged. The alarm takes the monitoring main table and the monitoring batch table as input, and outputs done or undone files and alarms, which are recorded in the alarm log.
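A hedged sketch of the alarm decision, assuming one monitoring batch record and one alarm configuration row per table. The classes below mirror only a subset of the columns in `table_monitor_records` and `table_monitor_notify_conf`, use Python types instead of the string columns in the tables, and the names `evaluate` and `decide` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MonitorRecord:
    """Subset of table_monitor_records (types simplified for the sketch)."""
    run_db_table: str
    run_check_partition: bool
    run_check_partition_count: int
    run_check_partition_count_fluctuates: Optional[float]


@dataclass
class NotifyConf:
    """Subset of table_monitor_notify_conf (types simplified for the sketch)."""
    notify_enable: bool
    check_count_threshold: int
    check_count_fluctuates_threshold: float


def evaluate(record: MonitorRecord, conf: NotifyConf) -> List[str]:
    """Return the list of problems; an empty list means the table is done."""
    if not record.run_check_partition:
        # Without a partition, no other rule can be checked.
        return ["partition not generated"]
    problems = []
    if record.run_check_partition_count <= conf.check_count_threshold:
        problems.append("partition row count not above threshold")
    fluct = record.run_check_partition_count_fluctuates
    if fluct is not None and abs(fluct) > conf.check_count_fluctuates_threshold:
        problems.append("partition row count fluctuation above threshold")
    return problems


def decide(record: MonitorRecord, conf: NotifyConf) -> Tuple[str, List[str]]:
    """Map a monitoring record to ('done' | 'undone', alarms to send)."""
    problems = evaluate(record, conf)
    status = "done" if not problems else "undone"
    alarms = problems if conf.notify_enable else []
    return status, alarms
```

Keeping `evaluate` separate from `decide` reflects the separation described in the design summary: the rules decide whether the table is done, while `notify_enable` only controls whether an alarm is sent.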
3. Overall design
Monitoring tasks are scheduled as routine tasks on the platform. Configuration is read from MySQL in the development environment, and result data is stored in gp. The platform's synchronization function copies the Hive result table to an identically structured table in gp for report display. The whole process supports backtracking.
Table data monitoring configuration table:
-- Main table
create table table_monitor_conf (
  db_table string,
  table_charge_people string comment 'table owner',
  done_path string comment 'done file output location prefix',
  where_condition string comment 'where clause content, e.g. version=20201201 and dt=#YYYYMMdd#',
  if_done string comment 'master switch: whether to generate the done file',
  if_check_partition string comment 'rule 1: whether to monitor partition generation',
  if_check_partition_count string comment 'rule 2: whether to monitor the partition data volume',
  if_check_partition_count_fluctuates string comment 'rule 3: whether to monitor the partition data volume fluctuation',
  if_check_distribute string comment 'rule 4: whether to monitor the table data distribution fluctuation'
);

-- Distribution sub-table, used when if_check_distribute is 1
create table table_monitor_distribute_conf (
  db_table string comment 'table name',
  with_code_value_keys string comment 'keys with code values: k1,k2,k3',
  no_code_value_keys string comment 'keys without code values: k1,k2,k3'
);
The record in table_monitor_conf with db_table = 'default.default' holds the default values for all configuration records.
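One way the default record could be applied is sketched below, assuming configuration rows have been loaded into dictionaries keyed by `db_table` and that a missing value is represented as `None`. The function name `resolve_conf` and the table name `dw.orders` are hypothetical.

```python
from typing import Dict, Optional

DEFAULT_KEY = "default.default"

ConfRow = Dict[str, Optional[str]]


def resolve_conf(rows: Dict[str, ConfRow], db_table: str) -> ConfRow:
    """Overlay a table's own settings on the default.default row.

    Assumption: a value of None means 'not configured', so the default
    row's value is kept for that column.
    """
    resolved: ConfRow = dict(rows.get(DEFAULT_KEY, {}))
    for column, value in rows.get(db_table, {}).items():
        if value is not None:
            resolved[column] = value
    resolved["db_table"] = db_table
    return resolved


rows = {
    "default.default": {"if_done": "1", "if_check_distribute": "0"},
    "dw.orders": {"if_done": None, "if_check_distribute": "1"},
}
orders_conf = resolve_conf(rows, "dw.orders")
```

In this example `dw.orders` inherits `if_done` from the default record and overrides `if_check_distribute`.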
Table data volume monitoring run batch record:
create table table_monitor_records (
  run_db_table string comment 'table checked in this batch, from db_table in table_monitor_conf',
  check_date_time string comment 'actual task run time, generated by the program',
  run_check_partition string comment 'rule 1 output: whether the partition selected by where_condition was generated',
  run_check_partition_count bigint comment 'rule 2 output: row count selected by where_condition',
  run_check_partition_count_fluctuates string comment 'rule 3 output: row count fluctuation relative to the average of the previous week',
  -- The two columns below have no type in the original; string is assumed.
  run_check_distribute_json string comment 'rule 4 output: data distribution as one large JSON',
  run_check_distribute_fluctuates string comment 'rule 4 output: data distribution fluctuation relative to the weekly average, as one large JSON'
)
comment 'monitoring run batch record table'
partitioned by (dt string comment 'data run batch partition, passed in by the platform');
Alarm configuration table
create table table_monitor_notify_conf (
  db_table string comment 'database table',
  notify_enable string comment 'whether this alarm is turned on',
  normal_produce_datetime string comment 'normal generation time of the table data',
  check_count_threshold bigint comment 'threshold for the monitored partition data volume',
  check_count_fluctuates_threshold double comment 'threshold for the monitored partition data volume fluctuation',
  check_distribute__json_threshold double comment 'table data distribution threshold'
);
Alarm log table
create table table_monitor_notify_records (
  db_table string comment 'table that has a problem',
  view_url string comment 'page display address',
  table_charge_people string comment 'table owner',
  trouble_description string comment 'description of the problem',
  check_date_time string comment 'alarm time, generated by the program'
)
partitioned by (dt string comment 'data run batch partition, passed in by the platform');
Output data: done files and undone files, only one per table per partition.
Data distribution, in the distribution of fluctuations in the first run of data, will write a
There are two tasks in total, the monitoring task and the alarm task. Both run every day from 1:00 until 8:00 p.m., once every 10 minutes.
Other:
hi_email_message_phone string comment 'alarm mode, reserved field',
zhiban_people string comment 'duty supervisor, reserved field'
TODO:
[ ] add more people on duty and upgrade the alarm mode
[ ] raise alarms according to task dependencies
[ ] let a robot in the chat group operate on the alarm log table, so that alarms can be suspended for a period of time