2025-01-19 Update From: SLTechnology News&Howtos > Database
Shulou (Shulou.com) 06/01 Report
Introduction to the author
Pang Kuo is the manager of the operations and maintenance infrastructure department at Youpengpule Media, responsible for database operations management, platform design and development, monitoring design and improvement, problem tracking, and machine-room network maintenance and management. He has filed four patent applications with the Patent Office, and specializes in database operations management and scripting in Shell, Perl, and PHP.
Database failures have been a frequent topic lately. Companies large and small set high standards for data backup, but verifying that those backups are actually usable is even more pressing. Many companies have automated their backup-and-restore process, yet verification of backup validity after the restore is often incomplete, and I have not found a mature verification mechanism published online (perhaps I simply haven't come across one).
The choice of method and samples for checking backup validity directly determines the quality of the backup-verification metric. This article shares a design of mine that samples live online SELECT statements, including complex JOIN queries, and combines them with the existing database and table inventory to improve the accuracy of backup verification.
I worked this scheme out while applying for a database-related patent: while looking for a good way to measure data quality after checking a backup and restore, the idea happened to combine naturally with the restore process itself. As the number of database instances keeps growing, the need for this kind of validity check will only become stronger.
Below I briefly introduce my verification design; perhaps it can give you an idea or two, and of course I also hope to learn about other good approaches. (Note: some information has been desensitized.)
System processing flow
The processing flow is as follows (flow chart not reproduced here):
Per the flow chart, the process breaks down into roughly five steps, completed by six scripts. No individual step is difficult, and in practice each can be refined to fit your own business. Below I briefly introduce the main functions in this flow.
Function introduction
Automatic backup function
(The schedule is configurable; in my setup it is a scheduled task, integrated with our platform.)
Automatic restore function
Automatically download the backup and restore it.
SQL and automatic reporting function of database table
1) Reporting the local database's database and table information is mainly used to verify, after the restore, that databases and tables correspond one to one. If they all correspond, the restore is considered normal; otherwise it is abnormal and an alarm is raised.
2) Reporting SQL: to ensure the SQL is authentic, this method monitors general_log, extracts SELECT-type statements after analysis, executes each statement, and reports the SQL together with its checked result value to the data center as a sample.
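The sampling step described above can be sketched as a small script: pull a few SELECT statements out of the general log, run each one, and record a checksum of its result set for later comparison. This is a hypothetical sketch, not the author's db_sql_report program; the log path, credentials, and the idea of using an MD5 of the result set are assumptions.

```shell
# Hypothetical sketch of the general_log sampling step.
# GENERAL_LOG and the mysql credentials below are placeholders.
GENERAL_LOG=/var/lib/mysql/general.log
MYSQL="mysql -h 172.16.20.5 -u report_user -preport_pass -N"

# Take up to 5 distinct SELECTs from the log, skipping system schemas.
grep -i 'Query.*select' "$GENERAL_LOG" \
  | sed 's/.*Query[[:space:]]*//' \
  | grep -iv 'information_schema\|performance_schema\|mysql\.' \
  | sort -u | head -5 \
  | while read -r sql; do
      # Execute the sampled SELECT and checksum its result set.
      checksum=$($MYSQL -e "$sql" | md5sum | awk '{print $1}')
      # Reporting the pair to the data center is omitted here;
      # the author's db_sql_report script performs that step.
      echo "$checksum  $sql"
    done
```

The checksum, rather than the raw result set, is what gets stored, so even large JOIN results cost only a few bytes in the data center.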
Automatic comparison function of database table and SQL after restore
1) After the restore, the data center is automatically queried for the recorded database and table information, which is compared entry by entry.
2) The sample SQL is fetched and executed, and the original and restored result values are checked against each other. If they match, the restore is normal; otherwise it is abnormal.
Note: in the demonstration below some steps are performed manually; they can be automated according to your company's situation.
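The table comparison in 1) can be sketched as follows: list the schema.table pairs on the restored instance and diff them against the list recorded at backup time. Host, credentials, and file paths are placeholders, not the author's db_table_diff implementation.

```shell
# Hypothetical sketch of the post-restore table comparison.
# Credentials and file paths are placeholders.
mysql -h 172.16.20.7 -u check_user -pcheck_pass -N -e \
  "SELECT CONCAT(table_schema, '.', table_name)
   FROM information_schema.tables
   WHERE table_schema NOT IN
     ('mysql','information_schema','performance_schema','sys')" \
  | sort > /tmp/tables_restored.txt

# /tmp/tables_backup.txt holds the list reported before backup.
# Lines only in the backup list are tables lost during restore -> alarm.
if [ -f /tmp/tables_backup.txt ]; then
    comm -23 /tmp/tables_backup.txt /tmp/tables_restored.txt
fi
```

`comm -23` on the two sorted lists prints exactly the tables that were reported at backup time but are missing after the restore.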
Environment introduction
Database machine: 172.16.20.5
Backup machine: 172.16.20.6
Restore machine: 172.16.20.7
Backup tool: mydumper
Programming language: Shell+Perl
Backup transfer tool: rsync
Deployment
1. Backup machine rsync deployment
I tried several approaches for transferring backups to the data center before settling on this one; a brief summary:
NFS: remote network mount from another device. Only the NFS service needs to be installed, and operation is easy. The problem is that when the NFS service fails or the network drops, any process using the mounted disk hangs.
FTP: I also used FTP as a backup service, but logins occasionally failed, per-directory permissions were complicated to configure and hard to maintain, and uploading, downloading, and scripting were inconvenient.
Rsync: I switched to rsync mainly because configuration is simple and an upload or download takes just one command; it is also very useful for incremental transfers.
The important parts are as follows:
[back5]
path = /opt/mysql_bak/172.16.20.5
comment = www file
ignore errors
read only = false
list = false
uid = root
gid = root
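The [back5] module above belongs in /etc/rsyncd.conf together with a small global section. A sketch follows; the pid/log paths are illustrative assumptions, and the file is written to /tmp here so the example is safe to run as-is:

```shell
# Minimal global section to pair with the [back5] module (paths are
# assumptions); written to /tmp for safety, normally /etc/rsyncd.conf.
conf=/tmp/rsyncd.conf.example
cat > "$conf" <<'EOF'
pid file = /var/run/rsyncd.pid
log file = /var/log/rsyncd.log
EOF
echo "global section written to $conf"

# On the backup machine (172.16.20.6), start the daemon with:
#   rsync --daemon --config=/etc/rsyncd.conf
```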
2. Install mydumper on database machine and restore machine
mydumper is a third-party open-source tool for multithreaded backup and recovery of MySQL databases. Its developers came mainly from MySQL, Facebook, and SkySQL, and it is currently developed and maintained by Percona as an important part of the Percona Remote DBA project. Unlike the official mysqldump and mysqlpump, it uses multiple threads for both backup and restore, making it a good choice for fast logical backup and recovery; Percona's xtrabackup, by contrast, is a physical backup tool and consumes far more space.
Please refer to the official website: https://launchpad.net/mydumper/+download for installation
3. Execute the backup script on the database
The script is as follows (not reproduced in this copy of the article):
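A hypothetical stand-in for such a backup script, since the original is not shown: dump all databases with mydumper, archive the dump, and push it to the backup machine's [back5] rsync module. The user, password, and paths are placeholders.

```shell
#!/bin/bash
# Hypothetical backup script (credentials and paths are placeholders).
DATE=$(date +%Y%m%d)
BAK_DIR="/opt/mysql_bak/$DATE"

if command -v mydumper >/dev/null 2>&1; then
    # -o output dir, -t threads, -c compress the dump files
    mydumper -u backup_user -p backup_pass -o "$BAK_DIR" -t 8 -c

    # Archive the day's dump directory.
    tar czf "/opt/mysql_bak/${DATE}.tar.gz" -C /opt/mysql_bak "$DATE"

    # Push the archive to the backup machine (172.16.20.6).
    rsync -zrtopg "/opt/mysql_bak/${DATE}.tar.gz" root@172.16.20.6::back5/
else
    echo "mydumper not installed; see the install step above"
fi
```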
4. Data center table structure design.
Create the following tables in the data center. They mainly store the database/table information and SQL information reported at backup time, and supply the sample values used during restore verification in the later steps.
Table structure for the database/table report
Table structure for the SQL report
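The article's actual DDL is not reproduced in this copy; a minimal sketch of what the two data-center tables might look like, with column names that are assumptions rather than the author's schema:

```sql
-- Hypothetical sketch of the data-center tables (column names are assumptions).

-- Database/table inventory reported at backup time
CREATE TABLE db_table_report (
    id          INT AUTO_INCREMENT PRIMARY KEY,
    host        VARCHAR(32) NOT NULL,   -- source DB machine, e.g. 172.16.20.5
    db_name     VARCHAR(64) NOT NULL,
    table_name  VARCHAR(64) NOT NULL,
    report_date DATE        NOT NULL
);

-- Sample SELECTs captured from general_log, with their result checksums
CREATE TABLE db_sql_report (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    host         VARCHAR(32) NOT NULL,
    sample_sql   TEXT        NOT NULL,
    result_check VARCHAR(64) NOT NULL,  -- checksum of the SQL's result set
    report_date  DATE        NOT NULL
);
```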
5. Report on the database machine
1) Database table reporting program address: download and modify by yourself
https://github.com/kevin6386/db_table_report/blob/master/db_table_report
Just run it.
2) SQL reporting procedure
Program address: https://github.com/kevin6386/db_sql_report/blob/master/db_sql_report
Just run it.
6. Database backup and restore
Download the backup and restore it (a brief breakdown of the steps):
Download the backup locally with rsync, and decompress it:
rsync -zrtopg --progress root@172.16.20.6::back5/<backup file name> ./
Restore command:
/usr/local/bin/myloader -u <user> -p <pass> -o -d <backup directory> -t 8
7. Check
This is the focus of the whole design: how the restored data is verified matters most, and the samples and methods used for verification directly determine how meaningful the backup-validity indicator is.
1) Check of the database tables after restore
Program address: https://github.com/kevin6386/db_table_diff/blob/master/db_table_diff
The comparison results are as follows:
Email screenshot
2) Check of the data SQL after restore
Program address: https://github.com/kevin6386/db_sql_diff
The comparison results are as follows:
Email screenshot: if the check is normal, the attachment contains the SQL; otherwise it is empty.
Abnormal screenshot
An exception can arise in several situations:
There is a time gap between the backup and the general_log extraction; if the data is modified and the sampled SQL runs before or after the backup, results differ. (Extract samples at low-peak times or from rarely modified fields.)
Some tables were restored incorrectly and data was lost. (For example, I have hit a case where a trigger made one table depend on another.)
A backup taken from the slave is compared against SQL sampled on the master. (The slave may be inconsistent with the master.)
Tables or records were lost during the backup itself. (Sometimes the backup command is wrong or a table is missed.)
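The SQL check described in 2) can be sketched as a small script: re-run each sample SELECT on the restored instance and compare its result checksum with the value recorded at backup time. The hosts, credentials, and the data-center table and column names (db_sql_report, result_check, sample_sql) are assumptions, not the author's db_sql_diff implementation.

```shell
#!/bin/bash
# Hypothetical sketch of the post-restore SQL check.
# Hosts, credentials, and table/column names are assumptions.
MYSQL_RESTORE="mysql -h 172.16.20.7 -u check_user -pcheck_pass -N"
MYSQL_CENTER="mysql -h datacenter.example -u center_user -pcenter_pass -N"

# Each line from the data center: <recorded_checksum><TAB><sample_sql>
$MYSQL_CENTER -e "SELECT result_check, sample_sql FROM db_sql_report
                  WHERE report_date = CURDATE()" \
| while IFS="$(printf '\t')" read -r recorded sql; do
    # Re-run the sample against the restored instance and checksum it.
    now=$($MYSQL_RESTORE -e "$sql" | md5sum | awk '{print $1}')
    if [ "$now" = "$recorded" ]; then
        echo "OK   $sql"
    else
        echo "DIFF $sql"        # mismatch: raise an alarm
    fi
done
```

Comparing checksums rather than full result sets keeps the check cheap even when the sampled SQL returns many rows.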
Attachment SQL information
8. Report on backup
I report the backup size and file name every day, then use SQL to compare today's backup against the records of the previous two days.
As follows:
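The comparison SQL is not reproduced in this copy; a hypothetical example, with a table name and columns (backup_report, host, file_name, file_size, report_date) and the 20% threshold all being assumptions, might flag any backup that shrank sharply versus the previous two days' average:

```sql
-- Hypothetical size comparison (table/column names and threshold are
-- assumptions): flag backups more than 20% smaller than the average
-- of the previous two days.
SELECT t.host, t.file_name, t.file_size
FROM   backup_report t
JOIN  (SELECT host, AVG(file_size) AS avg_size
       FROM   backup_report
       WHERE  report_date IN (CURDATE() - INTERVAL 1 DAY,
                              CURDATE() - INTERVAL 2 DAY)
       GROUP  BY host) h ON h.host = t.host
WHERE  t.report_date = CURDATE()
  AND  t.file_size < h.avg_size * 0.8;
```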
Summary
After designing this scheme, it took me some time to write the sub-programs, and I thank my colleagues for repeatedly helping me test the design. Problems in the backup-and-restore process improved considerably. The important point is that restored data no longer has to be checked by hand: once the scheme is in place there is little human involvement, which removes the repetitive work of manual restore and verification, pinpoints exactly which part went wrong, and removes the worry about whether database backups are usable. Of course, there are still many aspects to improve; interested readers are welcome to leave suggestions in the comments and discuss.
© 2024 shulou.com SLNews company. All rights reserved.