How much do you know about the database used internally by Xiaomi?
Background

MySQL has become the most widely used database inside Xiaomi because of its simplicity, efficiency, and reliability. However, once tables reach tens or hundreds of millions of rows, MySQL operations become very slow, and any requirement for real-time BI display becomes a disaster for MySQL.
To solve the business pain points of SQL queries that are slow and hard to trace, we explored a complete solution combining real-time synchronization with ad hoc query. This article introduces the related work from the perspective of real-time synchronization.

Early on, the business used Sqoop to synchronize the data in MySQL to Hive for analysis, but this brought several problems: although Sqoop supports incremental synchronization, it is still coarse-grained offline synchronization and cannot meet real-time requirements; every Sqoop run sends SQL requests to MySQL, which puts a certain amount of extra pressure on MySQL; and Hive's support for data updates is relatively weak. To connect the front-end business data systems (MySQL) with the back-end statistical analysis systems (query and analysis engines) more effectively, we needed a solution for real-time synchronization of MySQL data.

How does Xiaomi achieve real-time data synchronization internally? We turned to the binlog used in MySQL master-slave replication. The binlog records all DDL and DML statements (except data query statements such as SELECT and SHOW) in the form of events, and also records the time each statement took to execute. MySQL master-slave replication works roughly as follows:
1. Before each transaction commits, the master records the data change in the binary log (binlog).
2. The slave connects to the master and requests the binlog starting from a specified position.
3. The master creates a dump thread that pushes the binlog to the slave.
4. The slave starts an I/O thread that reads the binlog events from the master and writes them into its own relay log.
5. The slave also starts a SQL thread that reads events from the relay log and replays them on the slave, completing data synchronization; the slave then records its own binlog.
The binlog records MySQL data changes in real time and is the basis of data synchronization. What the service needs to do is follow the MySQL replication protocol, disguise itself as a MySQL slave, subscribe to the business slave database, and complete real-time data synchronization. Combining this with the characteristics of Xiaomi's internal systems, we built the MySQL data synchronization service LCSBinlog, which is integrated into Talos Platform as an independent data access method. Talos Platform, as the basic big data integration solution, uses the self-developed message queue Talos as the data bus with the main goal of connecting the various systems; it provides rich data Sources for input and data Sinks for output, and Talos naturally supports stream computing. Services can therefore take full advantage of the interconnection that Talos Platform provides and, combined with their own business needs, implement higher-level business scenarios.
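To illustrate what "disguising itself as a slave" looks like in practice, here is a minimal sketch, not the actual LCSBinlog implementation, using the open-source mysql-binlog-connector-java library; the host, port, and credentials are placeholders.

```java
import com.github.shyiko.mysql.binlog.BinaryLogClient;
import com.github.shyiko.mysql.binlog.event.DeleteRowsEventData;
import com.github.shyiko.mysql.binlog.event.UpdateRowsEventData;
import com.github.shyiko.mysql.binlog.event.WriteRowsEventData;

public class BinlogListenerSketch {
    public static void main(String[] args) throws Exception {
        // Connect to the business slave database as if this process were another replica.
        BinaryLogClient client =
                new BinaryLogClient("mysql-slave-host", 3306, "repl_user", "repl_password");
        client.registerEventListener(event -> {
            Object data = event.getData();
            // Row events carry the actual data changes.
            if (data instanceof WriteRowsEventData) {
                System.out.println("INSERT: " + ((WriteRowsEventData) data).getRows());
            } else if (data instanceof UpdateRowsEventData) {
                System.out.println("UPDATE: " + ((UpdateRowsEventData) data).getRows());
            } else if (data instanceof DeleteRowsEventData) {
                System.out.println("DELETE: " + ((DeleteRowsEventData) data).getRows());
            }
        });
        client.connect(); // blocks and keeps streaming binlog events
    }
}
```

In a real service each row event would be serialized and written to the corresponding Talos topic instead of being printed.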
The figure above shows the overall process architecture of Talos Platform; the part marked in red is currently the most widely used LCSBinlog link at Xiaomi: Mysql --> Talos --> Kudu --> BI. After the data is synchronized to Kudu, a SparkSQL query engine provides ad hoc query services for the upper BI system. Details of the integration of Kudu and SparkSQL can be found in the previous article on the evolution of Xiaomi's OLAP service, "Bid farewell to 'one disturbance after another'".

The LCSBinlog service has two roles:
- Master: mainly responsible for job scheduling.
- Worker: mainly completes the concrete data synchronization tasks.

Two kinds of jobs run on a Worker:
- BinlogSyncJob: one such job per MySQL database; it writes the binlog in its entirety into a Talos topic created by the service.
- MysqlSyncJob: synchronizes historical data, consumes the binlog data, filters out the data of specific tables, and synchronizes it in real time to the topic configured by the user.

The service as a whole relies on Zookeeper to synchronize service state and to record job scheduling information and job running status, while the synchronization progress of each job is recorded in a Kudu table (a rough sketch of the Zookeeper usage follows this list). The scheduling flow is as follows:
- Worker nodes register themselves on Zookeeper as schedulable.
- Masters compete for an EPHEMERAL node on Zookeeper to achieve HA.
- Users register BinlogSource synchronization tasks on the converged-cloud web console.
- The Master periodically reads the Binlog synchronization job configuration from the configuration service.
- The Master updates the scheduling information in Zookeeper.
- Worker nodes start newly assigned tasks and stop tasks whose configuration has been removed, based on the scheduling information in Zookeeper.
- After a job starts, it performs real-time data synchronization and periodically records its progress in Kudu.
- The service reports monitoring information to the Falcon platform and sends an alarm email when a job exits abnormally.
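The Zookeeper usage described above can be sketched roughly as follows; the connection string, paths, and node names are illustrative assumptions, not the service's real layout (the parent paths are assumed to exist).

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class SchedulerRegistrationSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk-host:2181", 30_000, watchedEvent -> { /* session events */ });

        // A Worker registers an EPHEMERAL node so the Master can see it as schedulable;
        // the node disappears automatically if the Worker's session dies.
        zk.create("/lcsbinlog/workers/worker-1", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Masters compete for one EPHEMERAL node; the winner becomes the active Master (HA).
        try {
            zk.create("/lcsbinlog/master", "master-1".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            // ... act as the active Master: read job configs, write scheduling info ...
        } catch (KeeperException.NodeExistsException e) {
            // Another Master is already active; stay in standby and watch the node.
        }
    }
}
```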
How do we ensure the correctness of the data?

Ordering
Each BinlogSource configured by a user is bound to a Talos topic, and when consuming, the operations on the same MySQL record must be kept in order. The Talos message queue cannot guarantee global message order, only order within a partition. For BinlogSource tasks configured with sharded databases and tables or multi-database synchronization, the service hashes each change by its database and table information and writes it to the corresponding partition, so that all data of the same table lands in the same partition and the downstream consumers see it in order. For single-table synchronization jobs, a single partition is currently used to keep the data ordered.
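A minimal sketch of this partition-routing idea (the actual hash function used by the service may differ): changes are keyed by database and table name, so all changes of the same table land in the same Talos partition and keep their order.

```java
public class TablePartitionerSketch {
    /** Route a binlog row change to a partition by its database and table name. */
    static int partitionFor(String database, String table, int numPartitions) {
        int hash = (database + "." + table).hashCode();
        // Math.floorMod avoids negative results for negative hash codes.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        // Sharded tables could be normalized to one logical name first, so every
        // shard of the same logical table shares a partition.
        System.out.println(partitionFor("shop", "order", 8));
        System.out.println(partitionFor("shop", "user", 8));
    }
}
```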
Consistency
How do we ensure that, after a job exits abnormally and restarts, the MySQL data is still completely synchronized to the downstream systems? This mainly relies on the following three points: the service records the synchronization offset of each job and, after a restart, resumes consuming binlog data from the last committed offset; even if some data is consumed again (the part after the last commit), the downstream storage systems Kudu and ES apply the operations on the same record in the same order, so the result is unchanged; and Redis operations keyed by the primary key guarantee eventual consistency of the data after the binlog is replayed.
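The offset bookkeeping can be sketched roughly as follows, again using the open-source mysql-binlog-connector-java library for illustration; the checkpoint interface here is hypothetical, while the real service records its progress in a Kudu table.

```java
import com.github.shyiko.mysql.binlog.BinaryLogClient;

public class OffsetResumeSketch {
    // Hypothetical checkpoint store; the real service keeps progress in Kudu.
    interface CheckpointStore {
        String lastBinlogFile();
        long lastBinlogPosition();
        void commit(String binlogFile, long position);
    }

    static void startJob(CheckpointStore checkpoints) throws Exception {
        BinaryLogClient client =
                new BinaryLogClient("mysql-slave-host", 3306, "repl_user", "repl_password");

        // Resume from the last committed offset; events after it may be seen twice,
        // which is acceptable because downstream writes are keyed by primary key.
        client.setBinlogFilename(checkpoints.lastBinlogFile());
        client.setBinlogPosition(checkpoints.lastBinlogPosition());

        client.registerEventListener(event -> {
            // ... write the event to the downstream topic/storage ...
            // After a batch is durably written, commit the new position.
            checkpoints.commit(client.getBinlogFilename(), client.getBinlogPosition());
        });
        client.connect();
    }
}
```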
What can we do with this data? The following are several common application scenarios.

Update the cache in real time
Business query services often place a cache on top of MySQL to reduce access to the underlying database. When the data in MySQL changes but the cache has not yet expired, readers get stale data, while the business expects the cache to be updated in real time. Using the binlog service, data changes are synchronized to Redis in real time according to a configured policy, which keeps the cached data valid, reduces calls to the database, and improves overall performance.
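One possible refresh policy can be sketched as follows using the Jedis client; the key scheme and the choice between deleting and overwriting entries are illustrative assumptions, not the exact policy used internally.

```java
import redis.clients.jedis.Jedis;

public class CacheRefreshSketch {
    private final Jedis jedis = new Jedis("redis-host", 6379);

    /** Called for every row change consumed from the binlog topic. */
    void onRowChange(String table, String primaryKey, String newValueJson, boolean deleted) {
        String cacheKey = "cache:" + table + ":" + primaryKey; // illustrative key scheme
        if (deleted) {
            jedis.del(cacheKey);                // drop the entry for deleted rows
        } else {
            jedis.set(cacheKey, newValueJson);  // overwrite with the latest row image
        }
    }
}
```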
Asynchronous processing, system decoupling
As the business develops, the same data may serve different analytical purposes: after data is successfully written to MySQL, it also needs to be synchronized to other systems. If this is done synchronously, the transaction becomes longer on the one hand, and on the other hand the systems start to affect each other. A change is recorded in the binlog only after the operation has succeeded in MySQL, which guarantees consistency for downstream processing. Using the binlog service to deliver the data therefore helps with asynchronous processing and system decoupling. System decoupling is discussed in more depth in the article "Thinking about the value of message queues".
BI system for ad hoc queries
As mentioned at the beginning of the article, MySQL hits performance bottlenecks in certain scenarios. After MySQL data is synchronized to Kudu, query performance can be improved with the help of SparkSQL, and because it is still a SQL interface, the switching cost for users is low. Synchronizing the data to storage better suited for the queries also avoids the impact that large queries would have on other queries in the original MySQL database. At present, Xiaomi runs more than 3000 synchronization jobs stably. XDATA, Xiaomi's internal star BI product, uses the binlog service to synchronize data to Kudu and, backed by this complete synchronization pipeline, supports operational and SQL analysis well.

How to use the Binlog data

To get their data into the pipeline, users need to enable the binlog on the MySQL database, and the binlog format must be Row mode: it records the value of every field of every row before and after each change. Although this increases the binlog volume, it guarantees the accuracy of every record and avoids inconsistencies in data synchronization (a minimal sketch of the corresponding MySQL settings follows below). Finally, by listening to the binlog, the LCSBinlog service converts the data into the following data structure and writes it to the user's registered topic. At present the Sink service uses Spark Streaming to dump the data into Kudu in real time, and it will gradually migrate to Flink to improve resource utilization and reduce latency. Users can also consume the Talos data in real time according to the data format we provide and implement more complex business logic. The following table shows, for each kind of data operation, whether the column values before and after the change are saved.
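As a reference for the Row-mode requirement mentioned above, this is a minimal sketch of the MySQL server settings involved; the values are illustrative.

```ini
# Minimal sketch of the source-side MySQL settings (my.cnf); values are illustrative.
[mysqld]
server-id      = 1           # any unique id within the replication group
log-bin        = mysql-bin   # enable the binary log
binlog_format  = ROW         # record the before/after image of every changed row
```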
Two interesting problems encountered after launch are shared below.

Data inconsistency: the business uses a unique index
After the service had been running for a while, we found that the Kudu table occasionally contained more rows than the MySQL table it was synchronized from. Comparing the extra rows with the binlog generated by MySQL, we found that the user had defined a unique index on the MySQL table and modified the primary key through that unique index, while Kudu identifies and updates a record by its primary key, so the update operation turned into an insert and a single original record became two. Solution: for tables of this type, the LCSBinlog service converts an Update operation into a Delete followed by an Insert.
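The conversion can be sketched as follows; the types and method names are hypothetical and only illustrate the idea of rewriting an Update as a Delete of the before image plus an Insert of the after image.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class UpdateSplitSketch {
    // Hypothetical representation of one downstream operation.
    record Operation(String type, Map<String, Object> row) {}

    /**
     * For tables where a unique index allows the primary key to be modified,
     * rewrite every Update as Delete(before image) + Insert(after image),
     * so the old row cannot survive as a stale duplicate in Kudu.
     */
    static List<Operation> splitUpdate(Map<String, Object> before, Map<String, Object> after) {
        return Arrays.asList(new Operation("DELETE", before), new Operation("INSERT", after));
    }
}
```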
Client timeouts when a full dump synchronizes historical data
When the service went live, synchronization of all historical data was done by executing SQL through JDBC. During synchronization, the dump task would stall for a long time before any result came back, and when the data volume was too large it timed out and the synchronization failed, causing data delays. Investigation showed that when the official MySQL JDBC driver queries data, by default the client fetches the entire result set from the server into client memory at once and the fetch size parameter does not take effect, so a SQL statement returning a large amount of data can cause an OOM. Solution: with the following MySQL driver settings, the client reads the result in streaming mode, receiving only part of the data from the server at a time until all of it has been processed. After this optimization, historical data synchronization runs stably and the pressure on the MySQL side is also very small.
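The streaming read can be sketched as follows with MySQL Connector/J; the connection URL, credentials, and table are placeholders. Creating a forward-only, read-only statement and setting the fetch size to Integer.MIN_VALUE makes the driver stream rows from the server instead of buffering the whole result set in client memory.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingDumpSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://mysql-slave-host:3306/shop", "dump_user", "dump_password")) {

            // TYPE_FORWARD_ONLY + CONCUR_READ_ONLY + fetchSize = Integer.MIN_VALUE
            // tells Connector/J to stream rows one by one instead of loading them all.
            Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(Integer.MIN_VALUE);

            try (ResultSet rs = stmt.executeQuery("SELECT * FROM orders")) {
                while (rs.next()) {
                    // ... convert the row and write it to the downstream topic/storage ...
                }
            }
        }
    }
}
```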
Summary
MySQL records data changes in the form of the binlog, and the LCSBinlog service is implemented as a streaming Change Data Capture (CDC) mechanism on top of it.
This article introduced the service architecture, application scenarios, and practical experience of LCSBinlog at Xiaomi, and shared the problems we ran into in practice and how we solved them, in the hope of helping you understand how the service works and providing some inspiration. You are also welcome to get in touch and exchange ideas with us.