This article introduces methods for migrating a database when splitting it into multiple databases and tables. Many people run into exactly this dilemma in real projects, so let me walk you through how to handle these situations. I hope you read carefully and come away with something useful!
Downtime deployment method
The general idea: post a notice, take the service down for an upgrade in the middle of the night, run the data migration program, and migrate the data.
The steps are as follows:
(1) Post an announcement, e.g. "Tonight 00:00-03:00: downtime maintenance, service suspended".
(2) Write a migration program that reads the old database db-old and writes into the new databases db-new1 and db-new2 through the sharding middleware (a sketch follows this list).
(3) Verify data consistency before and after the migration; if everything checks out, switch the business over to the new databases.
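To make step (2) concrete, here is a minimal sketch of such a migration program in plain JDBC, under purely illustrative assumptions: a table test_tb(id BIGINT primary key, data VARCHAR), placeholder JDBC URLs and credentials, and an id % 2 routing rule (even ids to db-new1, odd ids to db-new2). Your real routing must match whatever rule your sharding middleware is configured with.

import java.sql.*;

// Minimal downtime-migration sketch -- illustrative only.
// The service is already stopped, so a single full scan of db-old is safe.
public class DowntimeMigration {
    public static void main(String[] args) throws SQLException {
        try (Connection old = DriverManager.getConnection("jdbc:mysql://db-old/test", "user", "pass");
             Connection new1 = DriverManager.getConnection("jdbc:mysql://db-new1/test", "user", "pass");
             Connection new2 = DriverManager.getConnection("jdbc:mysql://db-new2/test", "user", "pass");
             Statement read = old.createStatement();
             PreparedStatement w1 = new1.prepareStatement("INSERT INTO test_tb(id, data) VALUES (?, ?)");
             PreparedStatement w2 = new2.prepareStatement("INSERT INTO test_tb(id, data) VALUES (?, ?)");
             ResultSet rs = read.executeQuery("SELECT id, data FROM test_tb")) {
            while (rs.next()) {
                long id = rs.getLong("id");
                // Route each row with the same rule the middleware will use.
                PreparedStatement w = (id % 2 == 0) ? w1 : w2;
                w.setLong(1, id);
                w.setString(2, rs.getString("data"));
                w.executeUpdate();
            }
        }
    }
}

In practice you would page through the table and batch the inserts; the sketch only shows the shape of the job.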
By the way, a quick bit of background on this middleware. Two kinds of sharding middleware are popular right now. One is the proxy form, such as MyCAT, which requires deploying an extra server. The other is the client form, such as Dangdang's Sharding-JDBC, which is just a jar package and very easy to use. I personally prefer Sharding-JDBC: there is nothing extra to deploy, no other dependencies, and the DBA's existing operations workflow does not change.
Evaluation:
Don't look down on this method as low-tech; I have always considered it very reliable. Frankly, the company most readers work for is not some top-tier Internet giant. If your product has more than 1,000 active users at 1 a.m., raise your hand! Not everyone works at an e-commerce company, and most products see very little traffic in the middle of the night. So this plan is not without merit.
But this plan has one drawback: it is tiring! Tiring for the body, and tiring for the mind. Picture it: the maintenance window is supposed to end at six o'clock; you finish migrating the database at five, and then, for some reason, the cut-over to the new databases goes wrong. With daybreak approaching, you hurriedly switch back to the old database. Having to do it all over again the next night is physically and mentally exhausting.
Ps: a small interview tip here. If you have never actually done a sharding migration but still want to negotiate a higher salary, I suggest you give this plan as your answer. Precisely because it is so simple, there is nothing for the interviewer to dig into, which makes it the more reliable answer.
In addition, suppose the interviewer's question is:
"How did you split your databases and tables?"
This question is very open-ended. To answer it, I recommend proactively describing both your sharding strategy and how you deployed it; that answer comes across as more rigorous.
However, many interviewers like to show off their skill by asking instead:
"What sharding strategies are there? Which one did you use?"
OK. This question points at one specific aspect of sharding, so do not volunteer how you deployed it. Wait for the interviewer to ask. And if they never ask, remember that at the end of the interview you will get a chance to ask a few questions of your own. Just ask this:
"You just mentioned sharding. When we deployed it, we stopped the service first, migrated the data in the middle of the night, and then cut traffic over to the new database the next day. That approach was exhausting. I wonder, does your company have a better solution?"
The interviewer will respond in one of two ways: either they improvise something on the spot, or they have genuinely done it and answer truthfully. Remember, it does not matter how the interviewer answers. The point is that asking this question leaves exactly the impression you want: "this candidate has really done sharding."
Worried that once you are hired you might actually be handed a sharding migration? Don't be. I would bet you will not get that job during your probation period. Doing a sharding migration requires deep familiarity with the business, and as a new hire on probation you cannot possibly have it yet. If a lead hands that kind of job to a probationary employee, all I can say is that they are remarkably bold.
OK, I'll stop here. An interview is a battle of wits and courage; we have strayed far enough, so back to our topic.
Double-write deployment method (1)
This is a no-downtime deployment method. Here I need to introduce two concepts: historical data and incremental data.
Suppose the table we want to split is called test_tb. Because we are going to double write, every piece of business code in the system that touches test_tb must have double-write code added, writing to both the old database and the new databases, and then the change is deployed.
Historical data: the data already present in the test_tb table before this deployment.
Incremental data: the data newly generated in the test_tb table after this deployment.
The migration process is then as follows:
(1) First compute max(primary key) of the table to be migrated. During the migration, only rows of test_tb in db-old whose primary key is no greater than this max value are moved; that is the so-called historical data. One special case: if your table uses UUIDs, there is no meaningful max(primary key), so use the creation time as the boundary between historical and incremental data instead. And if your table uses UUIDs and has no creation-time column either, I trust the resourceful reader will find some other way to tell historical data from incremental data.
(2) In the code, for every service that touches test_tb, add one extra piece of logic: send a message to a message queue carrying the SQL of the operation (a sketch of this hook follows the list). How to assemble the message body is up to you. Note carefully: only the SQL of write requests, that is inserts, updates and deletes, is sent. Only write-request SQL is sent. Only write-request SQL is sent. Important things must be said three times!
There are two reasons:
(1) Only the SQL of write requests is useful for recovering the data; reads change nothing.
(2) In most systems the bulk of the traffic is read requests, and writes are comparatively rare. Note that at this stage we do not yet consume the messages in the queue; we only publish write requests, so the backlog in the queue will not grow too badly.
(3) Bring the system online. In parallel, write a migration program that copies the rows of test_tb in db-old whose primary key is no greater than max(primary key), i.e. the historical data (see the second sketch after this list).
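Here is a minimal, hedged sketch of the step (2) hook. The MessageQueue interface, the topic name, and the DAO shape are all hypothetical stand-ins for whatever MQ client and data-access layer you actually use; the point is only that write paths publish their SQL while read paths stay untouched.

// Hypothetical double-write hook for step (2) -- a sketch, not the real thing.
// MessageQueue stands in for your actual MQ client (RocketMQ, Kafka, ...).
public class TestTbDao {

    interface MessageQueue { void send(String topic, String body); }

    private final javax.sql.DataSource dbOld;
    private final MessageQueue mq;

    public TestTbDao(javax.sql.DataSource dbOld, MessageQueue mq) {
        this.dbOld = dbOld;
        this.mq = mq;
    }

    // A write path: executes against db-old as before, then publishes the SQL.
    public void updateData(long id, String data) throws java.sql.SQLException {
        try (java.sql.Connection c = dbOld.getConnection();
             java.sql.PreparedStatement ps =
                     c.prepareStatement("UPDATE test_tb SET data = ? WHERE id = ?")) {
            ps.setString(1, data);
            ps.setLong(2, id);
            ps.executeUpdate();
        }
        // Publish ONLY write SQL. A real message body would carry the statement
        // and its parameters separately; they are naively inlined here for brevity.
        mq.send("test_tb_writes",
                "UPDATE test_tb SET data = '" + data + "' WHERE id = " + id);
    }

    // Read paths (SELECTs) are deliberately left untouched: nothing is published.
}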
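And a hedged sketch of steps (1) and (3): compute max(primary key) once to freeze the boundary, then copy the historical rows in batches while the system stays online. The table layout, connection strings, and id % 2 routing are the same illustrative assumptions as in the earlier sketch.

import java.sql.*;

// Historical-data migration for steps (1) and (3) -- illustrative only.
public class HistoricalMigration {
    public static void main(String[] args) throws SQLException {
        try (Connection old = DriverManager.getConnection("jdbc:mysql://db-old/test", "user", "pass");
             Connection new1 = DriverManager.getConnection("jdbc:mysql://db-new1/test", "user", "pass");
             Connection new2 = DriverManager.getConnection("jdbc:mysql://db-new2/test", "user", "pass")) {

            // Step (1): freeze the historical/incremental boundary.
            long maxId;
            try (Statement s = old.createStatement();
                 ResultSet rs = s.executeQuery("SELECT MAX(id) FROM test_tb")) {
                rs.next();
                maxId = rs.getLong(1);
            }

            // Step (3): page through historical rows (id <= maxId) in batches of 1000.
            String page = "SELECT id, data FROM test_tb WHERE id > ? AND id <= ? ORDER BY id LIMIT 1000";
            long cursor = 0;
            boolean more = true;
            try (PreparedStatement ps = old.prepareStatement(page)) {
                while (more) {
                    ps.setLong(1, cursor);
                    ps.setLong(2, maxId);
                    more = false;
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            more = true;
                            cursor = rs.getLong("id");
                            // Same id % 2 routing rule the middleware uses.
                            Connection target = (cursor % 2 == 0) ? new1 : new2;
                            try (PreparedStatement ins = target.prepareStatement(
                                    "INSERT INTO test_tb(id, data) VALUES (?, ?)")) {
                                ins.setLong(1, cursor);
                                ins.setString(2, rs.getString("data"));
                                ins.executeUpdate();
                            }
                        }
                    }
                }
            }
        }
    }
}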
Here you may wonder: during steps (1) through (3), the system is still operating on the historical data. Won't that cause inconsistency?
No, it won't. Let's analyze the delete and update operations, because only those two can change historical data; anything inserted is by definition incremental data.
(1) The delete hits a historical row of test_tb in db-old after the migration program has already copied that row to the new database. The delete is still recorded in the message queue, so when the subscriber later replays it, the row is deleted from the new database as well.
(2) The delete hits a historical row of test_tb in db-old before the migration program has copied it. The row is already gone, so the migration program never moves it; the delete is still recorded in the message queue, and when the subscriber later replays it, the delete simply affects nothing. Either way, consistency is preserved.
The reasoning for update is analogous, so I won't spell it out.
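For completeness, here is a hedged sketch of that late subscriber: a consumer that replays queued write SQL against the new databases. The message shape and the routing are illustrative assumptions; the key property is the idempotence argued above, namely that replaying a delete whose row was never migrated simply affects zero rows.

import java.sql.*;

// Hedged replay-consumer sketch -- illustrative only.
// Assumes each queued message carries the affected primary key plus the write SQL.
public class WriteReplayConsumer {

    // Hypothetical message shape; a real body might be JSON with parameters.
    record WriteMessage(long id, String sql) {}

    private final Connection new1;
    private final Connection new2;

    public WriteReplayConsumer(Connection new1, Connection new2) {
        this.new1 = new1;
        this.new2 = new2;
    }

    // Invoked by the MQ client for each queued write, in order.
    public void onMessage(WriteMessage msg) throws SQLException {
        // Same routing rule as the migration program.
        Connection target = (msg.id() % 2 == 0) ? new1 : new2;
        try (Statement st = target.createStatement()) {
            int affected = st.executeUpdate(msg.sql());
            // affected == 0 is fine: e.g. case (2) above, a delete whose row
            // was removed before the migration program could copy it.
        }
    }
}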
Double-write deployment method (2)
The method above has one sore point. Notice this sentence of mine:
(2) In the code, for every service that touches test_tb, add one extra piece of logic: send a message to a message queue carrying the SQL of the operation. How to assemble the message body is up to you.
Think about what that means: serious code intrusion, with non-business code embedded throughout the business code. And when the migration is over and you have to rip all of that code back out, it is especially painful.
Is there any way to avoid this problem?
Yes: subscribe to the binlog. (On the binlog itself, I will try to write a dedicated article next week, "binlog knowledge developers should master"; here I will only introduce its role.)
Binlog is MySQL's binary log. It records all changes to table structure (e.g. CREATE, ALTER TABLE, ...) and all modifications to table data (INSERT, UPDATE, DELETE, ...). It does not record operations such as SELECT and SHOW, because those modify nothing.
Recall that in the double-write deployment method, every message we sent to the queue was a write operation. The binlog records exactly those write operations, so subscribing to the binlog also meets our needs.
So the steps are as follows
(1) Enable the binlog, and bring the system online normally.
(2) Write a migration program to migrate the historical data. The steps are similar to those above, so I won't repeat them. (A sketch of a binlog subscriber follows.)
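Here is a hedged sketch of such a binlog subscriber, written against the open-source mysql-binlog-connector-java library; that library choice, the host, and the credentials are my assumptions (Alibaba's Canal is another common option). Row-change events are the binlog's equivalent of the write-SQL messages from the previous method.

import com.github.shyiko.mysql.binlog.BinaryLogClient;
import com.github.shyiko.mysql.binlog.event.DeleteRowsEventData;
import com.github.shyiko.mysql.binlog.event.EventData;
import com.github.shyiko.mysql.binlog.event.UpdateRowsEventData;
import com.github.shyiko.mysql.binlog.event.WriteRowsEventData;

// Hedged binlog-subscriber sketch -- illustrative only.
public class BinlogSubscriber {
    public static void main(String[] args) throws Exception {
        BinaryLogClient client =
                new BinaryLogClient("db-old-host", 3306, "repl_user", "repl_pass");
        client.registerEventListener(event -> {
            EventData data = event.getData();
            // Only row-change (write) events matter -- the binlog analogue of
            // the "only write-request SQL" rule in the message-queue variant.
            if (data instanceof WriteRowsEventData insert) {
                // route insert.getRows() into db-new1 / db-new2
            } else if (data instanceof UpdateRowsEventData update) {
                // replay update.getRows() against the new databases
            } else if (data instanceof DeleteRowsEventData delete) {
                // replay deletes; re-deleting is harmless, as argued earlier
            }
        });
        client.connect(); // blocks and streams binlog events
    }
}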
How to check data consistency
Only a brief introduction here, since this article is already long enough; it is enough to know that this step exists.
(1) First verify that the row counts match, because counting is fast.
(2) Then verify the actual field values. There are two common ways:
(2.1) One way is to verify that only a few key fields are consistent.
(2.2) The other way is to take, say, 50 rows at a time (50 is just an example), concatenate them into one string, and hash that string with MD5 to get a digest. Do the same on the new database and compare the two digests. If they match, move on to the next 50 rows. If they differ, binary-search within the batch: determine whether the mismatch lies in the first 25 rows or the last 25, and so on, until the inconsistent rows are pinpointed and recorded.
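As a closing illustration, here is a hedged sketch of that batch-MD5 check. It compares the even-id half of db-old against db-new1 (the odd-id/db-new2 pass is symmetric); the table layout, connection strings, and batch size are the same illustrative assumptions as before, and on a mismatch a real checker would recurse into the halves of the batch rather than just print the range.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.sql.*;
import java.util.HexFormat;

// Hedged batch-MD5 consistency check for (2.2) -- illustrative only.
public class BatchChecksum {

    // MD5 of up to `limit` rows after `fromId`; returns {digest, lastIdSeen},
    // with an empty digest when no rows remain. `shardFilter` restricts db-old
    // to the rows that belong to the shard being checked.
    static String[] batchDigest(Connection c, String shardFilter, long fromId, int limit)
            throws Exception {
        String sql = "SELECT id, data FROM test_tb WHERE id > ?" + shardFilter
                + " ORDER BY id LIMIT ?";
        try (PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setLong(1, fromId);
            ps.setInt(2, limit);
            StringBuilder sb = new StringBuilder();
            long lastId = fromId;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    lastId = rs.getLong("id");
                    sb.append(lastId).append('|').append(rs.getString("data")).append('\n');
                }
            }
            if (sb.length() == 0) return new String[]{"", String.valueOf(lastId)};
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(sb.toString().getBytes(StandardCharsets.UTF_8));
            return new String[]{HexFormat.of().formatHex(digest), String.valueOf(lastId)};
        }
    }

    public static void main(String[] args) throws Exception {
        try (Connection oldDb = DriverManager.getConnection("jdbc:mysql://db-old/test", "user", "pass");
             Connection newDb = DriverManager.getConnection("jdbc:mysql://db-new1/test", "user", "pass")) {
            long cursor = 0;
            while (true) {
                String[] a = batchDigest(oldDb, " AND MOD(id, 2) = 0", cursor, 50);
                String[] b = batchDigest(newDb, "", cursor, 50);
                if (a[0].isEmpty() && b[0].isEmpty()) break;   // both sides exhausted
                if (!a[0].equals(b[0])) {
                    System.out.println("mismatch in the 50 rows after id " + cursor);
                    // a real checker would now binary-search inside this batch
                }
                if (a[0].isEmpty()) break;                     // old side exhausted
                cursor = Long.parseLong(a[1]);                 // advance past this batch
            }
        }
    }
}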
This concludes the discussion of database migration methods. Thank you for reading!