Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to understand gh-ost

2025-03-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

How to understand ghost-ost, many novices are not very clear about this, in order to help you solve this problem, the following small series will explain in detail for everyone, people who have this need can learn, I hope you can gain something.

The online ddl tool that companies have been using internally is pt-online-schema-change, and today we're going to try another tool, ghost-ost.

GitHub,s Online Schema Transformer/Transfigurator/Transformer/Thingy stands for GitHub s Online Table Definition Converter.

The reason I'm interested in ghost is that ghost can be paused at any time. If the change process finds that the main library performance is affected, you can immediately stop pulling binlog, stop applying binlog, stop moving row data, and continue after the performance is stable; in addition, gh-ost does not rely on triggers, which is the biggest difference from pt-osc.

How it works:

Only introduced in master library mode:

1. First connect the ghost ost to the main database, and create the ghost table_tablename_ghost according to the alter statement; 2. Then connect the ghost ost to the main database as a backup database, copy the existing data from the main database to the ghost table, pull the binlog of incremental data from the main database, and then continuously apply the binlog back to the ghost table of the main database; 3. Wait for all data synchronization to complete, and perform cut-over, that is, switch between the ghost table and the original table. Cut-over is the last step, locking the source table of the main library, waiting for binlog to be applied, and then replacing the ghost table with the source table. During execution of ghost ost, hint and heartbeat packets will be added to the original binlog event to control the progress of the whole process, detect the status, etc. download and install

Download the latest binary package from github publishing address: github.com/github/gh-ost/releases

Unzip it into a ghost-ost binary file.

tar -xvf gh-ost-binary-linux-20190214020851.tar.gz cp gh-ost /usr/bin Common command combinations

Only introduced in the main library mode, the production can be directly used:

gh-ost \ --max-load=Threads_running=50 \ --critical-load=Threads_running=80 \ --critical-load-interval-millis=5000 \ --max-lag-millis=1500 \ --chunk-size=1000 \ --ok-to-drop-table \ --initially-drop-ghost-table \ --initially-drop-socket-file \ --host=172.18.6.2 \ --port=3306 \ --user="chenzhixin" \ --password="123456"\ --database="test" \ --table="user100w" \ --verbose \ --alter="add column test_field6 varchar(256) default '';" \ --cut-over-lock-timeout-seconds=1 \ --dml-batch-size=10 \ --exact-rowcount \ --serve-socket-file=/tmp/ghost.sock \ --panic-flag-file=/tmp/ghost.panic.flag \ --allow-on-master \ --execute ##########Special reminder!!!!############################ The following parameter needs to be added. If it is added, the source table and ghost table will not be automatically switched. The automatic switching will occur only after manually deleting the/tmp/ghost.postpone.flag file. Please understand the meaning of this parameter. --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag

Two intermediate state tables_tablename_ghc and_tablename_gho are generated during the operation, where_tablename_ghc is the table that records the execution of the ghost ost, and_tablename_gho is the target table, i.e., the ghost table_tablename_gho to which ddl statements are applied.

During the execution of the above ghost-ost command, you will see the following execution process table of_user100w_ghc,_user100w_ghost table,

mysql> show tables;+----------------+| Tables_in_test |+----------------+| _user100w_ghc || _user100w_gho || fruits || test_timestamp || user100w |+----------------+5 rows in set (0.00 sec)

Explanation:

--max-load=Threads_running=50 If Threads_running=50 occurs during execution of ghost ost, execution of ghost ost is suspended--critical-load=Threads_running=80 indicates that Threads_running occurs during execution. If Threads_running reaches 80, execution of ghost ost is terminated--critical-load-interval-millis When the value is 0, ghost ost exits immediately when-critical-load is reached. When the value is not 0, when-critical-load is reached, ghost-ost will check again after-critical-load-interval-millis seconds. If-critical-load is still reached, ghost-ost will exit--max-lag-millis will monitor the master-slave delay of slave library. If the delay seconds exceed this threshold, row copy will not exit. Wait for the delay seconds to fall below this threshold to continue migration. -- ok-to-drop-table go-ost Whether to delete the old table after execution. Adding this parameter will delete the old table automatically. By default, the old table is not deleted, and there will be_tablename_del table--initially-drop-ghost-table gh-ost. Before execution, two xx_ghc and xx_gho tables will be created. If these two tables exist and this parameter is added, the original gh table will be deleted automatically and newly created, otherwise exit. xx_gho table is equivalent to full backup of old table, xx_ghc table data is data change log, understood as incremental backup. -- initially-drop-socket-file gh-ost will create socket file when executed, will not delete when exiting, error will be reported when executing gh-ost next time, add this parameter will delete the old socket file, recreate. -- throttle-flag-file Pause while this file exists, delete file continues. -- verbose execution process output log--chunk-size The migration process is completed step by step in batches. This parameter refers to the number of rows submitted each time. The default is 1000. -- max-lag-millis monitors slave latency. If the latency exceeds this threshold, migration will not exit. Wait for the latency to fall below this threshold to continue migration. -- max-lag-millis monitors the slave-slave latency, and if the latency seconds exceed this threshold, the migration does not exit, and waits for the latency seconds to fall below this threshold to continue the migration--throttle-control-replica in combination with the--max-lag-millis parameter, which specifies the database instance with slave-slave latency. -- cut-over-lock-timeout-seconds The maximum lock wait time of a ghost during a cut-over phase. When the lock times out, the cut-over of the ghost will retry. (Default: 3)--allow-on-master All migration operations are performed on the master repository--dml-batch-size Batch size of DML events applied in a single transaction (range 1-100)(default 10)--exact-rowcount Accurately count the number of rows in the table (use select count(*)) to get a more accurate estimated time. -- postpone-cut-over-flag-file While the file exists, the cut-over phase of the ghost will be deferred and the data will still be copied until the file is deleted. -- throttle-flag-file string The ghost-ost operation stops as soon as the file is created. This parameter is suitable for controlling a single ghost-ost operation. - throttle-additional-flag-file string is suitable for controlling multiple ghost-ost operations. Progress Tips

In the execution of ghost ost, there will be output information, which is explained as follows:

Copy: 27000/58707 46.0%;58707 is the total number of rows that need to be migrated, 27000 is the number of rows that have been migrated, and 46% is the percentage of migration completed. Applied: 0, the number of events processed in the binary log. In the example above, the migration table has no traffic and therefore no log events are processed. Backlog: 0/1000, which means we are doing well in reading binary logs and there are no backlogs in the binary log queue. Backlog: 7/1000, When copying rows, there is a backlog of events in the binary log and needs to be applied. Backlog: 1000/1000, indicating that our buffer of 1000 events is full (1000 event buffers written by the program are dead, and 100 in the lower version). At this time, note that the binlog write volume is very large, and the ghost-ost cannot handle the event. It may be necessary to suspend the binlog reading, and the event that needs to be applied first. streamer: shvm-5-39.000040:338912890; indicates that the process currently applied to binlog file location control ghost-ost

#First install socat tool

yum install socat -y

Note: The--serve-socket-file=/tmp/ghost.sock configured in the ghost ost above is used below.

Ghost host can listen to requests through unix socket files, DBA can change the corresponding parameters after the ghost host command runs to limit the current operation, etc.

#pause

echo throttle | socat - /tmp/ghost.sock

#Recovery

echo no-throttle | socat - /tmp/ghost.sock

#Terminate

Corresponding to panic-flag-file parameter file, when tmp directory exists this file immediately stop touch /tmp/ghost.panic.flag

#Delayed switching (cut-over phase)

--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag When this parameter is set cut-over delays switching between source and ghost tables until you delete the file.

Applicable scenarios:

You initiate a modification operation and estimate the completion time to be 2 a.m., but you are very concerned about the final switch operation and want to watch it switch. What can you do? Only a flag file is needed to tell ghost to defer the switch, so ghost will only copy the data, but not switch tables. It will also continue to synchronize data, keeping ghost table data synchronized. When you get back to the office the next morning, delete the flag file or send the command echo unposttone to ghost-ost, and it will switch. We don't want software forcing us to watch it do things, it should free us up to do what humans do.

#Modify speed limit parameters online

echo max-load=Threads_connected=200 | socat - /tmp/ghost.sockecho max-load=Thread_running=3 | socat - /tmp/ghost.sockecho chunk-size=100 | socat - /tmp/ghost.sockecho max-lag-millis=200 | socat - /tmp/ghost.sock

Applicable scenarios:

After you have executed ghost ost, you may see that the load on the main library becomes higher, so you can issue a pause command. Generate a file with echo throttle to see if the load on the main library becomes normal again. Try these commands and you'll see how you can control their behavior, and your mind will settle down a lot. Ghost-Ost Limitations

There are advantages and disadvantages to everything. It is impossible for people to have all the good things. Here are a few restrictions on ghost-ost:

1. Tables without primary key or unique index are not supported.

2. Tables with foreign key constraints are not supported (primary tables and child tables are not supported)

3, do not support triggers on the table

5. When there are a lot of writes on the table, the ghost-ost may never be completed

Tested: When writing QPS 5000 or more, ghost ost cannot complete the task, the reason is that apply binlog is a single thread, which can be understood as slave. When the original table is written in a large amount (QPS=5000 or more), the log has been applied, and the ghost ost design is that binlog application priority is higher than row copy, so we see that the row copy progress has not changed. If the original table is always under such pressure, then ghost ost DDL will not be completed. After testing on the s3710 machine, if the QPS written in the original table is greater than 5000, this situation will occur with a high probability, and if it is less than 5000, there is no problem.

Did reading the above help you? If you still want to have further understanding of related knowledge or read more related articles, please pay attention to the industry information channel, thank you for your support.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report