Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

MySQL online DDL & quot;gh-ost"

2025-10-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)06/01 Report--

Most of the changes to the DDL of a large table are pt-online-schema-change using Percona. This article explains the use of another tool, gh-ost: it does not rely on triggers because it obtains incremental changes from the library in row binlog through simulation, and then applies them asynchronously to the ghost table.

1. Download and install address:

Https://github.com/github/gh-ost/tags

2. Parameter description: gh-ost-help

Usage of gh-ost:

-- aliyun-rds: whether to execute on Ali Cloud Database. True

-- allow-master-master: whether to allow gh-ost to run in a dual-master replication architecture, which is generally used with the-assume-master-host parameter

-- allow-nullable-unique-key: the only key that allows gh-ost to depend on for data migration can be NULL, and the default is the only key that is not allowed to be NULL. If the unique key on which data migration (migrate) depends allows null values, the data may be incorrect and should be used with caution.

-- allow-on-master: allows gh-ost to run directly on the main library. The slave library of the default gh-ost connection.

-- alter string:DDL statement

-- approve-renamed-columns ALTER: if you change the name of a column, gh-ost will identify and need to provide a reason for renaming the column. By default, gh-ost does not continue unless-approve-renamed-columns ALTER is provided.

-- ask-pass:MySQL password

-- assume-master-host string: specify a main library for gh-ost in the format "ip:port" or "hostname:port". It is useful in this master architecture, or when gh-ost does not find the master.

-- assume-rbr: when confirming the binlog_format=ROW of the database instance to which gh-ost is connected, you can specify-assume-rbr, which prevents stop slave,start slave from running from the library, and users who execute gh-ost do not need SUPER permission.

-- check-flag

-- chunk-size int: the number of rows processed in each iteration (allowable range: 100-100000). The default value is 1000.

-- concurrent-rowcount: if this parameter is True (default), after row-copy, estimate the number of statistical rows (using explain select count (*) method), and adjust the ETA time, otherwise, gh-ost first estimates the number of statistical rows, and then starts row-copy.

-- the configuration file path of conf string:gh-ost.

-- critical-load string: a series of comma-separated status-name=values. When the status in the MySQL exceeds the corresponding values,gh-ost, it will exit. -critical-load Threads_connected=20,Connections=1500, which means that gh-ost will stop and exit due to the heavy load of the database when the state value Threads_connected > 20 received connections > 1500 in the MySQL.

Comma delimited status-name=threshold, same format as-- max-load. When status exceeds threshold, app panics and quits

-- critical-load-hibernate-seconds int: when the load reaches critical-load, gh-ost goes to sleep within a specified period of time. It does not read / write anything from any server.

-- critical-load-interval-millis int: when the value is 0, exit immediately when-critical-load,gh-ost is reached. When the value is not 0, when-critical-load,gh-ost is reached, it will be checked again after-critical-load-interval-millis seconds, and the check will still reach-critical-load,gh-ost will exit.

-- cut-over string: select the cut-over type: cut-over of type atomic/two-step,atomic (default) is github's algorithm, and two-step uses facebook-OSC 's algorithm.

-- cut-over-exponential-backoff

-- the maximum lock wait time for cut-over-lock-timeout-seconds int:gh-ost in the cut-over phase. When the lock times out, the cut-over of gh-ost will retry. (default: 3)

-- database string: database name.

-- debug:debug mode.

-- default-retries int: the number of times various operations are retried before panick. (default is 60)

-- discard-foreign-keys: this parameter is for a table with foreign keys. When gh-ost creates a ghost table, no foreign keys are created for the ghost table. This parameter is suitable for deleting foreign keys, but otherwise, use it with caution.

-- dml-batch-size int: batch size of DML events applied in a single transaction (range 1-100) (default is 10)

-- exact-rowcount: accurately count the number of rows in the table (using select count (*)) to get a more accurate estimate of the time.

-- execute: actually execute the alter&migrate table. Default is noop. Do not execute, just test and exit. If you want the ALTER TABLE statement to be implemented in the database, you need to specify-execute.

-- exponential-backoff-max-interval int

-- force-named-cut-over: if true, the 'unpostpone | cut-over' interactive command must name the migrated table

-- force-table-names string: table name prefix used on temporary tables

-- heartbeat-interval-millis int:gh-ost heartbeat rate. Default is 500.

-- help

-- hooks-hint string: any message is injected into the hook through GH_OST_HOOKS_HINT

-- hooks-path string:hook file storage directory (default is empty, that is, hook is disabled). Hook will look in this directory for hook files named in accordance with the convention to execute.

-- host string: MySQL IP/hostname

Check and delete the existing ghost table before the initially-drop-ghost-table:gh-ost operation. This parameter is not recommended. Please deal with the existing ghost table manually. This parameter is not enabled by default, and gh-ost exits the operation directly.

Check and delete old tables that already exist before the initially-drop-old-table:gh-ost operation. This parameter is not recommended. Please deal with the existing ghost table manually. This parameter is not enabled by default, and gh-ost exits the operation directly.

-- initially-drop-socket-file:gh-ost forces the deletion of existing socket files. This parameter is not recommended and may delete a running gh-ost program, causing DDL to fail.

-- master-password string: MySQL master password

-- master-user string:MysQL main account

-- max-lag-millis int: the maximum delay time of master-slave replication. When the master-slave replication delay time exceeds this value, gh-ost will take throttle measures. The default value is 1500s.

-- max-load string: comma separated status name = threshold, such as: 'Threads_running=100,Threads_connected=500'. When status exceeds threshold, app throttles writes

-- migrate-on-replica:gh-ost 's data migration (migrate) runs on the slave library, not the master library.

-- nice-ratio float: dormancy time for each chunk period, range [0.0 … 100.0] . 0: no hibernation for each chunk period, that is, one chunk followed by chunk execution; 1: for every 1 millisecond of row-copy, another 1 millisecond for dormancy; 0. 7: for every 10 milliseconds of row-copy, another 7 milliseconds for dormancy.

-- after the ok-to-drop-table:gh-ost operation ends, delete the old table. The default state is not to delete the old table, and there will be a _ tablename_ deltable.

-- panic-flag-file string: when this file is created, gh-ost will exit immediately.

-- password string: MySQL password

-- port int: MySQL port, preferably using slave library

-- postpone-cut-over-flag-file string: when this file exists, the cut-over phase of gh-ost will be delayed and the data will still be copied until the file is deleted.

-- quiet: silent mode.

-- replica-server-id uint: server_id of gh-ost

-- replication-lag-query string: deprecated

-- the absolute path to the socket file of serve-socket-file string:gh-ost.

-- serve-tcp-port int:gh-ost uses the port, which is closed by default.

-- skip-foreign-key-checks: when you are sure that there are no foreign keys on your table, set it to 'true', and want to skip the time of gh-ost validation-skip-renamed-columns ALTER

-- skip-renamed-columns ALTER: if you change the name of a column (such as change column), gh-ost will identify and need to provide a reason for renaming the column. Gh-ost does not continue by default. This parameter tells gh-ost to skip the data migration for the column, telling gh-ost to treat the renamed column as an insignificant column. This operation is dangerous and you will lose all the values of the column.

-- stack: add an error stack trace.

-- switch-to-rbr: have gh-ost automatically convert the binlog_format from the library to ROW format.

-- table string: table name

-- test-on-replica: test gh-ost on the slave library, including data migration from the library (migration). After the data migration is completed, stop slave, the original table and the ghost table are exchanged immediately and then exchanged back immediately. Keep stop slave so that you can compare the two tables.

-- test-on-replica-skip-replica-stop: when-test-on-replica is executed, this parameter indicates that stop slave is not used in the process.

-- throttle-additional-flag-file string: when the file is created, the gh-ost operation stops immediately. This parameter can be used to create a file when multiple gh-ost operations are operated at the same time, to stop all gh-ost operations, or to delete the file to restore all gh-ost operations.

-- throttle-control-replicas string: lists all slave libraries that need to be checked for master-slave replication delays.

-- throttle-flag-file string: when the file is created, the gh-ost operation stops immediately. This parameter is suitable for controlling a single gh-ost operation. -throttle-additional-flag-file string is suitable for controlling multiple gh-ost operations.

-- throttle-http string

-- throttle-query string: throttling query. It is executed every second. Throttling is not required when the return value is 0, and throttling operation is required when the return value is greater than 0. The query operates on the data Migration (migrated) server, so make sure that the query is lightweight.

-- timestamp-old-table: use a timestamp in the old table name. This results in a unique and conflict-free cross-migration of the old table name

-- tungsten: tell gh-ost you are running a tungsten-replication topology.

-- user string: MYSQL user

-- verbose

-- version

3. Instructions: the condition is that the binlog mode required on the MySQL of the operation is ROW. If you are testing on a top-to-bottom test, you must also be in ROW mode and turn on log_slave_updates. Adjust according to the requirements according to the above parameters.

4. Environment: master: 192.168.1.7 Slave: 192.168.1.8

DDL process:

① checks for foreign keys and triggers.

② checks the primary key information of the table.

③ checks whether the master library or slave library, whether log_slave_updates is enabled, and binlog information

④ checks the existence of temporary tables ending with gho and del

⑤ creates tables at the end of ghc, stores data migration information, and binlog information, etc.

-the above verification phase

⑥ initializes the connection to stream and adds binlog snooping

-the following migration phases

⑥ creates a temporary table at the end of gho, and executes DDL on the temporary table at the end of gho

⑦ starts the transaction, writes the source table data to the table at the end of the gho according to the primary key id, then commits, and binlog apply.

-the following cut-over phases

⑧ lock source table, rename table: rename source table to source _ del table, gho table to source table.

⑨ cleans up the ghc table.

one。 DDL on a single instance: a single instance is equivalent to the main database and needs to be enabled-- allow-on-master parameter and ROW mode.

Gh-ost-user= "root"-password= "root"-host=192.168.1.7-database= "test"-table= "T1"-alter= "ADD COLUMN cc2 varchar (10), add column cc3 int not null default 0 comment 'test'"-allow-on-master-- execute

two。 Master-slave DDL: there are two options: one is to perform synchronization to slave directly on the master according to 1, and the other is connected to the slave library and migrated in the master library (as long as the binlog of the slave database is ROW, the master database does not need to be guaranteed)

Gh-ost-user= "root"-password= "root"-host=192.168.1.8-database= "test"-table= "t"-initially-drop-old-table-alter= "ADD COLUMN y1 varchar (10), add column y2 int not null default 0 comment 'test'"-execute

The operation at this time is roughly as follows:

1. Row data is read and written on the main database

2. Read the binary log of the slave library and apply the changes to the master library

3. Collect table format, field & index, number of rows and other information from the library

4. Read internal change events (such as heartbeat events) from the library

5. Switch the table in the main database

In the execution of DDL, the slave library executes the stop/start slave once, and you can add the parameter:-- assume-rbr if it is determined that the slave binlog is ROW. If the binlog of the slave library is not ROW, you can convert it to ROW with the parameter-- switch-to-rbr. At this point, it is important to note that after execution, the binlog mode will not be converted to the original value. -- assume-rbr and-- switch-to-rbr parameters cannot be used together.

three。 Do DDL testing on the:

Gh-ost-user= "root"-password= "root"-host=192.168.1.8-database= "test"-table= "t"-alter= "ADD COLUMN abc1 varchar (10), add column abc2 int not null default 0 comment 'test'"-test-on-replica-switch-to-rbr-execute

Parameter-test-on-replica: test the gh-ost on the slave library, including data migration (migration) on the slave library. After the data migration is completed, stop slave, the original table and ghost table are exchanged immediately and then swapped back immediately. Keep stop slave so that you can compare the two tables. If you don't want to stop slave, you can add another parameter:-- test-on-replica-skip-replica-stop

The above three are gh-ost operation modes. In the above operation, the temporary table will not be cleaned up in the end and needs to be cleaned manually. If the temporary table still exists before the next execution, the execution will fail, and you can delete it through parameters:

1. Before the initially-drop-ghost-table:gh-ost operation, check and delete the existing ghost table. This parameter is not recommended. Please deal with the existing ghost table manually. This parameter is not enabled by default, and gh-ost exits the operation directly.

2. Before the initially-drop-old-table:gh-ost operation, check and delete the existing old table. This parameter is not recommended. Please deal with the existing ghost table manually. This parameter is not enabled by default, and gh-ost exits the operation directly.

3.-- initially-drop-socket-file:gh-ost forces the deletion of existing socket files. This parameter is not recommended. A running gh-ost program may be deleted, causing DDL to fail.

4. After the ok-to-drop-table:gh-ost operation ends, delete the old table. The default state is not to delete the old table, and there will be a _ tablename_ deltable.

There are other parameters, such as-- exact-rowcount,-- max-lag-millis,-- max-load and so on. You can see the above instructions. Most of the commonly used parameter commands are as follows:

Gh-osc-user=-password=-host=-database=-table=-max-load=Threads_running=30, chunk-size=1000 serve-socket-file=/tmp/gh-ost.test.sock exact-rowcount allow-on-master/--test-on-replica initially-drop-ghost-table/--initially-drop-old-table/--initially-drop-socket-file max-lag-millis= max-load='Threads_running=100,Threads_connected=500' ok-to-drop-table

four。 Additional instructions: termination, suspension, speed limit

Gh-ost-user= "root"-password= "root"-host=192.168.1.7-database= "test"-table= "T1"-alter= "ADD COLUMN O2 varchar (10) Add column o1 int not null default 0 comment 'test' "- exact-rowcount-- serve-socket-file=/tmp/gh-ost.t1.sock-- panic-flag-file=/tmp/gh-ost.panic.t1.flag-- postpone-cut-over-flag-file=/tmp/ghost.postpone.t1.flag-- allow-on-master-- execute

The ① marking file stops running:-- panic-flag-file

Create a file to stop running. In the example, create a / tmp/gh-ost.panic.t1.flag file to terminate the running gh-ost. Temporary file cleaning needs to be done manually.

② indicates that the file forbids cut-over, that is, table name switching is prohibited, and data replication occurs normally. -- postpone-cut-over-flag-file

The creation of files is delayed by cut-over, that is, the switching operation is postponed. When the / tmp/ghost.postpone.t1.flag file is created in the example, gh-ost will complete the row copy, but will not switch the table, it will continuously synchronize the data update operation of the original table to the temporary table.

③ uses socket to listen for requests, and the operator can change the corresponding parameters after the command runs. -- serve-socket-file,--serve-tcp-port (off by default)

Create a socket file to listen and adjust the parameters through the API. When the load and delay increase during the operation, you have to terminate the operation, reconfigure the parameters, such as chunk-size, and then re-execute the operation command, which can be adjusted dynamically through the scoket API. Such as:

Pause the operation:

# pause

Echo throttle | socat-/ tmp/gh-ost.test.t1.sock

# restore

Echo no-throttle | socat-/ tmp/gh-ost.test.t1.sock

Modify the speed limit parameters:

Echo chunk-size=100 | socat-/ tmp/gh-ost.t1.sock

Echo max-lag-millis=200 | socat-/ tmp/gh-ost.t1.sock

Echo max-load=Thread_running=3 | socat-/ tmp/gh-ost.t1.sock

Conclusion: although gh-ost does not need triggers and has much less impact on the pressure and performance of the main database, the efficiency of DDL for high concurrency scenarios is still lower than that of pt-osc, so it still needs to be handled at a low business peak.

Encapsulation script: gh-ost.py

#! / bin/env python

#-*-encoding: utf-8-*-

#-

# Purpose: gh-ost

# Created: 2018-06-16

#-

Import MySQLdb

Import re

Import sys

Import time

Import subprocess

Import os

From optparse import OptionParser

Def calc_time (func):

Def _ deco (* args, * * kwargs):

Begin_time = time.time ()

Func (* args, * * kwargs)

Cost_time = time.time ()-begin_time

Print 'cost time:% ss'% round (cost_time,2)

Return _ deco

Def get_table_count (conn,dbname,tbname):

Query =''SELECT count (*) FROM% s.% s''% (dbname,tbname)

Cursor = conn.cursor ()

Cursor.execute (query)

Row_nums = cursor.fetchone ()

Cursor.close ()

Conn.close ()

Return row_nums

Def online_ddl (conn,ddl_cmd):

Cursor = conn.cursor ()

Cursor.execute (ddl_cmd)

Conn.commit ()

Cursor.close ()

Conn.close ()

# @ calc_time

Def run_cmd (cmd):

P = subprocess.Popen (cmd, shell=True)

Return p,p.pid

Def drop_ghost_table (conn,ghost_name_list):

Try:

Cursor = conn.cursor ()

Query =''DROP TABLE IF EXISTS% s;''% (ghost_name_list)

Cursor.execute (query)

Conn.commit ()

Cursor.close ()

Conn.close ()

Except Exception,e:

Print e

If _ name__ = = "_ _ main__":

Parser = OptionParser ()

Parser.add_option ("- P", "--Port", help= "Port for search", dest= "port")

Parser.add_option ("- D", "--Dbname", help= "the Dbname to use", dest= "dbname")

Parser.add_option ("- T", "--Table", help= "the Table to use", dest= "tablename")

(options, args) = parser.parse_args ()

If not options.port:

Print 'params port need to apply'

Exit ()

If not options.dbname:

Print 'params dbname need to apply'

Exit ()

If not options.tablename:

Print 'params tablename need to apply'

Exit ()

Gh_ost_socket ='/ tmp/gh-ost.%s.%s.sock'% (options.dbname,options.tablename)

# termination Flag

Panic_flag ='/ tmp/gh-ost.panic.%s.%s.flag'% (options.dbname,options.tablename)

# postpone_flag ='/ tmp/gh-ost.postpone.%s.%s.flag'% (options.dbname,options.tablename)

# pause Flag

Throttle_flag ='/ tmp/gh-ost.throttle.%s.%s'% (options.dbname,options.tablename)

# socket ='/ data/%s/tmp/mysql.sock'% (options.port)

Socket ='/ var/run/mysqld/mysqld.sock'

Get_conn = MySQLdb.connect (host='192.168.163.131', port=int (options.port), user='root', passwd='root', db=options.dbname, unix_socket=socket,charset='utf8')

Conn = MySQLdb.connect (host='192.168.163.131', port=int (options.port), user='root', passwd='root', db=options.dbname, unix_socket=socket,charset='utf8')

(table_count,) = get_table_count (get_conn,options.dbname,options.tablename)

Print ("\ 033 [0trans32m% s\ 033 [quantity of 0m"% "table:% s"% table_count)

DDL_CMD = raw_input ('Enter DDL CMD:'). Replace ('','')

Gh_command_list = re.split ('[] +', DDL_CMD)

If gh_command_list [0] .upper () = 'CHANGE' and gh_command_list [1]! = gh_command_list [2]:

Print ("\ 033 [0trans31m% s\ 033 [0m"% "renamed columns' data will be lost,pt-osc exit...")

Exit ()

If table_count

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report