
MHA setup and points to note

2025-01-28 Update From: SLTechnology News&Howtos


Brief introduction:

MHA (Master High Availability) is a relatively mature high-availability solution for MySQL. It was developed by Yoshinori Matsunobu while he worked at the Japanese company DeNA (he later moved to Facebook). It is a set of tools for failover and master promotion in a MySQL replication environment. When a master fails, MHA can complete the database failover automatically, typically within 30 seconds, and during the failover it preserves data consistency as far as possible, which is what makes it high availability in the real sense.

The software consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a separate machine to manage several master-slave clusters, or on one of the slave nodes. MHA Node runs on every MySQL server. MHA Manager periodically probes the master of each cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The whole failover process is transparent to the application.

During an automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always feasible. For example, if the master has a hardware failure or cannot be reached over ssh, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it. As long as at least one slave has received the latest binary log events, MHA can apply them to all the other slaves and so keep the data consistent across all nodes.

At present, MHA mainly supports a one-master, multi-slave architecture. To build MHA, a replication cluster must have at least three database servers: one master, one standby (candidate) master, and one slave. Because at least three servers are needed, Taobao modified MHA to reduce machine cost, and Taobao's TMHA already supports one master and one slave. For a quicker setup, refer to: MHA Quick build

In fact, stock MHA can also be run with one master and one slave, with one limitation: if the master host itself goes down, MHA cannot switch over and cannot recover the missing binlog. If only the mysqld process on the master crashes while the host stays reachable, the switch still succeeds and the binlog is recovered.

Official introduction: https://code.google.com/p/mysql-master-ha/

Figure 01 shows how to manage multiple sets of master-slave replications through MHA Manager. You can summarize how MHA works as follows:

(figure 01)

(1) Save the binary log events (binlog events) from the crashed master.

(2) Identify the slave with the latest updates.

(3) Apply the differential relay logs to the other slaves.

(4) Apply the binary log events saved from the master.

(5) Promote one slave to be the new master.

(6) Point the other slaves at the new master so that they replicate from it.
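Steps (2) and (3) can be illustrated with a small sketch. This is not MHA's own code (MHA is written in Perl and works on relay logs); it is a simplified Python illustration that assumes each slave's position is known from the `Master_Log_File` and `Read_Master_Log_Pos` fields of `SHOW SLAVE STATUS`. The names `SlaveStatus`, `pick_latest`, and `slaves_needing_diff` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class SlaveStatus(NamedTuple):
    host: str
    master_log_file: str      # Master_Log_File, e.g. "mysql-bin.000003"
    read_master_log_pos: int  # Read_Master_Log_Pos


def log_position(slave: SlaveStatus) -> Tuple[int, int]:
    """Order slaves by binlog sequence number first, then by offset in that file."""
    sequence = int(slave.master_log_file.rsplit(".", 1)[1])
    return (sequence, slave.read_master_log_pos)


def pick_latest(slaves: List[SlaveStatus]) -> SlaveStatus:
    """Step (2): the slave that has received the most events from the master."""
    return max(slaves, key=log_position)


def slaves_needing_diff(slaves: List[SlaveStatus]) -> List[str]:
    """Step (3): hosts that are behind the latest slave and need the difference."""
    latest = log_position(pick_latest(slaves))
    return [s.host for s in slaves if log_position(s) < latest]


if __name__ == "__main__":
    cluster = [
        SlaveStatus("172.17.0.4", "mysql-bin.000003", 700),
        SlaveStatus("172.17.0.3", "mysql-bin.000003", 1200),
        SlaveStatus("172.17.0.2", "mysql-bin.000002", 9000),
    ]
    print(pick_latest(cluster).host)
    print(slaves_needing_diff(cluster))
```

Comparing the file sequence number before the offset matters: a slave that is still reading an older binlog file is behind even if its offset is numerically larger.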

MHA software consists of two parts, Manager toolkit and Node toolkit, which are described in detail as follows.

The Manager toolkit mainly includes the following tools:

masterha_check_ssh        check the MHA SSH configuration
masterha_check_repl       check the MySQL replication status
masterha_manager          start MHA
masterha_check_status     check the current MHA running status
masterha_master_monitor   detect whether the master is down
masterha_master_switch    control failover (automatic or manual)
masterha_conf_host        add or remove configured server information

The Node toolkit (these tools are normally triggered by MHA Manager scripts and need no manual operation) mainly includes the following tools:

save_binary_logs          save and copy the master's binary logs
apply_diff_relay_logs     identify differential relay log events and apply them to the other slaves
filter_mysqlbinlog        remove unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs          purge relay logs (does not block the SQL thread)

Environment:

System: CentOS Linux release 7.3.1611 (Core)

MySQL: 5.7.15-log

MHA: mha4mysql-manager-0.57.tar.gz mha4mysql-node-0.57.tar.gz

Role              IP address   Hostname  server_id  Type
MHA manager       172.17.0.6   server01  -          monitors the replication group
Master            172.17.0.5   server02  1          write
Candidate master  172.17.0.4   server03  2          read
Candidate master  172.17.0.3   server04  3          read
Slave             172.17.0.2   server05  4          read

Node components:

[root@53a15bac5d70 bin]# ll
total 44
-rwxr-xr-x 1 1001 1001 16381 May 31 2015 apply_diff_relay_logs
-rwxr-xr-x 1 1001 1001  4807 May 31 2015 filter_mysqlbinlog
-rwxr-xr-x 1 1001 1001  8261 May 31 2015 purge_relay_logs
-rwxr-xr-x 1 1001 1001  7525 May 31 2015 save_binary_logs
[root@53a15bac5d70 bin]#

Manager node:

|-- bin
|   |-- masterha_check_repl
|   |-- masterha_check_ssh
|   |-- masterha_check_status
|   |-- masterha_conf_host
|   |-- masterha_manager
|   |-- masterha_master_monitor
|   |-- masterha_master_switch
|   |-- masterha_secondary_check
|   `-- masterha_stop

/soft/mha4mysql-manager-0.57/samples/scripts

|-- master_ip_failover
|-- master_ip_online_change
|-- power_manager
`-- send_report

master_ip_failover: called by masterha_master_switch --master_state=dead, that is, when the master host is down or port 3306 no longer responds. The stock script fails over without managing a VIP. If the host is unreachable and ssh is unavailable, use the stock script; otherwise use the VIP script you added.

master_ip_online_change: used when the host is up and port 3306 is still serving; it corresponds to masterha_master_switch --master_state=alive.

masterha_master_switch --master_state=dead \
  --global_conf=/etc/masterha_default.cnf \
  --conf=/usr/local/masterha/conf/app1.cnf --dead_master_host=host1

This calls master_ip_failover.

# For online master switch

masterha_master_switch --master_state=alive \
  --global_conf=/etc/masterha_default.cnf \
  --conf=/usr/local/masterha/conf/app1.cnf

This calls master_ip_online_change.

See the online reference:

(http://code.google.com/p/mysql-master-ha/wiki/masterha_master_switch)
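The two invocations differ only in the master state and in whether a dead master host is named. As a rough sketch, a wrapper could assemble the argument list as follows. The helper `switch_command` is hypothetical and only builds the command; it uses just the options shown above and assumes that a dead-master switch always names the dead host.

```python
import shlex
from typing import List, Optional


def switch_command(conf: str,
                   master_state: str,
                   global_conf: Optional[str] = None,
                   dead_master_host: Optional[str] = None) -> List[str]:
    """Build the argv for masterha_master_switch without running it."""
    if master_state not in ("dead", "alive"):
        raise ValueError("master_state must be 'dead' or 'alive'")
    argv = ["masterha_master_switch", f"--master_state={master_state}"]
    if global_conf:
        argv.append(f"--global_conf={global_conf}")
    argv.append(f"--conf={conf}")
    if master_state == "dead":
        # Assumption: a failover from a dead master names the failed host.
        if not dead_master_host:
            raise ValueError("dead_master_host is required when master_state is 'dead'")
        argv.append(f"--dead_master_host={dead_master_host}")
    return argv


if __name__ == "__main__":
    failover = switch_command("/usr/local/masterha/conf/app1.cnf", "dead",
                              "/etc/masterha_default.cnf", "host1")
    online = switch_command("/usr/local/masterha/conf/app1.cnf", "alive",
                            "/etc/masterha_default.cnf")
    print(shlex.join(failover))
    print(shlex.join(online))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting problems.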

The samples directory contains simple configuration files as well as scripts for switching, sending e-mail, and so on.

vim /etc/masterha/app01.conf

[server default]
# working directory: saved binlogs, failover flag files, and so on
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
user=root
password=123456
ssh_user=root
## path(s) of the master's binlog
master_binlog_dir=/data/binlog/,/var/lib/mysql,/var/log/mysql
# directory on each slave where the differential binlog is copied (scp) and saved
remote_workdir=/tmp
ping_interval=3
# MHA pings the master every ping_interval seconds; when the master stops answering, failover starts
# ping_type='select'
# shutdown_script=/script/masterha/power_manager
repl_user=repl
repl_password=repl
# when a failover happens, send e-mail to the administrator
report_script=/etc/masterha/script/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 172.17.0.2 -s 172.17.0.3 -s 172.17.0.4 -s 172.17.0.5

# Failover: the master host is down. Two cases (ssh reachable or ssh unreachable);
# choose one script and adjust it to your environment.
#
# Case 1: the master is still reachable over ssh, so the script can take the VIP down
# master_ip_failover_script=/etc/masterha/script/master_ip_failover_vip
#
# Case 2: the master is not reachable over ssh, so the VIP cannot be taken down by script
master_ip_failover_script=/etc/masterha/script/master_ip_failover

# Online switch: the master is online and MySQL is alive. Again two cases
# (ssh reachable or ssh unreachable); choose one script and adjust it to your environment.
#
# Case 1: the master is reachable, so the script can take the VIP down
# master_ip_online_change_script=/etc/masterha/script/master_ip_online_change_vip
#
# Case 2: the master is not reachable, so the VIP cannot be taken down
master_ip_online_change_script=/etc/masterha/script/master_ip_online_change

# [server1]
# hostname=172.17.0.5
# candidate_master=1
# check_repl_delay=0

[server2]
hostname=172.17.0.4
candidate_master=1
check_repl_delay=0

[server3]
hostname=172.17.0.3
# candidate_master=1

[server4]
hostname=172.17.0.2
no_master=1
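To see at a glance which hosts such a file makes eligible for promotion, the per-server sections can be read with Python's standard `configparser`. This is only a sketch for inspecting the file, not part of MHA: the function name `classify_servers` and the three role labels are made up here, and the embedded sample is a shortened copy of the configuration above.

```python
import configparser
from typing import Dict

SAMPLE = """
[server default]
manager_workdir=/var/log/masterha/app1
ping_interval=3

[server2]
hostname=172.17.0.4
candidate_master=1
check_repl_delay=0

[server3]
hostname=172.17.0.3
# candidate_master=1

[server4]
hostname=172.17.0.2
no_master=1
"""


def classify_servers(text: str) -> Dict[str, str]:
    """Map each hostname to 'candidate', 'never', or 'eligible'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    roles = {}
    for name in parser.sections():
        if name == "server default":
            continue  # global settings, not a MySQL server
        host = parser.get(name, "hostname")
        if parser.get(name, "no_master", fallback="0") == "1":
            roles[host] = "never"       # must not be promoted
        elif parser.get(name, "candidate_master", fallback="0") == "1":
            roles[host] = "candidate"   # preferred new master
        else:
            roles[host] = "eligible"    # may be promoted, but not preferred
    return roles


if __name__ == "__main__":
    for host, role in classify_servers(SAMPLE).items():
        print(host, role)
```

Lines starting with `#` are treated as comments by `configparser`, so a commented-out `candidate_master=1` counts as unset, matching how [server3] is configured above.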

cat master_ip_failover

(In the original post, the added lines are marked in red.)

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      print "Creating app user on the new master..\n";
      # FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      # FIXME_xxx;

      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
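The script above starts each phase with a pessimistic exit code (1 for stop, 10 for start) and only sets it to 0 once the whole eval block has succeeded. The following Python sketch mirrors that pattern so the control flow is easier to follow. It is an illustration only: `run_phase` and `manager_continues` are hypothetical names, and the rule that codes 0 and 10 let MHA continue is taken from the comments in the sample scripts.

```python
from typing import Callable

# Per the sample scripts' comments: exit code 0 or 10 does not abort MHA.
CONTINUE_CODES = {0, 10}


def run_phase(action: Callable[[], None], default_code: int) -> int:
    """Mimic the Perl eval/exit pattern: keep default_code unless action succeeds."""
    exit_code = default_code
    try:
        action()
        exit_code = 0
    except Exception as err:  # corresponds to the `if ($@)` branch
        print(f"Got Error: {err}")
    return exit_code


def manager_continues(exit_code: int) -> bool:
    """True when the manager would carry on after this exit code."""
    return exit_code in CONTINUE_CODES


def failing_step() -> None:
    raise RuntimeError("cannot reach new master")


if __name__ == "__main__":
    stop_code = run_phase(failing_step, default_code=1)
    start_code = run_phase(failing_step, default_code=10)
    print(stop_code, manager_continues(stop_code))
    print(start_code, manager_continues(start_code))
```

The different defaults explain the behaviour: a failure while stopping the old master aborts the switch, while a failure while activating the new master still lets slave recovery proceed.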

cat master_ip_failover_vip

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

# VIP settings: the virtual IP is bound to the alias interface eth0:$key
my $vip           = '172.17.0.100/24';
my $key           = '1';
my $ssh_start_vip = "/usr/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip  = "/usr/sbin/ifconfig eth0:$key down";

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  print "\n\nIN SCRIPT TEST====$ssh_start_vip==$ssh_stop_vip===\n\n";

  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      print "Disabling the VIP on old master: $orig_master_host\n";
      &stop_vip();

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      print "Enabling the VIP - $vip on the new master - $new_master_host\n";
      &start_vip();

      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      # print "Creating app user on the new master..\n";
      # FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      # FIXME_xxx;

      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    print "Checking the status of the script.. ok\n";

    # the status check also brings the VIP up on the current master
    `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

# Bind the VIP on the new master over ssh
sub start_vip() {
  `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# Release the VIP on the old master over ssh
sub stop_vip() {
  `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
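The VIP handling in the script above comes down to two remote commands: release the alias interface on the old master, then bind it on the new one. The sketch below only builds those commands so they can be inspected or logged before anything is run. The function `vip_commands` and its parameters are hypothetical; the `ifconfig` path and the `eth0:<key>` alias follow the script.

```python
from typing import Dict, List


def vip_commands(vip: str, key: str, ssh_user: str,
                 orig_master_host: str, new_master_host: str,
                 interface: str = "eth0") -> Dict[str, List[str]]:
    """Return the ssh argv for taking the VIP down and bringing it up."""
    alias = f"{interface}:{key}"
    stop_vip = f"/usr/sbin/ifconfig {alias} down"
    start_vip = f"/usr/sbin/ifconfig {alias} {vip}"
    return {
        # run on the old master first so the address is never bound twice
        "stop": ["ssh", f"{ssh_user}@{orig_master_host}", stop_vip],
        "start": ["ssh", f"{ssh_user}@{new_master_host}", start_vip],
    }


if __name__ == "__main__":
    commands = vip_commands("172.17.0.100/24", "1", "root",
                            "172.17.0.5", "172.17.0.4")
    for phase in ("stop", "start"):
        print(phase, commands[phase])
```

If ssh to the old master is impossible, the stop command cannot run, which is why the article keeps a separate non-VIP script for that case.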

cat master_ip_online_change (the original sample with one small change: the FIXME calls are commented out)

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user,
);

GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

# Current local time with a microsecond suffix, used as a log prefix
sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

# Sleep for whatever remains of the current 0.1-second interval
sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

# Return the application threads from SHOW PROCESSLIST, skipping our own
# connection, replication threads, and long-idle connections
sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();
      print current_time_us() . " Dropping app user on the orig master..\n";
      # FIXME_xxx_drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody (except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

    # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
    # If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      print current_time_us() . " Creating app user on the new master..\n";
      # FIXME_xxx_create_app_user($new_master_handler);
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
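The heart of the online-change script is the filter in get_threads_util, which decides which connections count as application threads that must finish or be killed. A Python rendering of the same rules may make them easier to check. It is a sketch only: `filter_threads` is a hypothetical name, and the rows are plain dictionaries with the column names that `SHOW PROCESSLIST` returns rather than a live database cursor.

```python
from typing import Dict, List, Optional


def filter_threads(rows: List[Dict[str, Optional[object]]],
                   my_connection_id: int,
                   running_time_threshold: int = 0,
                   level: int = 0) -> List[Dict[str, Optional[object]]]:
    """Keep only the threads the switch has to wait for or kill."""
    kept = []
    for row in rows:
        command = row.get("Command")
        user = row.get("User")
        query_time = row.get("Time")
        info = row.get("Info")
        info = info.strip() if isinstance(info, str) else None

        if row["Id"] == my_connection_id:
            continue  # our own monitoring connection
        if query_time is not None and query_time < running_time_threshold:
            continue  # too short-lived to care about
        if command == "Binlog Dump":
            continue  # a slave pulling binlogs
        if user == "system user":
            continue  # replication threads
        if command == "Sleep" and query_time is not None and query_time >= 1:
            continue  # idle for a second or more
        if level >= 1 and command in ("Sleep", "Connect"):
            continue
        if level >= 2 and info is not None and info.lower().startswith(("select", "show")):
            continue  # read-only statements
        kept.append(row)
    return kept


if __name__ == "__main__":
    processlist = [
        {"Id": 1, "User": "root", "Command": "Query", "Time": 0, "Info": "SHOW PROCESSLIST"},
        {"Id": 2, "User": "repl", "Command": "Binlog Dump", "Time": 900, "Info": None},
        {"Id": 3, "User": "system user", "Command": "Connect", "Time": 900, "Info": None},
        {"Id": 4, "User": "app", "Command": "Sleep", "Time": 30, "Info": None},
        {"Id": 5, "User": "app", "Command": "Query", "Time": 0, "Info": " UPDATE t SET a = 1"},
        {"Id": 6, "User": "app", "Command": "Query", "Time": 2, "Info": "select 1"},
    ]
    print([row["Id"] for row in filter_threads(processlist, my_connection_id=1)])
    print([row["Id"] for row in filter_threads(processlist, my_connection_id=1, level=2)])
```

The script calls the filter with the default level, so even running SELECT statements keep the wait loop going until the timeout expires; the stricter levels exist for callers that only care about writes.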

Cat master_ip_online_change_vip

#! / usr/bin/env perl

# Copyright (C) 2011 DeNA Co.,Ltd.

#

# This program is free software; you can redistribute it and/or modify

# it under the terms of the GNU General Public License as published by

# the Free Software Foundation; either version 2 of the License, or

# (at your option) any later version.

#

# This program is distributed in the hope that it will be useful

# but WITHOUT ANY WARRANTY; without even the implied warranty of

# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

# GNU General Public License for more details.

#

# You should have received a copy of the GNU General Public License

# along with this program; if not, write to the Free Software

# Foundation, Inc.

# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

# # Note: This is a sample script and is not complete. Modify the script based on your environment.

Use strict

Use warnings FATAL = > 'all'

Use Getopt::Long

Use MHA::DBHelper

Use MHA::NodeUtil

Use Time::HiRes qw (sleep gettimeofday tv_interval)

Use Data::Dumper

My $_ tstart

My $_ running_interval = 0.1

My (

$command, $orig_master_is_new_slave, $orig_master_host

$orig_master_ip, $orig_master_port, $orig_master_user

$orig_master_password, $orig_master_ssh_user, $new_master_host

$new_master_ip, $new_master_port, $new_master_user

$new_master_password, $new_master_ssh_user

);

# # #

My $vip = '172.17.0.100'

My $key = "1"

My $ssh_start_vip = "/ usr/sbin/ifconfig eth0:$key $vip"

My $ssh_stop_vip = "/ usr/sbin/ifconfig eth0:$key down"

# # #

GetOptions (

'command=s' = >\ $command

'orig_master_is_new_slave' = >\ $orig_master_is_new_slave

'orig_master_host=s' = >\ $orig_master_host

'orig_master_ip=s' = >\ $orig_master_ip

'orig_master_port=i' = >\ $orig_master_port

'orig_master_user=s' = >\ $orig_master_user

'orig_master_password=s' = >\ $orig_master_password

'orig_master_ssh_user=s' = >\ $orig_master_ssh_user

'new_master_host=s' = >\ $new_master_host

'new_master_ip=s' = >\ $new_master_ip

'new_master_port=i' = >\ $new_master_port

'new_master_user=s' = >\ $new_master_user

'new_master_password=s' = >\ $new_master_password

'new_master_ssh_user=s' = >\ $new_master_ssh_user

);

Exit & main ()

Sub current_time_us {

My ($sec, $microsec) = gettimeofday ()

My $curdate = localtime ($sec)

Return $curdate. "". Sprintf ("d", $microsec)

}

Sub sleep_until {

My $elapsed = tv_interval ($_ tstart)

If ($_ running_interval > $elapsed) {

Sleep ($_ running_interval-$elapsed)

}

}

Sub get_threads_util {

My $dbh = shift

My $my_connection_id = shift

My $running_time_threshold = shift

My $type = shift

$running_time_threshold = 0 unless ($running_time_threshold)

$type = 0 unless ($type)

My @ threads

My $sth = $dbh- > prepare ("SHOW PROCESSLIST")

$sth- > execute ()

While (my $ref = $sth- > fetchrow_hashref ()) {

My $id = $ref- > {Id}

My $user = $ref- > {User}

My $host = $ref- > {Host}

My $command = $ref- > {Command}

My $state = $ref- > {State}

My $query_time = $ref- > {Time}

My $info = $ref- > {Info}

$info = ~ s / ^\ s * (. *?)\ if defined ($info)

Next if ($my_connection_id = = $id)

Next if (defined ($query_time) & & $query_time

< $running_time_threshold ); next if ( defined($command) && $command eq "Binlog Dump" ); next if ( defined($user) && $user eq "system user" ); next if ( defined($command) && $command eq "Sleep" && defined($query_time) && $query_time >

= 1)

If ($type > = 1) {

Next if (defined ($command) & & $command eq "Sleep")

Next if (defined ($command) & & $command eq "Connect")

}

If ($type > = 2) {

Next if (defined ($info) & & $info = ~ m / ^ select / I)

Next if (defined ($info) & & $info = ~ m / ^ show / I)

}

Push @ threads, $ref

}

Return @ threads

}

Sub main {

If ($command eq "stop") {

# # Gracefully killing connections on the current master

# 1. Set read_only= 1 on the new master

# 2. DROP USER so that no app user can establish new connections

# 3. Set read_only= 1 on the current master

# 4. Kill current queries

# * Any database access failure will result in script die.

My $exit_code = 1

Eval {

# # Setting read_only=1 on the new master (to avoid accident)

My $new_master_handler = new MHA::DBHelper ()

# args: hostname, port, user, password, raise_error (die_on_error) _ or_not

$new_master_handler- > connect ($new_master_ip, $new_master_port

$new_master_user, $new_master_password, 1)

Print current_time_us (). "Set read_only on the new master.."

$new_master_handler- > enable_read_only ()

If ($new_master_handler- > is_read_only ()) {

Print "ok.\ n"

}

Else {

Die "Failed!\ n"

}

$new_master_handler- > disconnect ()

# Connecting to the orig master, die if any database error happens

My $orig_master_handler = new MHA::DBHelper ()

$orig_master_handler- > connect ($orig_master_ip, $orig_master_port

$orig_master_user, $orig_master_password, 1)

# # Drop application user so that nobody can connect. Disabling per-session binlog beforehand

$orig_master_handler- > disable_log_bin_local ()

Print current_time_us (). "Drpping app user on the orig master..\ n"

# # #

# FIXME_xxx_drop_app_user ($orig_master_handler)

# # #

# # Waiting for N * 100milliseconds so that current connections can exit

My $time_until_read_only = 15

$_ tstart = [gettimeofday]

My @ threads = get_threads_util ($orig_master_handler- > {dbh})

$orig_master_handler- > {connection_id})

While ($time_until_read_only > 0 & & $# threads > = 0) {

If ($time_until_read_only% 5 = = 0) {

Printf

"% s Waiting all running d threads are disconnected.. (max% d milliseconds)\ n"

Current_time_us (), $# threads + 1, $time_until_read_only * 100

If ($# threads

< 5 ) { print Data::Dumper->

New ([$_])-> Indent (0)-> Terse (1)-> Dump. "\ n"

Foreach (@ threads)

}

}

Sleep_until ()

$_ tstart = [gettimeofday]

$time_until_read_only--

@ threads = get_threads_util ($orig_master_handler- > {dbh})

$orig_master_handler- > {connection_id})

}

# # Setting read_only=1 on the current master so that nobody (except SUPER) can write

Print current_time_us (). "Set read_only=1 on the orig master.."

$orig_master_handler- > enable_read_only ()

If ($orig_master_handler- > is_read_only ()) {

Print "ok.\ n"

}

Else {

Die "Failed!\ n"

}

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
            "%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      #
      print "Disable the VIP on old master: $orig_master_host \n";
      &stop_vip();
      #

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };

    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }

  elsif ( $command eq "start" ) {

    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

    # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
    # If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      print current_time_us() . " Creating app user on the new master..\n";
      ###
      # FIXME_xxx_create_app_user($new_master_handler);
      ###
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      ###
      print "Enable the VIP: $vip on the new master host: $new_master_host \n";
      &start_vip();
      $exit_code = 0;
      ###
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }

  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub stop_vip {
  `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub start_vip {
  `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
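
The two wait loops in the stop branch share one pattern: poll the thread list up to N times at 100 ms intervals and stop early once no application threads remain. Below is a minimal shell sketch of that pattern. The names `wait_until_idle` and the counting callback are hypothetical and not part of MHA; the Perl script uses `get_threads_util` for the same purpose.

```shell
#!/bin/bash
# Sketch of the "wait N * 100 ms, stop early when idle" pattern.
# count_cmd is any command that prints the number of remaining threads
# (hypothetical stand-in for get_threads_util in the Perl script).
# Note: fractional "sleep 0.1" assumes GNU coreutils or a similar sleep.
wait_until_idle() {
  local max_ticks=$1 count_cmd=$2 remaining
  while [ "$max_ticks" -gt 0 ]; do
    remaining=$($count_cmd)
    # No application threads left: stop waiting early.
    [ "$remaining" -eq 0 ] && return 0
    sleep 0.1
    max_ticks=$((max_ticks - 1))
  done
  # Timed out: succeed only if nothing is still connected.
  remaining=$($count_cmd)
  [ "$remaining" -eq 0 ]
}
```

Calling `wait_until_idle 15 some_counter` mirrors `$time_until_read_only = 15`, that is, at most about 1.5 seconds of waiting. Unlike this sketch, the Perl script does not stop on a timeout: it goes on to set read_only=1 and later kills the remaining threads.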

Fault handling:

masterha_master_switch --conf=/etc/masterha/app1.cnf --dead_master_host=172.17.0.3 --master_state=dead

Thu Sep 7 14:19:58 2017 - [warning] SQL Thread is stopped(no error) on 172.17.0.5(172.17.0.5:3306)
Thu Sep 7 14:19:58 2017 - [info] GTID failover mode = 0
Thu Sep 7 14:19:58 2017 - [info] Dead Servers:
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.4(172.17.0.4:3306)
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:19:58 2017 - [info] Checking master reachability via MySQL(double check).
Thu Sep 7 14:19:58 2017 - [info]  ok.
Thu Sep 7 14:19:58 2017 - [info] Alive Servers:
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.5(172.17.0.5:3306)
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.2(172.17.0.2:3306)
Thu Sep 7 14:19:58 2017 - [info] Alive Slaves:
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.5(172.17.0.5:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:19:58 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:19:58 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 7 14:19:58 2017 - [info]   172.17.0.2(172.17.0.2:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:19:58 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:19:58 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 7 14:19:58 2017 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln492] Server 172.17.0.4(172.17.0.4:3306) is dead, but must be alive! Check server settings.
Thu Sep 7 14:19:58 2017 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/share/perl5/MHA/MasterFailover.pm line 268.

This happens because 172.17.0.4 is still listed in the configuration file, but MySQL on that server is down. Comment out its entry in the configuration file:

[server1]
hostname=172.17.0.5
candidate_master=1
check_repl_delay=0

#[server2]
#hostname=172.17.0.4
#candidate_master=1
#check_repl_delay=0

[server3]
hostname=172.17.0.3
#candidate_master=1

[server4]
hostname=172.17.0.2
no_master=1
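
Before rerunning the switch, it can help to confirm which servers MHA will still consider. Here is a rough sketch that relies only on the `hostname=` lines in the configuration above; the helper name `list_hosts` is hypothetical.

```shell
#!/bin/bash
# Print the hostname of every active (non-commented) server entry
# in an MHA application config. list_hosts is a hypothetical helper.
list_hosts() {
  local conf=$1
  # Match lines that begin with "hostname=" (optional leading spaces,
  # any letter case); lines starting with "#" never match.
  grep -i '^[[:space:]]*hostname[[:space:]]*=' "$conf" |
    sed 's/^[^=]*=[[:space:]]*//'
}
```

For example, `list_hosts /etc/masterha/app1.cnf` lists the hosts that are not commented out. Each one could then be probed by hand, for instance with `mysqladmin -h HOST ping` plus suitable credentials, before the failover is started.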

Manual switch:

masterha_master_switch --conf=/etc/masterha/app1.cnf --dead_master_host=172.17.0.3 --master_state=dead

[root@f8dc93c1f02f script]# masterha_master_switch --conf=/etc/masterha/app1.cnf --dead_master_host=172.17.0.3 --master_state=dead
--dead_master_ip= is not set. Using 172.17.0.3.
--dead_master_port= is not set. Using 3306.
Thu Sep 7 14:22:41 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Sep 7 14:22:41 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Sep 7 14:22:41 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Sep 7 14:22:41 2017 - [info] MHA::MasterFailover version 0.57.
Thu Sep 7 14:22:41 2017 - [info] Starting master failover.
Thu Sep 7 14:22:41 2017 - [info]
Thu Sep 7 14:22:41 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 7 14:22:41 2017 - [info]
Thu Sep 7 14:22:42 2017 - [warning] SQL Thread is stopped(no error) on 172.17.0.5(172.17.0.5:3306)
Thu Sep 7 14:22:42 2017 - [info] GTID failover mode = 0
Thu Sep 7 14:22:42 2017 - [info] Dead Servers:
Thu Sep 7 14:22:42 2017 - [info]   172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:22:42 2017 - [info] Checking master reachability via MySQL(double check).
Thu Sep 7 14:22:42 2017 - [info]  ok.
Thu Sep 7 14:22:42 2017 - [info] Alive Servers:
Thu Sep 7 14:22:42 2017 - [info]   172.17.0.5(172.17.0.5:3306)
Thu Sep 7 14:22:42 2017 - [info]   172.17.0.2(172.17.0.2:3306)
Thu Sep 7 14:22:42 2017 - [info] Alive Slaves:
Thu Sep 7 14:22:42 2017 - [info]   172.17.0.5(172.17.0.5:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:22:42 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:22:42 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 7 14:22:42 2017 - [info]   172.17.0.2(172.17.0.2:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:22:42 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:22:42 2017 - [info]     Not candidate for the new Master (no_master is set)
Master 172.17.0.3(172.17.0.3:3306) is dead. Proceed? (yes/NO): yes
Thu Sep 7 14:22:51 2017 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln309] Last failover was done at 2017-09-07 12:08:18. Current time is too early to do failover again. If you want to do failover, manually remove /var/log/masterha/app1/app1.failover.complete and run this script again.
Thu Sep 7 14:22:51 2017 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/bin/masterha_master_switch line 53.
[root@f8dc93c1f02f script]#

The error says that a failover was already performed earlier and left a flag file behind.

Remove /var/log/masterha/app1/app1.failover.complete and run the switch again:

Thu Sep 7 14:27:45 2017 - [info] MHA::MasterFailover version 0.57.
Thu Sep 7 14:27:45 2017 - [info] Starting master failover.
Thu Sep 7 14:27:45 2017 - [info]
Thu Sep 7 14:27:45 2017 - [info] * Phase 1: Configuration Check Phase..
Thu Sep 7 14:27:45 2017 - [info]
Thu Sep 7 14:27:46 2017 - [info] GTID failover mode = 0
Thu Sep 7 14:27:46 2017 - [info] Dead Servers:
Thu Sep 7 14:27:46 2017 - [info]   172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:27:46 2017 - [info] Checking master reachability via MySQL(double check).
Thu Sep 7 14:27:46 2017 - [info]  ok.
Thu Sep 7 14:27:46 2017 - [info] Alive Servers:
Thu Sep 7 14:27:46 2017 - [info]   172.17.0.5(172.17.0.5:3306)
Thu Sep 7 14:27:46 2017 - [info]   172.17.0.2(172.17.0.2:3306)
Thu Sep 7 14:27:46 2017 - [info] Alive Slaves:
Thu Sep 7 14:27:46 2017 - [info]   172.17.0.5(172.17.0.5:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:27:46 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:27:46 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Sep 7 14:27:46 2017 - [info]   172.17.0.2(172.17.0.2:3306)  Version=5.7.15-log (oldest major version between slaves) log-bin:enabled
Thu Sep 7 14:27:47 2017 - [info]     Replicating from 172.17.0.3(172.17.0.3:3306)
Thu Sep 7 14:27:47 2017 - [info]     Not candidate for the new Master (no_master is set)
Thu Sep 7 14:27:47 2017 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln281] Failover error flag file /var/log/masterha/app1/app1.failover.error exists. This means the last failover failed. Check error logs for detail, fix problems, remove /var/log/masterha/app1/app1.failover.error, and restart this script.
Thu Sep 7 14:27:47 2017 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/bin/masterha_master_switch line 53.
[root@f8dc93c1f02f script]#

This error means the last switch failed. Check the error log, fix the problem, then remove /var/log/masterha/app1/app1.failover.error.
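
Both errors above come from flag files that an earlier failover left in the manager working directory. Below is a minimal sketch for clearing them, assuming the directory `/var/log/masterha/app1` and the `app1` prefix shown in the log messages. The helper name `clear_failover_flags` is hypothetical.

```shell
#!/bin/bash
# Remove leftover failover flag files so masterha_master_switch can run again.
# Clear the .error flag only after the cause of the failed failover is fixed.
clear_failover_flags() {
  local workdir=$1 app=$2 flag
  for flag in "$workdir/$app.failover.complete" "$workdir/$app.failover.error"; do
    if [ -e "$flag" ]; then
      echo "removing $flag"
      rm -f -- "$flag"
    fi
  done
}
```

A typical call is `clear_failover_flags /var/log/masterha/app1 app1`. Other files in the directory, such as the manager log, are left untouched.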

Clean up relay logs regularly

The purge_relay_logs script accepts the following options:

--user: MySQL user name

--password: MySQL password

--port: port number

--workdir: the location where hard links to the relay logs are created. The default is /var/tmp. Creating a hard link across different partitions fails, so specify a directory on the same partition as the relay logs. After the script finishes successfully, the hard-linked relay log files are deleted.

--disable_relay_log_purge: by default, if relay_log_purge=1, the script cleans up nothing and exits. With this option, the script sets relay_log_purge to 0 when it finds relay_log_purge=1, cleans up the relay logs, and finally leaves relay_log_purge set to OFF.

cat purge_relay_log.sh

#!/bin/bash
user=root
passwd=123456
port=3306
log_dir='/data/masterha/log'
work_dir='/data'
purge='/usr/local/bin/purge_relay_logs'

if [ ! -d "$log_dir" ]
then
    mkdir -p "$log_dir"
fi

$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> "$log_dir/purge_relay_logs.log" 2>&1
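
To run the cleanup regularly, the script is usually scheduled with cron on each slave. The sketch below only builds the crontab line so it can be reviewed first. The helper name `purge_cron_line`, the 04:00 schedule, and the script path are all assumptions.

```shell
#!/bin/bash
# Build a crontab entry that runs the purge script once a day.
# The default hour (4) and the script path are illustrative assumptions.
purge_cron_line() {
  local script=$1 hour=${2:-4}
  printf '0 %s * * * /bin/bash %s\n' "$hour" "$script"
}

# Review the line first; to install it, append it to the current crontab:
#   ( crontab -l 2>/dev/null; purge_cron_line /root/purge_relay_log.sh ) | crontab -
```

Passing a different hour per slave, for example `purge_cron_line /root/purge_relay_log.sh 5`, staggers the jobs so that not every slave purges its relay logs at the same moment.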

Finally, start MHA Manager monitoring and check which server is now the master of the cluster. After a switch, monitoring stops. At first I wondered whether something was misconfigured, but it became clear when I later read this passage on the official website:

Running MHA Manager from daemontools

Currently MHA Manager process does not run as a daemon. If failover completed successfully or the master process was killed by accident, the manager stops working. To run as a daemon, daemontools or any external daemon program can be used. Here is an example to run from daemontools.
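
A supervised setup along those lines can be sketched as follows. This is a minimal sketch under stated assumptions, not the official example: the helper `write_run_script`, the service directory, and the log path are hypothetical. Only `masterha_manager --conf=...` comes from MHA itself.

```shell
#!/bin/bash
# Write a daemontools "run" script that keeps masterha_manager supervised.
# service_dir, conf and log are caller-supplied; paths must not contain spaces.
write_run_script() {
  local service_dir=$1 conf=$2 log=$3
  mkdir -p "$service_dir"
  # daemontools restarts whatever "run" executes when it exits.
  cat > "$service_dir/run" <<EOF
#!/bin/sh
exec masterha_manager --conf=$conf >> $log 2>&1
EOF
  chmod +x "$service_dir/run"
}
```

For example, `write_run_script /service/masterha_app1 /etc/masterha/app1.cnf /var/log/masterha/app1/manager.log` creates the run file. Once supervision is active, the manager is restarted after it exits, which covers the case where monitoring stops after a completed failover.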
