

How to use DRBD in CentOS7.0

2025-02-21 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article introduces how to use DRBD on CentOS 7.0. It should be a useful reference for interested readers; I hope you learn a lot from it.

1. A brief introduction to DRBD

DRBD stands for Distributed Replicated Block Device. It consists of a kernel module and related scripts and is used to build high-availability clusters. It works by mirroring a whole block device over the network, so you can think of it as a kind of network RAID: it lets you keep a real-time mirror of a local block device on a remote machine.

1.1. How does DRBD work?

The primary node (DRBD Primary) receives data, writes it to its local disk, and sends it to the other host (DRBD Secondary), which writes the data to its own disk. Currently DRBD allows read/write access on only one node at a time, which is sufficient for a typical failover high-availability cluster. Future versions may support read/write access on both nodes.

1.2. The relationship between DRBD and HA

A DRBD system consists of two nodes, similar to an HA cluster, likewise divided into a primary node and a standby node. On the node holding the primary device, applications and the operating system can run and access the DRBD device (/dev/drbd*). Data written by the primary node is stored on the primary node's disk through the DRBD device, and at the same time it is automatically sent to the corresponding DRBD device on the standby node, which finally writes it to the standby node's disk. On the standby node, DRBD simply writes the data arriving at its DRBD device to the standby node's disk. Most high-availability clusters today use shared storage, and DRBD can serve as a shared storage device without much hardware investment: because it runs over a TCP/IP network, it is much cheaper than a dedicated storage network, while its performance and stability are also good.

2. DRBD replication modes

2.1. Protocol A:

Asynchronous replication protocol. A write is considered complete as soon as the local disk write has finished and the packet is in the send queue. If a node fails, data loss may occur, because data written for the remote node may still be in the send queue. The data on the failover node is consistent but not up to date. Protocol A is usually used for geographically separated nodes.

2.2. Protocol B:

Memory-synchronous (semi-synchronous) replication protocol. A write on the primary node is considered complete once the local disk write has finished and the replication packet has reached the peer node. Data loss can occur if both participating nodes fail at the same time, because the data in transit may not yet have been committed to disk.

2.3. Protocol C:

Synchronous replication protocol. A write is considered complete only when both the local and the remote node's disks have confirmed the write. There is no data loss, so this is the popular choice for cluster nodes, but I/O throughput depends on network bandwidth.

Protocol C is generally used, but choosing protocol C increases network traffic and therefore latency. For the sake of data reliability, choose the protocol for a production environment carefully.
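As a concrete sketch, the replication protocol is set per resource (or in the common section) of the DRBD configuration. The resource name r0 and device below are illustrative placeholders, not taken from the setup in this article:

```
resource r0 {
    protocol C;        # synchronous; A or B may be substituted per the trade-offs above
    device /dev/drbd0;
    # ... disk, net and per-host sections omitted ...
}
```

Protocol A trades durability for latency, B is a middle ground, and C is the safe default for nodes on a local network.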

3. Working principle diagram of DRBD

DRBD is a distributed storage system at the block storage layer of the Linux kernel. Two Linux servers using DRBD can share block devices, file systems, and data between them. Functionally it is similar to a network RAID-1.

4. Installation and configuration (run on node 1 unless otherwise noted)

4.1. Preparation

Both nodes, ha-node1 and ha-node2, run CentOS 7.0. Each node has two disks: one for the root partition and one for DRBD.

192.168.8.51 ha-node1
192.168.8.52 ha-node2

Modify the hostname:

Node 1

# hostnamectl set-hostname ha-node1
# su -l

Node 2

# hostnamectl set-hostname ha-node2
# su -l

4.2. The disk partition is as follows

[root@ha-node1 corosync]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk
├─sda1            8:1    0  500M  0 part /boot
└─sda2            8:2    0 19.5G  0 part
  ├─centos-swap 253:0    0    2G  0 lvm  [SWAP]
  └─centos-root 253:1    0 17.5G  0 lvm  /
sdb               8:16   0   20G  0 disk
sr0              11:0    1 1024M  0 rom

The disk layout on ha-node2 is identical.

4.3. Create LVM volumes (run on each node)

# pvcreate /dev/sdb
# vgcreate data /dev/sdb
# lvcreate --size 2G --name mysql data

4.4. Disable SELinux (run on each node)

# setenforce 0
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config

4.5. Configure the hosts file

# echo '192.168.8.51 ha-node1' >> /etc/hosts
# echo '192.168.8.52 ha-node2' >> /etc/hosts

4.6. Configure NTP (10.239.44.128 is the NTP server; run on each node)

# chkconfig chronyd off
# chkconfig ntpd on
# sed -i "/^server\ 3.centos.pool/a server\ 10.239.44.128" /etc/ntp.conf
# service ntpd start
# ntpq -p

4.7. Configure SSH mutual trust (run on each node)

# ssh-keygen -t dsa -f ~/.ssh/id_dsa -N ""
# ssh-copy-id ha-node1
# ssh-copy-id ha-node2

4.8. Install DRBD

# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
# yum install -y kmod-drbd84 drbd84-utils

4.9. Configuration file overview

/etc/drbd.conf # main configuration file

/etc/drbd.d/global_common.conf # global configuration file

A. /etc/drbd.conf

The main configuration file includes the global configuration file and all files ending in .res in the drbd.d/ directory:

# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";

B. /etc/drbd.d/global_common.conf

global {
    usage-count no;  # whether to participate in DRBD usage statistics; default is yes (official install-base statistics)
    # minor-count dialog-refresh disable-ip-verification
}
common {
    protocol C;  # DRBD synchronization protocol
    handlers {
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when choosing your poison.
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        on-io-error detach;  # I/O error handling policy: detach the backing device
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }
    net {
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
    }
    syncer {
        rate 1024M;  # network rate for synchronization between primary and secondary nodes
    }
}

Note: the on-io-error policy may be one of the following options:

detach: the default and recommended option. If a lower-level I/O error occurs, the node drops its backing device and continues in diskless mode.

pass_on: DRBD reports the I/O error to the upper layer. On the primary node it is reported to the mounted file system; on the secondary node it is ignored (the secondary has no upper layer to report to).

call-local-io-error: invokes the command defined by the local-io-error handler. This requires a corresponding local-io-error handler to be defined for the resource; it gives the administrator the freedom to use a command or script to handle the I/O error.
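To make the policy concrete, on-io-error is set in the disk section of the configuration; a minimal sketch (detach is the default recommended above, and the commented alternatives show the other two policies):

```
disk {
    on-io-error detach;                # drop to diskless mode on a lower-level I/O error
    # on-io-error pass_on;             # or report the error to the upper layer
    # on-io-error call-local-io-error; # or run the local-io-error handler
}
```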

C. Define a resource. Create /etc/drbd.d/MySQL.res and write:

resource mysql {          # resource name
    protocol C;           # use protocol C
    meta-disk internal;
    device /dev/drbd1;    # DRBD device name
    syncer {
        verify-alg sha1;  # verification algorithm
    }
    net {
        allow-two-primaries;
    }
    on ha-node1 {
        disk /dev/data/mysql;        # backing disk for drbd1: the "mysql" LV
        address 192.168.8.51:7789;   # DRBD listening address and port
    }
    on ha-node2 {
        disk /dev/data/mysql;
        address 192.168.8.52:7789;
    }
}

4.10. Copy the configuration files to node2

# scp -rp /etc/drbd.d/* ha-node2:/etc/drbd.d/

4.11. Enable DRBD

# drbdadm create-md mysql
# modprobe drbd
# drbdadm up mysql
# drbdadm --force primary mysql

View status:

# cat /proc/drbd

4.12. Configure the peer node

# ssh ha-node2 "drbdadm create-md mysql"
# ssh ha-node2 "modprobe drbd"
# ssh ha-node2 "drbdadm up mysql"

4.13. Format and mount the device

# mkfs.xfs /dev/drbd1
# mount /dev/drbd1 /mnt

5. Related configuration operations

5.1. Connection states in detail

How do I check the resource connection status?

[root@ha-node1 ~]# drbdadm cstate mysql   # mysql is the resource name
WFConnection

A resource may be in one of the following connection states:

StandAlone: no network configuration available. The resource has not yet been connected, was administratively disconnected (drbdadm disconnect), or dropped its connection due to authentication failure or split-brain.

Disconnecting: temporary state while disconnecting; the next state is StandAlone.

Unconnected: temporary state before a connection attempt; the next state is usually WFConnection or WFReportParams.

Timeout: the connection to the peer node timed out; also a temporary state, the next state is Unconnected.

BrokenPipe: the connection to the peer node was lost; also a temporary state, the next state is Unconnected.

NetworkFailure: temporary state after the connection to the peer node was lost; the next state is Unconnected.

ProtocolError: temporary state after a protocol error on the connection to the peer node; the next state is Unconnected.

TearDown: temporary state; the peer node is closing the connection, the next state is Unconnected.

WFConnection: waiting for a network connection with the peer node to be established.

WFReportParams: the TCP connection has been established; this node is waiting for the first network packet from the peer.

Connected: DRBD has established a connection and data mirroring is active; this is the normal state.

StartingSyncS: a full synchronization initiated by the administrator has just begun; the next possible states are SyncSource or PausedSyncS.

StartingSyncT: a full synchronization initiated by the administrator has just begun; the next state is WFSyncUUID.

WFBitMapS: a partial synchronization has just begun; the next possible states are SyncSource or PausedSyncS.

WFBitMapT: a partial synchronization has just begun; the next possible state is WFSyncUUID.

WFSyncUUID: synchronization is about to begin; the next possible states are SyncTarget or PausedSyncT.

SyncSource: synchronization with this node as the source is in progress.

SyncTarget: synchronization with this node as the target is in progress.

PausedSyncS: this node is the source of an ongoing synchronization that has been paused, either because another synchronization is in progress or because it was paused with drbdadm pause-sync.

PausedSyncT: this node is the target of an ongoing synchronization that has been paused, either because another synchronization is in progress or because it was paused with drbdadm pause-sync.

VerifyS: online device verification with this node as the source is in progress.

VerifyT: online device verification with this node as the target is in progress.
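The temporary versus settled states in the list above can be captured in a small helper function. This is an illustrative sketch based solely on the descriptions here, not part of DRBD's own tooling; the function name is made up for the example:

```shell
#!/bin/sh
# Classify a DRBD connection state (as printed by `drbdadm cstate <resource>`)
# as "stable" or "transient", following the state descriptions above.
cstate_kind() {
    case "$1" in
        StandAlone|WFConnection|WFReportParams|Connected|SyncSource|SyncTarget|\
        PausedSyncS|PausedSyncT|VerifyS|VerifyT)
            echo stable ;;
        Disconnecting|Unconnected|Timeout|BrokenPipe|NetworkFailure|ProtocolError|\
        TearDown|StartingSyncS|StartingSyncT|WFBitMapS|WFBitMapT|WFSyncUUID)
            echo transient ;;
        *)
            echo unknown ;;
    esac
}

cstate_kind Connected   # stable
cstate_kind Timeout     # transient
```

On a live node you might feed it `$(drbdadm cstate mysql)` and, for a transient result, simply poll again before alerting.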

5.2. Resource roles

View Resource roles command

[root@ha-node1 ~]# drbdadm role mysql
Secondary/Secondary

[root@ha-node1 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:2103412

Note:

Primary: the resource is currently primary and may be read from or written to. Unless dual-primary mode is enabled, this role appears on only one of the two nodes.

Secondary: the resource is currently secondary; it normally receives updates from the peer node.

Unknown: the resource's role is currently unknown; the local resource never reports this state (it appears only for the peer).

5.3. Hard disk status

View hard disk status command

[root@ha-node1 ~]# drbdadm dstate mysql
Inconsistent/Inconsistent

The hard drives of local and peer nodes may be in one of the following states:

Diskless: no local block device is assigned to the DRBD driver, meaning no usable device is available; or the device was manually detached with drbdadm detach, or automatically detached after a lower-level I/O error.

Attaching: transient state while reading metadata.

Failed: transient state after the local block device reported an I/O error; the next state is Diskless.

Negotiating: transient state when an Attach is performed on an already-connected DRBD device.

Inconsistent: the data is inconsistent. This state appears on both nodes immediately after a new resource is created (before the initial full synchronization), and on one node (the synchronization target) during synchronization.

Outdated: the data on the resource is consistent but out of date.

DUnknown: shown for the peer disk when no network connection to the peer is available.

Consistent: the data of a node without a connection is consistent; when the connection is established, it is decided whether the data is UpToDate or Outdated.

UpToDate: the data is consistent and up to date; this is the normal state.
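The connection state, roles, and disk states above can also be read programmatically from /proc/drbd. The sketch below extracts the cs/ro/ds fields from a sample status line hard-coded for illustration; on a live node you would read the real line as noted in the comment:

```shell
#!/bin/sh
# Extract connection state (cs:), roles (ro:) and disk states (ds:) from a
# /proc/drbd status line. On a live node: line=$(grep '^ 0:' /proc/drbd)
line=" 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"

cs=$(echo "$line" | sed -n 's/.*cs:\([^ ]*\).*/\1/p')
ro=$(echo "$line" | sed -n 's/.*ro:\([^ ]*\).*/\1/p')
ds=$(echo "$line" | sed -n 's/.*ds:\([^ ]*\).*/\1/p')

echo "connection: $cs"   # connection: Connected
echo "roles:      $ro"   # roles:      Primary/Secondary
echo "disks:      $ds"   # disks:      UpToDate/UpToDate
```

A monitoring script could alert whenever $ds is anything other than UpToDate/UpToDate.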

5.4. Enable and disable resources

Manually enable a resource: drbdadm up <resource>

Manually disable a resource: drbdadm down <resource>

Note:

<resource> is the resource name; you can also use all to enable or disable all resources at once.

5.5. Upgrade and downgrade resources

Upgrade a resource: drbdadm primary <resource>

Downgrade a resource: drbdadm secondary <resource>

Note: in DRBD's single-primary mode, both nodes are connected at the same time and either node can become primary at a given moment, but only one of the two can be primary at any time. If one node is already primary, it must be downgraded before the other can be upgraded. In dual-primary mode there is no such restriction.

5.6. Initialize device synchronization

Select an initial synchronization source. If the devices are newly initialized or empty, the choice is arbitrary; but if one of the nodes is already in use and contains useful data, selecting the synchronization source correctly is critical. Choosing the wrong initial synchronization direction results in data loss, so be very careful.

Initialize full synchronization, which can only be performed on one node that initializes the resource configuration and is selected as the synchronization source. The command is as follows:

[root@ha-node1 ~]# drbdadm -- --overwrite-data-of-peer primary mysql
[root@ha-node1 ~]# cat /proc/drbd     # view synchronization progress
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:1897624 nr:0 dw:0 dr:1901216 al:0 bm:115 lo:0 pe:3 ua:3 ap:0 ep:1 wo:f oos:207988
        [=================>..] sync'ed: 90.3% (207988/2103412)K
        finish: 0:00:07 speed: 26,792 (27,076) K/sec

# When synchronization is complete, the output looks like this:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-05-27 04:30:21
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:2103412 nr:0 dw:0 dr:2104084 al:0 bm:129 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: mysql is the resource name.

To view the progress of synchronization, you can also use the following command

# drbd-overview
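The synchronization percentage shown above can also be pulled out of /proc/drbd in a script. A sketch using a sample progress line (hard-coded here for illustration; a live check would read /proc/drbd as in the comment):

```shell
#!/bin/sh
# Extract the sync percentage from a /proc/drbd progress line.
# On a live node: progress=$(grep "sync'ed" /proc/drbd)
progress="        [=================>..] sync'ed: 90.3% (207988/2103412)K"

pct=$(echo "$progress" | sed -n "s/.*sync'ed: *\([0-9.]*\)%.*/\1/p")
echo "synchronized: ${pct}%"   # synchronized: 90.3%
```

Such a check is handy in automation that must wait for the initial full synchronization to finish before proceeding.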

5.7. Create a file system

The file system can be mounted only on the primary (Primary) node, so the DRBD device cannot be formatted until a primary node has been set.

Format the file system

[root@ha-node1 ~]# mkfs.ext4 /dev/drbd1

Mount the file system

[root@ha-node1 ~]# mount /dev/drbd1 /mnt/

View mount

[root@ha-node1 ~]# mount | grep drbd1
/dev/drbd1 on /mnt type ext4 (rw)

Note:

"/ dev/drbd1" defines the defined resource name in the resource

View DRBD status

[root@ha-node1 ~]# drbd-overview
 0:drbd/0 Connected Primary/Secondary UpToDate/UpToDate C r-----

Note:

Primary: the current node is primary; the local node's role is shown first.

Secondary: the peer is the secondary (standby) node.

5.8. Switch between active and standby nodes

First downgrade the current primary node to secondary

[root@ha-node1 ~]# drbdadm secondary mysql

View DRBD status

[root@ha-node1 ~]# drbd-overview
 0:drbd/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----

Upgrade at the HA-NODE2 node

[root@ha-node2 ~]# drbdadm primary mysql

View DRBD status

[root@ha-node2 ~]# drbd-overview
 0:drbd/0 Connected Primary/Secondary UpToDate/UpToDate C r-----

5.9. Mount the device and verify that the file exists

[root@ha-node2 ~]# mount /dev/drbd1 /mnt/
[root@ha-node2 ~]# ls /mnt/
lost+found  test

6. Simulating and repairing DRBD split-brain

Note: we continue from the experiment above; ha-node2 is now the primary node and ha-node1 is the standby node.

6.1. Disconnect the primary (Primary) node

You can power the node off, disconnect its network, or reconfigure its IP; here we disconnect the network.

6.2. View the status of the two nodes

[root@ha-node2 ~]# drbd-overview
 0:drbd/0 WFConnection Primary/Unknown UpToDate/DUnknown C r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@ha-node1 ~]# drbd-overview
 0:drbd/0 StandAlone Secondary/Unknown UpToDate/DUnknown r-----

From the above, you can see that the two nodes can no longer communicate; HA-NODE2 is the primary node, and HA-NODE1 is the standby node.

6.3. Upgrade the HA-NODE1 node to primary and mount the resource

[root@ha-node1 ~]# drbdadm primary mysql
[root@ha-node1 ~]# drbd-overview
 0:drbd/0 StandAlone Primary/Unknown UpToDate/DUnknown r-----
[root@ha-node1 ~]# mount /dev/drbd1 /mnt/
[root@ha-node1 ~]# mount | grep drbd1
/dev/drbd1 on /mnt type ext4 (rw)

6.4. When the original primary node is repaired and comes back online, split-brain occurs

[root@ha-node2 ~]# tail -f /var/log/messages
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: Terminating drbd_a_drbd
Sep 19 01:56:06 ha-node2 kernel: block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
Sep 19 01:56:06 ha-node2 kernel: block drbd1: Split-Brain detected but unresolved, dropping connection!
Sep 19 01:56:06 ha-node2 kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-0
Sep 19 01:56:06 ha-node2 kernel: block drbd1: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: conn( NetworkFailure -> Disconnecting )
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: error receiving ReportState, e: -5 l: 0!
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: Connection closed
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: conn( Disconnecting -> StandAlone )
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: receiver terminated
Sep 19 01:56:06 ha-node2 kernel: d-con drbd: Terminating drbd_r_drbd
Sep 19 01:56:18 ha-node2 kernel: block drbd1: role( Primary -> Secondary )

6.5. Check the status of the two nodes again

[root@ha-node1 ~]# drbdadm role mysql
Primary/Unknown
[root@ha-node2 ~]# drbdadm role mysql
Primary/Unknown

6.6. Check the connection status between HA-NODE1 and HA-NODE2

[root@ha-node1 ~]# drbd-overview
 0:mysql/0 StandAlone Primary/Unknown UpToDate/DUnknown r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@ha-node2 ~]# drbd-overview
 0:mysql/0 WFConnection Primary/Unknown UpToDate/DUnknown C r----- /mnt ext4 2.0G 68M 1.9G 4%

As can be seen above, while a node is in the StandAlone state the primary and standby nodes do not communicate.

6.7. Recovery steps on the HA-NODE1 standby node

[root@ha-node1 ~]# umount /mnt/
[root@ha-node1 ~]# drbdadm disconnect mysql
drbd: Failure: Invalid configuration request
additional info from kernel: unknown connection
Command 'drbdsetup disconnect ipv4:192.168.137.225:7789 ipv4:192.168.137.222:7789' terminated with exit code 10
[root@ha-node1 ~]# drbdadm secondary mysql
[root@ha-node1 ~]# drbd-overview
 0:drbd/0 StandAlone Secondary/Unknown UpToDate/DUnknown r-----
[root@ha-node1 ~]# drbdadm connect --discard-my-data mysql

After performing the above three steps, you will find that the connection is still not established:

[root@ha-node1 ~]# drbd-overview
 0:drbd/0 WFConnection Secondary/Unknown UpToDate/DUnknown C r-----

6.8. The connection must then be re-established from the HA-NODE2 node

[root@ha-node2 ~]# drbdadm connect mysql

View node connection status

[root@ha-node2 ~]# drbd-overview
 0:mysql/0 Connected Primary/Secondary UpToDate/UpToDate C r----- /mnt ext4 2.0G 68M 1.9G 4%
[root@ha-node1 ~]# drbd-overview
 0:mysql/0 Connected Secondary/Primary UpToDate/UpToDate C r-----

Thank you for reading. I hope this article on how to use DRBD in CentOS 7.0 has been helpful.
