Corosync + Pacemaker High-Availability Cluster

2025-04-05 Update From: SLTechnology News&Howtos



Pacemaker, the Cluster Resource Manager (CRM), is the control center that manages the whole HA cluster: clients use pacemaker to configure, manage and monitor the cluster. Pacemaker does not itself provide heartbeat transport; it relies on an underlying messaging layer (heartbeat, which was split out, or corosync) to exchange node status information between nodes.

Pacemaker resources can be managed with the command-line tools crmsh and pcs, or with the graphical tools pygui and hawk.

In practice, corosync is usually chosen for heartbeat detection and combined with pacemaker's resource management to build a high-availability system. The following walks through such a setup.

There are two test nodes, server2 and server3, whose IP addresses are 172.25.80.2 and 172.25.80.3, respectively.

The simulated cluster service is the web service.

The virtual IP address providing the web service is 172.25.80.1.

1. Basic environment settings:

First, prepare each HA host:

1) configure a fixed IP address

2) host name resolution for all nodes must work; it is enough that the /etc/hosts file on both nodes contains:

172.25.80.2 server2 node1

172.25.80.3 server3 node2

Once the above is configured, the nodes can resolve each other's host names.
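That resolution check can be scripted. A minimal sketch (the `check_hosts` helper is hypothetical, assuming the two /etc/hosts entries shown above) that verifies every expected node name appears in a hosts file:

```shell
# check_hosts FILE NAME... : verify each NAME appears as a whole word
# in FILE; print the missing names and return non-zero if any are absent.
check_hosts() {
    local file=$1 name missing=0
    shift
    for name in "$@"; do
        grep -qw "$name" "$file" || { echo "missing: $name"; missing=1; }
    done
    return $missing
}

# Example: check_hosts /etc/hosts server2 server3 node1 node2
```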

3) configure the trust relationship between nodes:

Node 1:

# ssh-keygen -t rsa

# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2

Node 2:

# ssh-keygen -t rsa

# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1

2. Install the corresponding software packages required by the environment:

# yum install -y libibverbs librdmacm lm_sensors libtool-ltdl openhpi-libs openhpi perl-TimeDate

3. Install corosync and pacemaker; here the packages are placed in the /root/corosync directory (on both nodes):

cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
libesmtp-1.0.4-5.el5.i386.rpm
openais-1.1.3-1.6.el5.i386.rpm
openaislib-1.1.3-1.6.el5.i386.rpm
pacemaker-1.0.11-1.2.el5.i386.rpm
pacemaker-libs-1.0.11-1.2.el5.i386.rpm
perl-TimeDate-1.16-5.el5.noarch.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm

Start the installation:

# cd /root/corosync/

# yum -y --nogpgcheck localinstall *.rpm

Here we install locally with yum and skip the GPG signature check.
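If you want to confirm afterwards that every package from the directory actually got installed, you can derive each package name from its RPM file name and query it with rpm -q. A sketch (the `pkg_name` helper is hypothetical, assuming the usual name-version-release.arch.rpm layout):

```shell
# pkg_name FILE.rpm : strip the trailing version-release.arch suffix,
# leaving just the package name
# (e.g. pacemaker-1.0.11-1.2.el5.i386.rpm -> pacemaker).
pkg_name() {
    echo "${1%.rpm}" | sed -E 's/-[0-9][^-]*-[^-]*$//'
}

# Example:
# for f in /root/corosync/*.rpm; do rpm -q "$(pkg_name "${f##*/}")"; done
```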

4. Configure corosync (executed on node 1):

# cd /etc/corosync

# cp corosync.conf.example corosync.conf

corosync.conf.example is the sample configuration; we only need to copy it and edit the copy:

# vim /etc/corosync/corosync.conf

compatibility: whitetank

totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 172.25.0.0
                mcastport: 5405
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log    # where the logs are stored
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}

service {
        ver: 0
        name: pacemaker
}

aisexec {
        user: root
        group: root
}
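Note that bindnetaddr must be the network address (not a host address) of the interface corosync binds to; for nodes at 172.25.80.x with a 255.255.0.0 mask that is 172.25.0.0. A sketch of how that address is derived from an IP and netmask (an illustrative helper, not part of corosync):

```shell
# network_addr IP NETMASK : AND each dotted-quad octet of IP with the
# corresponding NETMASK octet to obtain the network address.
network_addr() (
    IFS=.
    set -- $1 $2    # split both arguments into eight octets
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
)

# Example: network_addr 172.25.80.2 255.255.0.0   -> 172.25.0.0
```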

Generate the authentication key file used for communication between nodes:

# corosync-keygen    # generates the authentication key file (authkey) in the current directory

Then copy the relevant files to node 2:

# scp -p corosync.conf authkey node2:/etc/corosync/

Create the directory for corosync's logs on both nodes:

# mkdir /var/log/cluster

# ssh node2 'mkdir /var/log/cluster'

5. Start corosync (execute on node 1):

# /etc/init.d/corosync start

Starting Corosync Cluster Engine (corosync): [OK]    # corosync has started

Start node 2 (run this on node 1):

# ssh node2 '/etc/init.d/corosync start'

Starting Corosync Cluster Engine (corosync): [OK]    # node 2's corosync has started; check on node 2 that no errors were logged

Then verify the membership with crm status; the output should include:

Online: [server2 server3]    # both cluster nodes are up and running
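When scripting around the cluster, the node list can be pulled out of that status output. A sketch (the `online_nodes` helper is hypothetical and assumes the "Online: [ ... ]" line format shown above, which varies between pacemaker versions):

```shell
# online_nodes : read crm status output on stdin and print the
# space-separated node names from the "Online: [...]" line.
online_nodes() {
    sed -n 's/^Online: \[ *\(.*[^ ]\) *\]/\1/p'
}

# Example: crm status | online_nodes
```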

6. Configure the working properties of the cluster:

Since stonith is enabled by default but no stonith device has been added, the default configuration does not pass verification. To prevent errors later, we can disable stonith for now:

# crm configure property stonith-enabled=false    # a property set this way is committed and takes effect immediately

INFO: building help index

crm and crm_verify are command-line cluster management tools provided by pacemaker since version 1.0; they can be run on any node in the cluster to view and modify the configuration.

7. Add cluster resources to the cluster:

The cluster supports resource agent classes such as heartbeat, lsb, ocf and stonith. The most commonly used classes at present are lsb and ocf, while the stonith class exists specifically for configuring stonith devices.

8. Configure the test page:

# echo "Server2" > /var/www/html/index.html

# chkconfig httpd off

# service httpd stop

# ssh node2 'echo "Server3" > /var/www/html/index.html'

# ssh node2 'chkconfig httpd off'

# ssh node2 'service httpd stop'

Add web resources:

For the web cluster, first create the IP address resource:

# crm configure primitive WebIP ocf:heartbeat:IPaddr params ip=172.25.80.1

Then add the httpd service as a cluster resource. Adding httpd as a cluster resource has two resource agents available: lsb and ocf:heartbeat. For simplicity, the lsb type is used here:

# crm configure primitive WebSite lsb:httpd
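The status output further below shows WebIP and WebSite inside "Resource Group: Web", although the grouping step itself is not shown. Presumably the two primitives were also placed into a group so they always run on the same node and start in order; the command would be something like:

```shell
# Hypothetical step: group the VIP and the web server so they fail over
# together (the group name Web matches the later crm status output).
crm configure group Web WebIP WebSite
```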

You can access the web service by entering http://172.25.80.1 from the host's browser:

9. Take node 1 offline (run this on node 2):

# ssh node1 '/etc/init.d/corosync stop'

# crm status

============
Stack: openais
Current DC: server3 - partition WITHOUT quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [server3]
OFFLINE: [server2]

At this point node 1 is offline, but node 2 has not taken over the resources and the service is unreachable: the cluster status shows "WITHOUT quorum". Without quorum, no node may run resources, so the cluster service cannot operate normally. For a two-node cluster you can tell pacemaker to ignore the loss of quorum:

# crm configure property no-quorum-policy=ignore
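The quorum rule behind this is a simple majority: a partition has quorum only if it holds more than half of the expected votes, which a lone survivor of a two-node cluster never can. A sketch of the calculation (illustrative helper only):

```shell
# quorum_votes N : minimum number of votes needed for quorum among N
# voting nodes (strict majority: floor(N/2) + 1).
quorum_votes() {
    echo $(( $1 / 2 + 1 ))
}

# With 2 expected votes, quorum needs 2; a single surviving node holds
# only 1 vote, hence the "partition WITHOUT quorum" status above.
```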

# crm status

============
Stack: openais
Current DC: server3 - partition WITHOUT quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [server3]
OFFLINE: [server2]

Resource Group: Web
    WebIP (ocf::heartbeat:IPaddr): Started server3
    WebSite (lsb:httpd): Started server3    # node 2 has now taken over the resources
