
Configuring RHCS on Red Hat 6 to Build a Two-Node HA Cluster


We recently tested RHCS on Red Hat 6.5 by building a two-node HA cluster. This article shares the configuration and testing process, including node configuration, cluster management server configuration, cluster creation and configuration, and cluster failover testing.

I. Test environment

Computer name   Operating system   IP address       Cluster IP       Installed software
HAmanager       RedHat 6.5         192.168.10.150   -                luci, iSCSI target (for the quorum disk)
Node1           RedHat 6.5         192.168.10.104   192.168.10.103   High Availability group, httpd
Node2           RedHat 6.5         192.168.10.105   192.168.10.103   High Availability group, httpd

II. Node configuration

1. Configure /etc/hosts on all three machines so they can resolve one another.

[root@HAmanager ~]# cat /etc/hosts
192.168.10.104 node1 node1.localdomain
192.168.10.105 node2 node2.localdomain
192.168.10.150 HAmanager HAmanager.localdomain

[root@node1 ~]# cat /etc/hosts
192.168.10.104 node1 node1.localdomain
192.168.10.105 node2 node2.localdomain
192.168.10.150 HAmanager HAmanager.localdomain

[root@node2 ~]# cat /etc/hosts
192.168.10.104 node1 node1.localdomain
192.168.10.105 node2 node2.localdomain
192.168.10.150 HAmanager HAmanager.localdomain

2. Configure SSH mutual trust on the three computers respectively.

[root@HAmanager ~]# ssh-keygen -t rsa
[root@HAmanager ~]# ssh-copy-id -i node1

[root@node1 ~]# ssh-keygen -t rsa
[root@node1 ~]# ssh-copy-id -i node2

[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i node1
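
Before moving on, it is worth confirming that key-based login works; a minimal check might look like this (no password prompt should appear):

[root@node1 ~]# ssh node2 hostname
[root@node2 ~]# ssh node1 hostname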

3. Stop and disable the NetworkManager and acpid services on both nodes

[root@node1 ~]# service NetworkManager stop
[root@node1 ~]# chkconfig NetworkManager off
[root@node1 ~]# service acpid stop
[root@node1 ~]# chkconfig acpid off

[root@node2 ~]# service NetworkManager stop
[root@node2 ~]# chkconfig NetworkManager off
[root@node2 ~]# service acpid stop
[root@node2 ~]# chkconfig acpid off

4. Configure a local yum repository on both nodes

[root@node1 ~]# cat /etc/yum.repos.d/rhel6.5.repo
[Server]
name=base
baseurl=file:///mnt/
enabled=1
gpgcheck=0

[HighAvailability]
name=base
baseurl=file:///mnt/HighAvailability
enabled=1
gpgcheck=0

[root@node2 ~]# cat /etc/yum.repos.d/rhel6.5.repo
[Server]
name=base
baseurl=file:///mnt/
enabled=1
gpgcheck=0

[HighAvailability]
name=base
baseurl=file:///mnt/HighAvailability
enabled=1
gpgcheck=0
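
These baseurl values assume the RHEL 6.5 installation DVD (or ISO) is mounted at /mnt on each node; a hedged example of preparing that (the device/ISO path is an assumption):

[root@node1 ~]# mount /dev/cdrom /mnt          # or: mount -o loop /path/to/rhel-server-6.5-x86_64-dvd.iso /mnt
[root@node1 ~]# yum clean all && yum repolist  # confirm the Server and HighAvailability repos are visible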

5. Install the High Availability package group on both nodes

[root@node1 ~]# yum groupinstall 'High Availability' -y

Installed:
  ccs.x86_64 0:0.16.2-69.el6                 cman.x86_64 0:3.0.12.1-59.el6
  omping.x86_64 0:0.0.4-1.el6                rgmanager.x86_64 0:3.0.12.1-19.el6

Dependency Installed:
  cifs-utils.x86_64 0:4.8.1-19.el6           clusterlib.x86_64 0:3.0.12.1-59.el6
  corosync.x86_64 0:1.4.1-17.el6             corosynclib.x86_64 0:1.4.1-17.el6
  cyrus-sasl-md5.x86_64 0:2.1.23-13.el6_3.1  fence-agents.x86_64 0:3.1.5-35.el6
  fence-virt.x86_64 0:0.2.3-15.el6           gnutls-utils.x86_64 0:2.8.5-10.el6_4.2
  ipmitool.x86_64 0:1.8.11-16.el6            keyutils.x86_64 0:1.4-4.el6
  libevent.x86_64 0:1.4.13-4.el6             libgssglue.x86_64 0:0.1-11.el6
  libibverbs.x86_64 0:1.1.7-1.el6            librdmacm.x86_64 0:1.0.17-1.el6
  libtirpc.x86_64 0:0.2.1-6.el6_4            libvirt-client.x86_64 0:0.10.2-29.el6
  lm_sensors-libs.x86_64 0:3.1.1-17.el6      modcluster.x86_64 0:0.16.2-28.el6
  nc.x86_64 0:1.84-22.el6                    net-snmp-libs.x86_64 1:5.5-49.el6
  net-snmp-utils.x86_64 1:5.5-49.el6         nfs-utils.x86_64 1:1.2.3-39.el6
  nfs-utils-lib.x86_64 0:1.1.5-6.el6         numactl.x86_64 0:2.0.7-8.el6
  oddjob.x86_64 0:0.30-5.el6                 openais.x86_64 0:1.1.1-7.el6
  openaislib.x86_64 0:1.1.1-7.el6            perl-Net-Telnet.noarch 0:3.03-11.el6
  pexpect.noarch 0:2.3-6.el6                 python-suds.noarch 0:0.4.1-3.el6
  quota.x86_64 1:3.17-20.el6                 resource-agents.x86_64 0:3.9.2-40.el6
  ricci.x86_64 0:0.16.2-69.el6               rpcbind.x86_64 0:0.2.0-11.el6
  sg3_utils.x86_64 0:1.28-5.el6              tcp_wrappers.x86_64 0:7.6-57.el6
  telnet.x86_64 1:0.17-47.el6_3.1            yajl.x86_64 0:1.0.7-3.el6

Complete!

[root@node2 ~]# yum groupinstall 'High Availability' -y

Installed:
  ccs.x86_64 0:0.16.2-69.el6                 cman.x86_64 0:3.0.12.1-59.el6
  omping.x86_64 0:0.0.4-1.el6                rgmanager.x86_64 0:3.0.12.1-19.el6

Dependency Installed:
  (same dependency set as on node1)

Complete!

6. Start the ricci service on both nodes and enable the cluster services at boot

[root@node1 ~]# service ricci start
[root@node1 ~]# chkconfig ricci on
[root@node1 ~]# chkconfig cman on
[root@node1 ~]# chkconfig rgmanager on

[root@node2 ~]# service ricci start
[root@node2 ~]# chkconfig ricci on
[root@node2 ~]# chkconfig cman on
[root@node2 ~]# chkconfig rgmanager on

7. Configure the ricci password on both nodes

[root@node1 ~]# passwd ricci
New password:
BAD PASSWORD: it is too short
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.

[root@node2 ~]# passwd ricci
New password:
BAD PASSWORD: it is too short
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.

8. Install the httpd service on both nodes to facilitate testing the high availability of the application later.

[root@node1 ~]# yum -y install httpd
[root@node1 ~]# echo "This is Node1" > /var/www/html/index.html

[root@node2 ~]# yum -y install httpd
[root@node2 ~]# echo "This is Node2" > /var/www/html/index.html

III. Cluster management server configuration

1. Install the luci software package on the cluster management server

[root@HAmanager ~]# yum -y install luci

Installed:
  luci.x86_64 0:0.26.0-48.el6

Dependency Installed:
  TurboGears2.noarch 0:2.0.3-4.el6
  python-babel.noarch 0:0.9.4-5.1.el6
  python-beaker.noarch 0:1.3.1-7.el6
  python-cheetah.x86_64 0:2.4.1-1.el6
  python-decorator.noarch 0:3.0.1-3.1.el6
  python-decoratortools.noarch 0:1.7-4.1.el6
  python-formencode.noarch 0:1.2.2-2.1.el6
  python-genshi.x86_64 0:0.5.1-7.1.el6
  python-mako.noarch 0:0.3.4-1.el6
  python-markdown.noarch 0:2.0.1-3.1.el6
  python-markupsafe.x86_64 0:0.9.2-4.el6
  python-myghty.noarch 0:1.1-11.el6
  python-nose.noarch 0:0.10.4-3.1.el6
  python-paste.noarch 0:1.7.4-2.el6
  python-paste-deploy.noarch 0:1.3.3-2.1.el6
  python-paste-script.noarch 0:1.7.3-5.el6_3
  python-peak-rules.noarch 0:0.5a1.dev-9.2582.1.el6
  python-peak-util-addons.noarch 0:0.6-4.1.el6
  python-peak-util-assembler.noarch 0:0.5.1-1.el6
  python-peak-util-extremes.noarch 0:1.1-4.1.el6
  python-peak-util-symbols.noarch 0:1.0-4.1.el6
  python-prioritized-methods.noarch 0:0.2.1-5.1.el6
  python-pygments.noarch 0:1.1.1-1.el6
  python-pylons.noarch 0:0.9.7-2.el6
  python-repoze-tm2.noarch 0:1.0-0.5.a4.el6
  python-repoze-what.noarch 0:1.0.8-6.el6
  python-repoze-what-pylons.noarch 0:1.0-4.el6
  python-repoze-who.noarch 0:1.0.18-1.el6
  python-repoze-who-friendlyform.noarch 0:1.0-0.3.b3.el6
  python-repoze-who-testutil.noarch 0:1.0-0.4.rc1.el6
  python-routes.noarch 0:1.10.3-2.el6
  python-setuptools.noarch 0:0.6.10-3.el6
  python-sqlalchemy.noarch 0:0.5.5-3.el6_2
  python-tempita.noarch 0:0.4-2.el6
  python-toscawidgets.noarch 0:0.9.8-1.el6
  python-transaction.noarch 0:1.0.1-1.el6
  python-turbojson.noarch 0:1.2.1-8.1.el6
  python-weberror.noarch 0:0.10.2-2.el6
  python-webflash.noarch 0:0.1-0.2.a9.el6
  python-webhelpers.noarch 0:0.6.4-4.el6
  python-webob.noarch 0:0.9.6.1-3.el6
  python-webtest.noarch 0:1.2-2.el6
  python-zope-filesystem.x86_64 0:1-5.el6
  python-zope-interface.x86_64 0:3.5.2-2.1.el6
  python-zope-sqlalchemy.noarch 0:0.4-3.el6

Complete!

[root@HAmanager ~]#

2. Start the luci service

[root@HAmanager ~]# service luci start
Adding following auto-detected host IDs (IP addresses/domain names), corresponding to `HAmanager.localdomain' address, to the configuration of self-managed certificate `/var/lib/luci/etc/cacert.config' (you can change them by editing `/var/lib/luci/etc/cacert.config', removing the generated certificate `/var/lib/luci/certs/host.pem' and restarting luci):
(none suitable found, you can still do it manually as mentioned above)
Generating a 2048 bit RSA private key
writing new private key to '/var/lib/luci/certs/host.pem'
Starting saslauthd:                                        [  OK  ]
Start luci...                                              [  OK  ]
Point your web browser to https://HAmanager.localdomain:8084 (or equivalent) to access luci

[root@HAmanager ~]# chkconfig luci on

Jiang Jianlong's technology blog http://jiangjianlong.blog.51cto.com/3735273/1931499

IV. Cluster creation and configuration

1. Use a browser to access the luci web management interface at https://192.168.10.150:8084

2. Create a cluster and add nodes to the cluster
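
We do this through the luci web UI; as an aside, the same can be done from the command line with ccs, roughly as in the sketch below (the cluster name comes from the clustat output later; the password placeholder stands for the ricci password set in step 7 above):

[root@node1 ~]# ccs -h node1 -p <ricci_password> --createcluster TestCluster2
[root@node1 ~]# ccs -h node1 --addnode node1.localdomain
[root@node1 ~]# ccs -h node1 --addnode node2.localdomain
[root@node1 ~]# ccs -h node1 --sync --activate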

3. Add vCenter as fence device

4. Find the virtual machine UUID of the node

[root@node1 ~]# fence_vmware_soap -a 192.168.10.91 -z -l administrator@vsphere.local -p P@ssw0rd -o list
node1,564df192-7755-9cd6-8a8b-45d6d74eabbb
node2,564df4ed-cda1-6383-bbf5-f99807416184

5. Add fence method and instance to two nodes
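
These fence settings are configured in the luci UI; the resulting entries in /etc/cluster/cluster.conf end up roughly like the following sketch (the exact layout and attributes such as ssl="on" are assumptions; the device name, agent, vCenter address, credentials, and UUIDs come from the steps above and the ccs_tool output later):

<clusternodes>
  <clusternode name="node1.localdomain" nodeid="1">
    <fence>
      <method name="1">
        <device name="vcenter_fence" uuid="564df192-7755-9cd6-8a8b-45d6d74eabbb" ssl="on"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2.localdomain" nodeid="2">
    <fence>
      <method name="1">
        <device name="vcenter_fence" uuid="564df4ed-cda1-6383-bbf5-f99807416184" ssl="on"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_vmware_soap" name="vcenter_fence" ipaddr="192.168.10.91" login="administrator@vsphere.local" passwd="P@ssw0rd" ssl="on"/>
</fencedevices>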

6. Check the status of fence devices

[root@node1 ~]# fence_vmware_soap -a 192.168.10.91 -z -l administrator@vsphere.local -p P@ssw0rd -o status
Status: ON

7. Test the fence devices

[root@node2 ~]# fence_check
fence_check run at Tue May 23 09:41:30 CST 2017 pid: 3455
Testing node1.localdomain method 1: success
Testing node2.localdomain method 1: success

8. Create a failover domain

9. Add the cluster resources: an IP address resource and a script resource

10. Create a cluster service group and add existing resources
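
Steps 8-10 are all performed in the luci UI. For reference, the resource-manager section of /etc/cluster/cluster.conf ends up roughly like the sketch below (the failover domain name and the ordered/restricted/recovery attributes are illustrative assumptions; the IP address, script path, and service name TestServGrp come from the environment table and the logs later in this article):

<rm>
  <failoverdomains>
    <failoverdomain name="TestFailDomain" ordered="0" restricted="0">
      <failoverdomainnode name="node1.localdomain"/>
      <failoverdomainnode name="node2.localdomain"/>
    </failoverdomain>
  </failoverdomains>
  <resources>
    <ip address="192.168.10.103" monitor_link="on"/>
    <script file="/etc/init.d/httpd" name="httpd"/>
  </resources>
  <service domain="TestFailDomain" name="TestServGrp" recovery="relocate">
    <ip ref="192.168.10.103"/>
    <script ref="httpd"/>
  </service>
</rm>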

11. Configure the quorum disk: install the iSCSI target service on the HAmanager server and create a 100 MB shared disk for both nodes

[root@HAmanager ~]# yum install scsi-target-utils -y
[root@HAmanager ~]# dd if=/dev/zero of=/iSCSIdisk/100m.img bs=1M seek=100 count=0
[root@HAmanager ~]# vi /etc/tgt/targets.conf

<target iqn.2016-08.disk.rh7:disk100m>
    backing-store /iSCSIdisk/100m.img
    initiator-address 192.168.10.104    # for node1
    initiator-address 192.168.10.105    # for node2
</target>

[root@HAmanager ~]# service tgtd start
[root@HAmanager ~]# chkconfig tgtd on
[root@HAmanager ~]# tgt-admin --show

Target 1: iqn.2016-08.disk.rh7:disk100m
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET 00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET 00010001
            SCSI SN: beaf11
            Size: 105 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /sharedisk/100m.img
            Backing store flags:
    Account information:
    ACL information:
        192.168.10.104
        192.168.10.105

[root@HAmanager ~]#

12. Both nodes install iscsi-initiator-utils and log in to the iscsi target

[root@node1 ~]# yum install iscsi-initiator-utils
[root@node1 ~]# chkconfig iscsid on
[root@node1 ~]# iscsiadm -m discovery -t sendtargets -p 192.168.10.150
[root@node1 ~]# iscsiadm -m node
[root@node1 ~]# iscsiadm -m node -T iqn.2016-08.disk.rh7:disk100m --login

[root@node2 ~]# yum install iscsi-initiator-utils
[root@node2 ~]# chkconfig iscsid on
[root@node2 ~]# iscsiadm -m discovery -t sendtargets -p 192.168.10.150
[root@node2 ~]# iscsiadm -m node
[root@node2 ~]# iscsiadm -m node -T iqn.2016-08.disk.rh7:disk100m --login
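
A quick way to confirm the LUN is visible on each node before partitioning (the device should appear as /dev/sdb, as used in the next step):

[root@node1 ~]# iscsiadm -m session
[root@node1 ~]# fdisk -l /dev/sdb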

13. On node1, create a partition (sdb1) on the shared disk /dev/sdb

[root@node1 ~]# fdisk /dev/sdb

In the interactive fdisk session, create a single primary partition spanning the disk (n, p, 1, accept the defaults, then w to write), which produces /dev/sdb1.

[root@node1 ~]# partprobe /dev/sdb1

14. On node1, initialize sdb1 as the quorum disk

[root@node1 ~]# mkqdisk -c /dev/sdb1 -l testqdisk
mkqdisk v3.0.12.1
Writing new quorum disk label 'testqdisk' to /dev/sdb1.
WARNING: About to destroy all data on /dev/sdb1; proceed [N/y]? y

Initializing status block for node 1...
Initializing status block for node 2...
Initializing status block for node 3...
Initializing status block for node 4...
Initializing status block for node 5...
Initializing status block for node 6...
Initializing status block for node 7...
Initializing status block for node 8...
Initializing status block for node 9...
Initializing status block for node 10...
Initializing status block for node 11...
Initializing status block for node 12...
Initializing status block for node 13...
Initializing status block for node 14...
Initializing status block for node 15...
Initializing status block for node 16...

[root@node1 ~]#

[root@node1 ~]# mkqdisk -L
mkqdisk v3.0.12.1

/dev/block/8:17:
/dev/disk/by-id/scsi-1IET_00010001-part1:
/dev/disk/by-path/ip-192.168.10.150:3260-iscsi-iqn.2016-08.disk.rh7:disk100m-lun-1-part1:
/dev/sdb1:
        Magic:                eb7a62c2
        Label:                testqdisk
        Created:              Mon May 22 22:52:01 2017
        Host:                 node1.localdomain
        Kernel Sector Size:   512
        Recorded Sector Size: 512

[root@node1 ~]#

15. Check the quorum disk from node2; it is recognized correctly

[root@node2 ~]# partprobe /dev/sdb1
[root@node2 ~]# mkqdisk -L
mkqdisk v3.0.12.1

/dev/block/8:17:
/dev/disk/by-id/scsi-1IET_00010001-part1:
/dev/disk/by-path/ip-192.168.10.150:3260-iscsi-iqn.2016-08.disk.rh7:disk100m-lun-1-part1:
/dev/sdb1:
        Magic:                eb7a62c2
        Label:                testqdisk
        Created:              Mon May 22 22:52:01 2017
        Host:                 node1.localdomain
        Kernel Sector Size:   512
        Recorded Sector Size: 512

16. Configure the cluster to use the quorum disk
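
This is also done in the luci UI. The resulting cluster.conf gains a quorumd stanza along the lines of the following sketch (the interval, tko, and votes values and the ping heuristic, including the gateway address, are illustrative assumptions; only the label testqdisk is taken from the steps above):

<quorumd interval="1" label="testqdisk" tko="10" votes="1">
    <heuristic program="ping -c1 -w1 192.168.10.1" interval="2" score="1" tko="3"/>
</quorumd>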

17. Restart the cluster so the quorum disk takes effect

[root@node1 ~]# ccs -h node1 --stopall
node1 password:
Stopped node2.localdomain
Stopped node1.localdomain

[root@node1 ~]# ccs -h node1 --startall
Started node2.localdomain
Started node1.localdomain

[root@node1 ~]#

18. View cluster status

[root@node1 ~]# clustat
Cluster Status for TestCluster2 @ Mon May 22 23:48:27 2017
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 node1.localdomain                       1 Online, Local, rgmanager
 node2.localdomain                       2 Online, rgmanager
 /dev/block/8:17                         0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:TestServGrp         node1.localdomain           started

[root@node1 ~]#

19. View the status of cluster nodes

[root@node1 ~]# ccs_tool lsnode

Cluster name: icpl_cluster, config_version: 21

Nodename                        Votes Nodeid Fencetype
node1.localdomain                  1    1    vcenter_fence
node2.localdomain                  1    2    vcenter_fence

20. View the synchronization status of cluster nodes

[root@node1 ~]# ccs -h node1 --checkconf
All nodes in sync.

21. Use cluster IP to access web services
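
From any machine on the network, the service can be reached through the floating cluster IP; at this point node1 owns the service, so it returns node1's test page:

# curl http://192.168.10.103/
This is Node1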


V. Cluster failover testing

1. Shut down the active node; automatic failover works as expected.

[root@node1 ~]# poweroff
[root@node1 ~]# tail -f /var/log/messages
May 23 10:29:26 node1 modclusterd: shutdown succeeded
May 23 10:29:26 node1 rgmanager[2125]: Shutting down
May 23 10:29:26 node1 rgmanager[2125]: Shutting down
May 23 10:29:26 node1 rgmanager[2125]: Stopping service service:TestServGrp
May 23 10:29:27 node1 rgmanager[2125]: [ip] Removing IPv4 address 192.168.10.103/24 from eth0
May 23 10:29:36 node1 rgmanager[2125]: Service service:TestServGrp is stopped
May 23 10:29:36 node1 rgmanager[2125]: Disconnecting from CMAN
May 23 10:29:52 node1 rgmanager[2125]: Exiting
May 23 10:29:53 node1 ricci: shutdown succeeded
May 23 10:29:54 node1 oddjobd: oddjobd shutdown succeeded
May 23 10:29:54 node1 saslauthd[2315]: server_exit: master exited: 2315

[root@node2 ~]# tail -f /var/log/messages
May 23 10:29:45 node2 rgmanager[2130]: Member 1 shutting down
May 23 10:29:45 node2 rgmanager[2130]: Starting stopped service service:TestServGrp
May 23 10:29:45 node2 rgmanager[5688]: [ip] Adding IPv4 address 192.168.10.103/24 to eth0
May 23 10:29:49 node2 rgmanager[2130]: Service service:TestServGrp started
May 23 10:30:06 node2 qdiskd[1480]: Node 1 shutdown
May 23 10:30:06 node2 corosync[1437]: [QUORUM] Members[1]: 2
May 23 10:30:06 node2 corosync[1437]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 23 10:30:06 node2 corosync[1437]: [CPG   ] chosen downlist: sender r(0) ip(192.168.10.105) ; members(old:2 left:1)
May 23 10:30:06 node2 corosync[1437]: [MAIN  ] Completed service synchronization, ready to provide service.
May 23 10:30:06 node2 kernel: dlm: closing connection to node 1
May 23 10:30:06 node2 qdiskd[1480]: Assuming master role

[root@node2 ~]# clustat
Cluster Status for TestCluster2 @ Mon May 22 23:48:27 2017
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 node1.localdomain                       1 Online, Local, rgmanager
 node2.localdomain                       2 Online, rgmanager
 /dev/block/8:17                         0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:TestServGrp         node2.localdomain           started

[root@node2 ~]#

2. Stop the application service on the active node; automatic failover works as expected.

[root@node2 ~]# /etc/init.d/httpd stop
[root@node2 ~]# tail -f /var/log/messages
May 23 11:14:02 node2 rgmanager[11264]: [script] Executing /etc/init.d/httpd status
May 23 11:14:02 node2 rgmanager[11289]: [script] script:icpl: status of /etc/init.d/httpd failed (returned 3)
May 23 11:14:02 node2 rgmanager[2127]: status on script "httpd" returned 1 (generic error)
May 23 11:14:02 node2 rgmanager[2127]: Stopping service service:TestServGrp
May 23 11:14:03 node2 rgmanager[11320]: [script] Executing /etc/init.d/httpd stop
May 23 11:14:03 node2 rgmanager[11384]: [ip] Removing IPv4 address 192.168.10.103/24 from eth0
May 23 11:14:08 node2 ricci[11416]: Executing '/usr/bin/virsh nodeinfo'
May 23 11:14:08 node2 ricci[11418]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/2116732044'
May 23 11:14:09 node2 ricci[11422]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/1193918332'
May 23 11:14:13 node2 rgmanager[2127]: Service service:TestServGrp is recovering
May 23 11:14:17 node2 rgmanager[2127]: Service service:TestServGrp is now running on member 1

[root@node1 ~]# tail -f /var/log/messages
May 23 11:14:20 node1 rgmanager[2130]: Recovering failed service service:TestServGrp
May 23 11:14:20 node1 rgmanager[13006]: [ip] Adding IPv4 address 192.168.10.103/24 to eth0
May 23 11:14:24 node1 rgmanager[13092]: [script] Executing /etc/init.d/httpd start
May 23 11:14:24 node1 rgmanager[2130]: Service service:TestServGrp started
May 23 11:14:58 node1 rgmanager[13280]: [script] Executing /etc/init.d/httpd status

[root@node1 ~]# clustat
Cluster Status for TestCluster2 @ Mon May 22 23:48:27 2017
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 node1.localdomain                       1 Online, Local, rgmanager
 node2.localdomain                       2 Online, rgmanager
 /dev/block/8:17                         0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:TestServGrp         node1.localdomain           started

[root@node1 ~]#

3. Stop the network service on the active node; the node is fenced and automatic failover works as expected.

[root@node1 ~]# service network stop

[root@node2 ~]# tail -f /var/log/messages
May 23 22:11:16 node2 qdiskd[1480]: Assuming master role
May 23 22:11:17 node2 qdiskd[1480]: Writing eviction notice for node 1
May 23 22:11:17 node2 corosync[1437]: [TOTEM ] A processor failed, forming new configuration.
May 23 22:11:18 node2 qdiskd[1480]: Node 1 evicted
May 23 22:11:19 node2 corosync[1437]: [QUORUM] Members[1]: 2
May 23 22:11:19 node2 corosync[1437]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 23 22:11:19 node2 corosync[1437]: [CPG   ] chosen downlist: sender r(0) ip(192.168.10.105) ; members(old:2 left:1)
May 23 22:11:19 node2 corosync[1437]: [MAIN  ] Completed service synchronization, ready to provide service.
May 23 22:11:19 node2 kernel: dlm: closing connection to node 1
May 23 22:11:19 node2 rgmanager[2131]: State change: node1.localdomain DOWN
May 23 22:11:19 node2 fenced[1652]: fencing node1.localdomain
May 23 22:11:58 node2 fenced[1652]: fence node1.localdomain success
May 23 22:11:59 node2 rgmanager[2131]: Taking over service service:TestServGrp from down member node1.localdomain
May 23 22:11:59 node2 rgmanager[6145]: [ip] Adding IPv4 address 192.168.10.103/24 to eth0
May 23 22:12:03 node2 rgmanager[6234]: [script] Executing /etc/init.d/httpd start
May 23 22:12:03 node2 rgmanager[2131]: Service service:TestServGrp started
May 23 22:12:35 node2 corosync[1437]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 23 22:12:35 node2 corosync[1437]: [QUORUM] Members[2]: 1 2
May 23 22:12:35 node2 corosync[1437]: [QUORUM] Members[2]: 1 2
May 23 22:12:35 node2 corosync[1437]: [CPG   ] chosen downlist: sender r(0) ip(192.168.10.105) ; members(old:1 left:0)
May 23 22:12:35 node2 corosync[1437]: [MAIN  ] Completed service synchronization, ready to provide service.
May 23 22:12:41 node2 rgmanager[6425]: [script] Executing /etc/init.d/httpd status
May 23 22:12:43 node2 qdiskd[1480]: Node 1 shutdown
May 23 22:12:55 node2 kernel: dlm: got connection from 1
May 23 22:13:08 node2 rgmanager[2131]: State change: node1.localdomain UP

[root@node2 ~]# clustat
Cluster Status for TestCluster2 @ Mon May 22 23:48:27 2017
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 node1.localdomain                       1 Online, Local, rgmanager
 node2.localdomain                       2 Online, rgmanager
 /dev/block/8:17                         0 Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:TestServGrp         node2.localdomain           started

[root@node2 ~]#

Appendix: RHCS terminology

1. Cluster Manager (CMAN)

Manages cluster membership and tracks the running state of the member nodes.

2. Distributed Lock Manager (DLM)

Each node runs a DLM daemon; when a node operates on a piece of metadata in memory, it informs the other nodes, which may then only read that metadata.

3. Cluster Configuration System (CCS)

Manages and synchronizes the cluster configuration file. Each node runs the CCS daemon; as soon as a change to /etc/cluster/cluster.conf is detected, it is propagated to the other nodes.

4. Fence device

How it works: when a host fails, the surviving server invokes the fence device to restart the failed host. Once the fence operation succeeds, the result is reported back to the standby node, which then takes over the failed host's services and resources.

5. Conga cluster management software

Conga consists of two parts: luci and ricci. luci runs on the cluster management server (it can also be installed on a node), while ricci runs on every cluster node. The two services communicate to manage and configure the cluster, and the Conga web interface is used to administer the RHCS cluster.

6. High-availability service management (rgmanager)

Monitors the services on each node and provides failover: when a service fails on one node, it is relocated to a healthy node.
