Oracle 11.2.0.4 disable HAIP

Updated 2025-04-12, from SLTechnology News&Howtos (Database).

Shulou (Shulou.com), 06/01 report.

Symptom:

After a node goes down, it cannot restart on its own. The heartbeat (private interconnect) network interface has to be bounced several times before the node comes back up. The preliminary diagnosis is that HAIP fails for an unknown reason, so the node cannot start CRS.

1. Check the network

[grid@gmdb1 trace]$ oifcfg iflist -p -n
bond0 22.1.32.0 UNKNOWN 255.255.254.0
bond1 1.255.255.0 UNKNOWN 255.255.255.0
bond1 169.254.0.0 UNKNOWN 255.255.0.0
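HAIP addresses come from the 169.254.0.0/16 link-local range, so the third line above shows HAIP active on bond1 on this node. As a rough sketch (the function names are hypothetical helpers, not part of any Oracle tool), output in this four-column form could be scanned for that subnet like this:

```python
import ipaddress

# HAIP uses addresses from the IPv4 link-local range.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def parse_iflist(text):
    """Split `oifcfg iflist -p -n` style output into 4-field tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4:  # interface, subnet, type, netmask
            rows.append(tuple(parts))
    return rows


def haip_interfaces(rows):
    """Return interfaces whose subnet lies inside 169.254.0.0/16."""
    found = []
    for iface, subnet, _type, netmask in rows:
        net = ipaddress.ip_network(f"{subnet}/{netmask}", strict=False)
        if net.subnet_of(LINK_LOCAL):
            found.append(iface)
    return found


# Sample copied from the output in this article.
sample = """\
bond0 22.1.32.0 UNKNOWN 255.255.254.0
bond1 1.255.255.0 UNKNOWN 255.255.255.0
bond1 169.254.0.0 UNKNOWN 255.255.0.0
"""

print(haip_interfaces(parse_iflist(sample)))  # ['bond1']
```

On a node where HAIP is down, the same check would be expected to return an empty list.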

2. Check CRS

[root@gmdb2 tmp]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
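Only OHASD is online here; CRS, CSS and EVM cannot be reached. When the same check is run across several nodes, a small parser can summarize which daemons are unreachable. The sketch below assumes only the message codes shown above; the short labels and the function name are this example's own.

```python
import re

# Codes taken from the output above; labels are this sketch's wording.
UNREACHABLE_CODES = {
    "CRS-4535": "Cluster Ready Services",
    "CRS-4530": "Cluster Synchronization Services",
    "CRS-4534": "Event Manager",
}


def unreachable_daemons(output):
    """List the daemons reported as unreachable by `crsctl check crs`."""
    codes = re.findall(r"^(CRS-\d+):", output, flags=re.MULTILINE)
    return [UNREACHABLE_CODES[c] for c in codes if c in UNREACHABLE_CODES]


sample = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
"""

for name in unreachable_daemons(sample):
    print("unreachable:", name)
```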

3. Check the init resources: ASM and HAIP cannot be started

[root@gmdb2 tmp]# crsctl stat res -t -init
NAME           TARGET  STATE        SERVER        STATE_DETAILS
Cluster Resources
ora.asm
      1        ONLINE  OFFLINE
ora.cluster_interconnect.haip
      1        ONLINE  OFFLINE

4. Check multicast with mcasttest.pl; no problem is found:

[grid@gmdb2 mcasttest]$ perl mcasttest.pl -n gmdb2,gmdb1 -i bond0,bond1
# Setup for node gmdb2 #
Checking node access 'gmdb2'
Checking node login 'gmdb2'
Checking/Creating Directory /tmp/mcasttest for binary on node 'gmdb2'
Distributing mcast2 binary to node 'gmdb2'
# Setup for node gmdb1 #
Checking node access 'gmdb1'
Checking node login 'gmdb1'
Checking/Creating Directory /tmp/mcasttest for binary on node 'gmdb1'
Distributing mcast2 binary to node 'gmdb1'
# testing Multicast on all nodes #
Test for Multicast address 230.0.1.0
Nov 28 16:42:02 | Multicast Succeeded for bond0 using address 230.0.1.0:42000
Nov 28 16:42:03 | Multicast Succeeded for bond1 using address 230.0.1.0:42001
Test for Multicast address 224.0.0.251
Nov 28 16:42:04 | Multicast Succeeded for bond0 using address 224.0.0.251:42002
Nov 28 16:42:05 | Multicast Succeeded for bond1 using address 224.0.0.251:42003
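To confirm at a glance that every interface passed for every multicast address, the result lines can be tallied. This is only a sketch: it assumes result lines of the form "Multicast Succeeded for <interface> using address <ip>" and that a failing test prints "Failed" in place of "Succeeded", which is an assumption not shown in this article.

```python
import re

# "Failed" is an assumed counterpart to the "Succeeded" lines shown above.
RESULT = re.compile(
    r"Multicast (Succeeded|Failed) for (\S+) using address (\d+\.\d+\.\d+\.\d+)"
)


def multicast_results(output):
    """Map (interface, multicast address) to True when the test succeeded."""
    results = {}
    for status, iface, address in RESULT.findall(output):
        results[(iface, address)] = status == "Succeeded"
    return results


sample = """\
Multicast Succeeded for bond0 using address 230.0.1.0:42000
Multicast Succeeded for bond1 using address 230.0.1.0:42001
Multicast Succeeded for bond0 using address 224.0.0.251:42002
Multicast Succeeded for bond1 using address 224.0.0.251:42003
"""

results = multicast_results(sample)
print("all passed:", all(results.values()))
```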

5. Check the CSSD log

2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: begin on node (2), waittime 193000
2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: set curtime (1040905644) for my node
2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: scanning 32 nodes
2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: Node gmdb1, number 1, is in an existing cluster with disk state 3
2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
2017-11-28 11:48:02.808: [CSSD] [2358462208] clssnmvDHBValidateNcopy: node 1, gmdb1, has a disk HB, but no network HB, DHB has rcfg 405549564, wrtcnt, 39931581, LATS 1040905654, lastSeqNo 39931578, uniqueness 1510056501, timestamp 1511840882/1783220964
2017-11-28 11:48:03.287: [CSSD] [2144298752] clssgmWaitOnEventValue: after CmInfo Stateval 3, eval 1 waited 0
2017-11-28 11:48:03.782: [CSSD] [2363209472] clssnmvDHBValidateNcopy: node 1, gmdb1, has a disk HB, but no network HB, DHB has rcfg 405549564, wrtcnt, 39931583, LATS 1040906624

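The log contains a large number of "has a disk HB, but no network HB" records: node 2 can see node 1's disk heartbeat on the voting disk but receives no network heartbeat from it over the interconnect.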
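To quantify how often the message appears and for which peer node, the log can be scanned with a short script. This is a minimal sketch that matches only the message text shown above; the function name is hypothetical, and in practice the lines would be read from the CSSD log file rather than from an inline sample.

```python
import re
from collections import Counter

# Matches the clssnmvDHBValidateNcopy message shown in the log excerpt.
NO_NETWORK_HB = re.compile(
    r"clssnmvDHBValidateNcopy: node (\d+), ([^,\s]+), "
    r"has a disk HB, but no network HB"
)


def count_missing_network_hb(lines):
    """Count 'disk HB but no network HB' records per (node number, node name)."""
    counts = Counter()
    for line in lines:
        match = NO_NETWORK_HB.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


sample = [
    "2017-11-28 11:48:02.797: [CSSD] [2139567872] clssnmLocalJoinEvent: scanning 32 nodes",
    "2017-11-28 11:48:02.808: [CSSD] [2358462208] clssnmvDHBValidateNcopy: node 1, gmdb1, "
    "has a disk HB, but no network HB, DHB has rcfg 405549564",
    "2017-11-28 11:48:03.782: [CSSD] [2363209472] clssnmvDHBValidateNcopy: node 1, gmdb1, "
    "has a disk HB, but no network HB, DHB has rcfg 405549564",
]

for (number, name), hits in count_missing_network_hb(sample).items():
    print(f"node {number} ({name}): {hits} records without network heartbeat")
```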

Check the interconnect in use from a running database instance:

SQL> select * from v$cluster_interconnects;
NAME    IP_ADDRESS       IS_  SOURCE
eth2:1  169.254.134.65   NO

HAIP is running on the other node, but the local HAIP cannot be started, and as a result CSSD cannot start on this node. Check the HAIP resource profile and its dependency on CSSD:

[root@12crac2]# crsctl stat res ora.cluster_interconnect.haip -init -f

NAME=ora.cluster_interconnect.haip

TYPE=ora.haip.type

STATE=OFFLINE

TARGET=ONLINE

ACL=owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:r-x

ACTION_FAILURE_TEMPLATE=

ACTION_SCRIPT=

ACTIVE_PLACEMENT=0

AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%

AUTO_START=always

CARDINALITY=1

CARDINALITY_ID=0

CHECK_INTERVAL=30

CREATION_SEED=15

DEFAULT_TEMPLATE=

DEGREE=1

DESCRIPTION="Resource type for a Highly Available network IP"

ENABLED=0

FAILOVER_DELAY=0

FAILURE_INTERVAL=0

FAILURE_THRESHOLD=0

HOSTING_MEMBERS=

ID=ora.cluster_interconnect.haip

LOAD=1

LOGGING_LEVEL=1

NOT_RESTARTING_TEMPLATE=

OFFLINE_CHECK_INTERVAL=0

PLACEMENT=balanced

PROFILE_CHANGE_TEMPLATE=

RESTART_ATTEMPTS=5

SCRIPT_TIMEOUT=60

SERVER_POOLS=

START_DEPENDENCIES=hard(ora.gpnpd,ora.cssd) pullup(ora.cssd)
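The profile is a flat list of NAME=VALUE attributes, and the interesting part is the dependency string. As an illustration (both helper names are hypothetical), the attributes can be loaded into a dictionary and the dependency clauses split into their resource lists:

```python
import re


def parse_profile(text):
    """Turn `crsctl stat res ... -f` style NAME=VALUE lines into a dict."""
    attrs = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")  # split on the first '=' only
            attrs[key.strip()] = value.strip()
    return attrs


def parse_dependencies(value):
    """Split 'hard(a,b) pullup(a)' into {'hard': ['a', 'b'], 'pullup': ['a']}."""
    deps = {}
    for kind, members in re.findall(r"(\w+)\s*\(([^)]*)\)", value):
        deps[kind] = [m.strip() for m in members.split(",") if m.strip()]
    return deps


# A few attributes copied from the profile above.
sample = """\
NAME=ora.cluster_interconnect.haip
TYPE=ora.haip.type
STATE=OFFLINE
TARGET=ONLINE
ENABLED=0
START_DEPENDENCIES=hard(ora.gpnpd,ora.cssd) pullup(ora.cssd)
"""

profile = parse_profile(sample)
print(profile["STATE"], profile["TARGET"])
print(parse_dependencies(profile["START_DEPENDENCIES"]))
```

The parsed result shows that HAIP has a hard start dependency on both ora.gpnpd and ora.cssd.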

Temporary solution:

Once the heartbeat network itself has been confirmed to be healthy, disable HAIP and remove it from the ASM dependencies, so that ora.asm no longer waits for ora.cluster_interconnect.haip:

crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=0" -init

crsctl modify res ora.asm -attr "START_DEPENDENCIES='hard(ora.cssd,ora.ctssd) pullup(ora.cssd,ora.ctssd) weak(ora.drivers.acfs)',STOP_DEPENDENCIES='hard(intermediate:ora.cssd)'" -init
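The new ora.asm dependency string is simply the existing one with the HAIP resource taken out of each clause. The sketch below illustrates that transformation; the `before` value is an assumed example of a profile that still references HAIP, not output captured in this article, so the real value should be read with `crsctl stat res ora.asm -init -f` before changing anything.

```python
import re

HAIP = "ora.cluster_interconnect.haip"


def drop_resource(dependencies, resource):
    """Remove one resource from every clause of a CRS dependency string."""
    clauses = []
    for kind, members in re.findall(r"(\w+)\s*\(([^)]*)\)", dependencies):
        kept = [
            m.strip()
            for m in members.split(",")
            # Compare without a modifier prefix such as 'intermediate:'.
            if m.strip().split(":")[-1] != resource
        ]
        if kept:  # skip clauses that would become empty
            clauses.append(f"{kind}({','.join(kept)})")
    return " ".join(clauses)


# Assumed example value; check the actual profile on your own system.
before = (
    "hard(ora.cssd,ora.cluster_interconnect.haip,ora.ctssd)"
    "pullup(ora.cssd,ora.cluster_interconnect.haip,ora.ctssd)"
    "weak(ora.drivers.acfs)"
)

print(drop_resource(before, HAIP))
```

With the assumed input, the result matches the START_DEPENDENCIES value used in the crsctl command above.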

After the modification is complete, check again:

Related articles on MOS:

Known Issues: Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip (Doc ID 1640865.1)

Bugs related to HAIP are also documented on MOS.
