What is the architecture of Oracle 11g RAC

2026-02-14 Update. From: SLTechnology News&Howtos. Shulou (Shulou.com), 05/31 report.

This article introduces the architecture of Oracle 11g RAC. The situations it describes come up often in real deployments, so it walks through the main components of the cluster stack and how they work together.

1. Major components of Oracle RAC

1. OHAS

OHAS (Oracle High Availability Services) is an important component of Oracle 11gR2 and the single starting point for cluster startup. The other daemons, and everything else the cluster manages, are defined as resources, and the cluster management software uses agent processes to manage all of these resources.

1. OCR and OLR

OCR is the registry that holds all the resources managed by CRSD. However, many initialization resources have to be started before crsd itself is running, so the OLR was introduced in 11g.

OLR is the local cluster registry. It provides ohasd with the cluster configuration and with the information needed to initialize resources. When OHAS starts, it reads the location of the OLR file from /etc/oracle/olr.loc.

The OLR can be backed up with ./ocrconfig -local -manualbackup

restored with ./ocrconfig -local -restore

and checked for consistency with ./ocrcheck -local
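As a rough illustration of the lookup OHAS performs at startup, the sketch below parses the text of an olr.loc file to find where the OLR lives. It assumes the file consists of key=value lines with keys named olrconfig_loc and crs_home; the sample paths and the function name are made up for illustration and depend on the installation.

```python
def parse_olr_loc(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical file content; real paths depend on the installation.
SAMPLE = """\
olrconfig_loc=/u01/app/11.2.0/grid/cdata/node1.olr
crs_home=/u01/app/11.2.0/grid
"""

config = parse_olr_loc(SAMPLE)
print(config["olrconfig_loc"])
```

In practice the text would come from reading /etc/oracle/olr.loc on the node rather than from an inline string.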

2. Agent processes started by ohasd

1. oraagent: this process is started as the oracle or grid user and manages the resources owned by that user.

ohasd itself starts only oraagent_grid, which manages the resources ora.gipcd, ora.gpnpd, ora.mdnsd, ora.evmd, and ora.asm.

mdnsd: similar to DNS, it maps host names to IP addresses. Its basic characteristics: it provides name resolution inside a small private network, sends its messages by multicast over UDP, and uses host names ending in .local. In RAC it mainly provides resource discovery for gpnpd and ohasd.

gpnpd: Grid Plug and Play. It stores the basic information of the cluster locally and identifies the other nodes in the cluster by communicating with mdnsd. GPnP consists of the gpnp wallet (holds the client signatures used to access the profile), the gpnp profile (holds the information needed to start a cluster node), and the gpnp daemon (main thread, push thread, dispatch thread).

When important cluster configuration changes: the local dispatch thread notifies the dispatch thread on the remote node --> the local push thread pushes the gpnp profile to the remote node --> the remote dispatch thread receives the new version of the gpnp profile.

When the cluster starts: the gpnp main thread opens the gpnp profile and loads it into its cache; if the profile is lost, it is recovered from the OLR --> the dispatch thread sends a message to all remote nodes to find out where the latest version of the gpnp profile is --> the node with the latest version sends its gpnp profile to the local node --> the local node begins to provide service once it has received the gpnp profile.
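The startup exchange amounts to finding whichever node holds the newest profile and copying from it. Here is a minimal sketch of that selection, assuming each profile carries an integer sequence number that grows with every change. The `sequences` mapping, the node names, and the tie-breaking rule are illustrative assumptions, not taken from the GPnP implementation.

```python
from typing import Dict, Tuple


def pick_profile_source(local_node: str,
                        sequences: Dict[str, int]) -> Tuple[str, bool]:
    """Return (node holding the newest profile, whether local must fetch it).

    `sequences` maps node name -> profile sequence number (an assumed field).
    If the local node already has the newest profile, no copy is needed.
    """
    newest = max(sequences.values())
    if sequences.get(local_node) == newest:
        return local_node, False
    # Any node with the newest sequence will do; pick the first by name.
    source = min(n for n, s in sequences.items() if s == newest)
    return source, True


print(pick_profile_source("node1", {"node1": 4, "node2": 6, "node3": 6}))
```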

gipcd: keeps the cluster's view of the private network interfaces consistent. It does not carry the traffic itself; HAIP is responsible for that. Its tasks are: 1. discover the private network interfaces when the cluster starts; 2. discover the other nodes over the cluster private network and establish contact with them; 3. when there are several interfaces and one of them has a problem, take it offline and notify the other nodes, and do the reverse when it recovers.

Startup sequence: the gipcd daemon starts --> it tries to contact gpnpd to obtain remote node information --> it discovers the local private network information --> it discovers the remote nodes --> connections are established.

2. orarootagent: started as the root user; it manages the resources owned by root.

The resources ohasd starts through it are ora.diskmon, ora.ctssd, ora.crsd, ora.drivers.acfs, ora.cluster_interconnect.haip, and ora.crf.

ctssd: Cluster Time Synchronization Service, which synchronizes time between nodes (older versions were prone to problems with wts). How it works: one node is chosen as the reference, and the other nodes follow the time of that node. If another time synchronization tool such as NTP is present (note that the mere existence of its configuration file counts as the tool being present), ctssd runs in observer mode and does not modify the system time. If no other time synchronization tool is present, ctssd runs in active mode and gradually and automatically synchronizes the time between nodes.
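The mode choice can be sketched as a small function. This is only an illustration of the rule described here, with the passive mode called "observer"; the function and parameter names are hypothetical, and the real detection logic inside ctssd is not shown in this article.

```python
def ctss_mode(ntp_config_present: bool, ntp_daemon_running: bool) -> str:
    """Pick the CTSS working mode from what is found on the node.

    A leftover NTP configuration file is enough to count as "another tool".
    """
    if ntp_config_present or ntp_daemon_running:
        return "observer"   # system time is left untouched
    return "active"         # time is gradually synchronized across nodes


print(ctss_mode(ntp_config_present=True, ntp_daemon_running=False))
```

This is why removing or renaming a stale NTP configuration file matters when the intent is for ctssd to do the synchronizing.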

3. cssdagent: responsible for starting the ocssd.bin process and then monitoring the ocssd.bin daemon.

The ocssd daemon reports its status to cssdagent and cssdmonitor every second so that any abnormal condition can be handled.

4. cssdmonitor: responsible only for monitoring the ocssd.bin daemon.

3. HAIP and CHM

1. HAIP

For an Oracle database cluster, communication over the private network is very important. Private network traffic falls into two kinds: communication between the clusterware on each node and communication between database instances. Clusterware communication is light and uses plain TCP/IP. Communication between database instances is heavy and has strict real-time requirements that plain TCP/IP cannot meet, so UDP or RDS is used. High availability and load balancing also need to be configured for the private network.

HAIP appeared because, previously, high availability and load balancing for the private network mostly depended on configuration at the operating system level, such as Linux bonding. To remove that dependency, Oracle introduced HAIP.

What is HAIP? Oracle automatically binds an IP address in the 169.254.*.* range to each private network interface. This address is called the HAIP, and communication between Oracle database instances goes through it. When a private network interface has a problem, its HAIP address automatically moves to a healthy private network interface, which gives the private network high availability.
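The address drift can be pictured with a small sketch. It assumes a mapping from interface name to the HAIP addresses bound to it and spreads the addresses of a failed interface round-robin over the survivors. The function name, the interface names, and the round-robin policy are illustrative assumptions; the article does not describe how Oracle picks the target interface.

```python
from typing import Dict, List


def relocate_haip(assignments: Dict[str, List[str]],
                  failed_nic: str) -> Dict[str, List[str]]:
    """Move the HAIP addresses of a failed NIC onto the surviving NICs.

    `assignments` maps NIC name -> HAIP addresses currently bound to it.
    The input mapping is not modified; a new mapping is returned.
    """
    healthy = sorted(nic for nic in assignments if nic != failed_nic)
    if not healthy:
        raise RuntimeError("no healthy private interface left")
    result = {nic: list(assignments[nic]) for nic in healthy}
    for i, addr in enumerate(assignments.get(failed_nic, [])):
        result[healthy[i % len(healthy)]].append(addr)
    return result


before = {"eth1": ["169.254.10.1"], "eth2": ["169.254.20.1"]}
print(relocate_haip(before, "eth1"))
```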

2. CHM

CHM (Cluster Health Monitor) is a tool provided by Oracle that collects operating system resource statistics (CPU, memory, swap, processes, network). It runs on every node as the ora.crf resource. CHM is mainly used to gather the data needed to diagnose and prevent RAC node problems caused by system anomalies.

Compared with OSWatcher, CHM is closer to real time, but it retains data for a shorter period and offers fewer features.

CHM components

1. CHM repository: a Berkeley DB database that stores the operating system statistics collected from each node. It is stored under /gi_home/crf/db/ in a directory named after the node. The default size is 1 GB, and the maximum retention time is 3 days.

2. System monitor service: runs on every node as osysmond.bin. It collects the statistics of its own node and sends them to the primary node.

3. Cluster logger service: runs as the ologgerd daemon on the primary node and on the secondary node. The primary ologgerd receives the data from all nodes and records it in the CHM repository on the primary node; the secondary ologgerd receives the data from the primary node and records it in the CHM repository on the secondary node.

2. CSS: responsible for building the cluster and maintaining cluster consistency

CSS startup process: building the cluster

1. The ohasd daemon starts and launches the corresponding agent processes (including oracssdagent_root for CSS) --> 2. oracssdagent_root starts the ocssd.bin process --> 3. ocssd.bin contacts gpnpd.bin to obtain the basic information for building the cluster and contacts gipcd.bin to obtain information about the remote nodes --> 4. ocssd.bin communicates with the remote nodes, obtains the local node number by accessing the VF and its lease block, and joins the cluster.

Cluster heartbeat mechanism: maintaining cluster consistency

1. How cluster consistency is maintained --> 1. confirm connectivity between the nodes; 2. save node connectivity information in a shared location; 3. have each node monitor itself.

2. The three heartbeats that maintain the cluster

3. Common terms

VD/VF: voting disk / voting file. It stores the disk heartbeat information of each node and the list of nodes that each node can see. During a split brain it is used to determine the state of each node and to decide whether a node should leave the cluster or stay alive. The VF also holds 1. the lease block and 2. the kill block.

OCR: Oracle Cluster Registry; records information related to crsd.

Misscount: network heartbeat timeout. The default is 30s; it is also the local heartbeat timeout.

LIOT: long I/O timeout, the disk heartbeat timeout. The default is 200s.

SIOT: short I/O timeout, the timeout for a node's access to the VF during reconfiguration. The default is misscount - reboot time = 30 - 3 = 27s.

Reconfiguration master node: when the number of nodes in the cluster changes, one node carries out the reconfiguration. Usually the node with the lowest node number is the master.

Reboot time: the time the cluster allows for the OS to complete a restart. The default is 3s.

Diagwait: specifies the margin time of the oprocd process.

Incarnation: the latest state of the cluster. Each time the cluster is reconfigured, this value increases by 1.
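The relationship between these timeouts is simple arithmetic, sketched below with the default values quoted in the term list. The constant and function names are illustrative, and the rule that the short timeout applies only during reconfiguration is taken from the SIOT definition.

```python
MISSCOUNT_S = 30      # network heartbeat timeout (default)
REBOOT_TIME_S = 3     # time allowed for the OS to restart (default)
LIOT_S = 200          # long I/O timeout (default)


def short_io_timeout(misscount: int = MISSCOUNT_S,
                     reboot_time: int = REBOOT_TIME_S) -> int:
    """SIOT = misscount - reboot time."""
    return misscount - reboot_time


def disk_timeout(reconfiguring: bool) -> int:
    """Disk heartbeat timeout in effect: SIOT during reconfiguration, else LIOT."""
    return short_io_timeout() if reconfiguring else LIOT_S


print(short_io_timeout(), disk_timeout(reconfiguring=False))
```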

1. Network heartbeat (misscount) --> threads involved: the sending thread; the analysis thread (acts on the information received, for example by triggering cluster reconfiguration); the cluster reconfiguration thread; and the dispatch thread (receives messages from remote nodes and forwards them to the other threads according to their content).

The network heartbeat is used to confirm connectivity between the cluster nodes: the ocssd process sends a network heartbeat to the other nodes in the cluster every second over the private network.

Reconfiguration after loss of the network heartbeat: 1. When a node's network heartbeat has been missing for longer than the cluster's misscount value, the analysis thread initiates cluster reconfiguration. 2. The reconfiguration thread on the reconfiguration master node sends a reconfiguration message to all nodes in the cluster, and every node that receives this message replies with its own state.

3. The reconfiguration thread on the master node uses these states to decide whether a split brain has occurred. If the network heartbeat is abnormal but the disk heartbeat is normal, it concludes that a split brain has occurred and writes kill block information to the VD. The node with the abnormal network heartbeat restarts when it reads the kill block.
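The split-brain rule in step 3 can be written as a short check. This sketch encodes only that one rule (network heartbeat lost, disk heartbeat still arriving); it does not model how the surviving sub-cluster is chosen, and the `NodeState` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    name: str
    network_heartbeat_ok: bool
    disk_heartbeat_ok: bool


def nodes_to_evict(states: List[NodeState]) -> List[str]:
    """Nodes that get a kill block written for them in this sketch.

    Split brain here means: the network heartbeat is lost while the disk
    heartbeat is still arriving, so the node is alive but unreachable.
    """
    return [s.name for s in states
            if not s.network_heartbeat_ok and s.disk_heartbeat_ok]


states = [
    NodeState("node1", True, True),
    NodeState("node2", False, True),   # alive on disk, silent on the network
    NodeState("node3", False, False),  # not heartbeating at all
]
print(nodes_to_evict(states))
```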

2. Disk heartbeat (I/O timeout) --> threads involved: the disk heartbeat thread (sends the disk heartbeat and reads the kill block); the disk heartbeat monitoring thread (checks the disk heartbeat); and the kill block thread (monitors the kill block).

The disk heartbeat is used to decide how to resolve a split brain when the cluster splits. Oracle recommends configuring an odd number of VF disks. When a node cannot access one of the VDs, it is still fine as long as the number of disks it can access is at least n/2 + 1, where n is the total number of VDs and the division is rounded down.
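The majority rule is easy to check by hand, and the sketch below does the same arithmetic. The function names are illustrative, and the division is taken as integer division.

```python
def required_voting_disks(total: int) -> int:
    """Smallest number of voting disks a node must reach: n // 2 + 1."""
    if total < 1:
        raise ValueError("at least one voting disk is required")
    return total // 2 + 1


def node_can_stay(total: int, accessible: int) -> bool:
    """True if the node still sees a majority of the voting disks."""
    return accessible >= required_voting_disks(total)


for n in (3, 4, 5):
    print(n, required_voting_disks(n))
```

With 3 disks a node needs 2, and with 4 disks it needs 3, so both layouts tolerate the loss of only one disk. That is the practical reason an odd number is recommended.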

3. Local heartbeat --> monitors the state of the ocssd.bin process and of the local node

When ocssd sends a network heartbeat, it also sends a local heartbeat to cssdagent and cssdmonitor. If the local heartbeat is fine, ocssd.bin is considered healthy. Hang information is recorded in oprocd.

Group management of the cluster

NM and GM stand for node management and group management. Everything described so far belongs to NM; what follows covers the main points of GM.

GM concepts --> 1. Group: a set of members and their resources treated as a whole. 2. Member: an entity that can run independently. There are primary members and shared members: the primary member monitors the cluster and reacts when a change occurs, and a shared member is an extension of it. 3. GM master: when group membership changes, the master completes the reconfiguration.

1. Sharing: each group has several members, and each group or group member also needs to share some information with the outside. (1) A database in the cluster is registered with CSS as a group whose primary member is the LMON process. When LMON starts, it has to register with CSS before CSS knows the details of the database. This is sharing within the group. (2) The ASM processes also register with CSS, and their information is shared so that database instances can find it. This is called sharing between groups.

2. Isolation: when a member of a group leaves, GM must make sure that the member is really gone at the OS level, that is, all of its processes and I/O have been cleaned up.

New CSS features in 11g

1. Member kill escalation

When a member needs to be terminated in RAC, for example the LMON process on node 2 (that is, the database instance on node 2), the steps are as follows.

RAC first tries to terminate LMON through GM. If the termination succeeds, the result is returned. If it fails, the kill is escalated to the NM level, and NM restarts the node to preserve the consistency of the cluster.
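The escalation path reads naturally as a two-step fallback. In the sketch below both callables are hypothetical stand-ins for the GM-level kill and the NM-level node restart; nothing here reflects actual clusterware interfaces.

```python
from typing import Callable


def kill_member(gm_kill: Callable[[], bool],
                nm_reboot_node: Callable[[], None]) -> str:
    """Try the member-level kill first; escalate to a node-level kill on failure.

    `gm_kill` returns True when the member was terminated through GM.
    `nm_reboot_node` is only called when the GM-level kill fails.
    """
    if gm_kill():
        return "member killed"
    nm_reboot_node()
    return "escalated to node kill"


actions = []
print(kill_member(lambda: False, lambda: actions.append("reboot node")))
print(actions)
```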

2. Rebootless restart

In the following situations the cluster does not reboot the node directly but restarts the cluster management software instead.

1. The node has lost the network heartbeat for longer than misscount. 2. The node cannot access the majority of the VDs. 3. A member kill has been escalated to a node kill, that is, the process could not be terminated through GM as described above.
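The three triggers can be combined into a single predicate. This is a sketch under the assumption that any one of the conditions is sufficient; the function and parameter names are hypothetical.

```python
def needs_stack_restart(missed_heartbeat_s: int,
                        misscount_s: int,
                        accessible_vds: int,
                        total_vds: int,
                        member_kill_escalated: bool) -> bool:
    """True if any of the three rebootless-restart triggers applies."""
    lost_network = missed_heartbeat_s > misscount_s
    lost_vd_majority = accessible_vds < total_vds // 2 + 1
    return lost_network or lost_vd_majority or member_kill_escalated


print(needs_stack_restart(31, 30, 3, 3, False))
```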

That is all for this update. The CRS section, the ASM section, and the remaining parts will be added later.
