
How to understand and master RAC

2025-04-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article introduces how to understand and master Oracle RAC (Real Application Clusters). It gathers, in a simple and practical form, the points that most often cause confusion in day-to-day work.

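In RAC, each instance writes its own redo thread, but the SCN is shared by all threads, so the SCNs inside a single log sequence of one thread are not contiguous. Once you understand this, you will understand why a RAC-to-RAC recovery or a RAC-to-single-instance recovery is usually a recovery to a certain SCN or to a certain sequence of a given thread.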
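The effect of a shared SCN on per-thread redo can be illustrated with a small sketch. This is a toy model, not Oracle code: the function name `simulate_redo`, the starting SCN of 1001, and the order of changes are all assumptions made for illustration.

```python
from itertools import count


def simulate_redo(changes):
    """Assign one cluster-wide SCN per change and group the SCNs by redo thread.

    `changes` lists the redo thread number of each change, in the order
    the changes happen across the whole cluster.
    """
    scn = count(start=1001)  # a single counter shared by every thread (start value is arbitrary)
    per_thread = {}
    for thread in changes:
        per_thread.setdefault(thread, []).append(next(scn))
    return per_thread


# Two instances (threads 1 and 2) interleave their changes.
for thread, scns in simulate_redo([1, 2, 2, 1, 2, 1]).items():
    print(f"thread {thread}: {scns}")
```

Each thread sees gaps in its own SCN list because the missing values were consumed by the other thread, which is why a recovery target has to be expressed as an SCN or as a sequence of a specific thread.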

Whether a database runs as RAC is determined by the cluster_database parameter.

One thing that distinguishes RAC from a single-instance database is the GRD (Global Resource Directory) memory area, together with several background processes and some additional database files. The GRD records how each data block is distributed across the cluster. It is located in the shared pool of each instance's SGA, but each instance holds only part of the GRD; the GRDs of all instances added together form the complete GRD. This area tracks how the blocks of the same database are distributed over the different nodes: when multiple instances operate on a data block concurrently, the information about that block is recorded in the GRD of the instance responsible for it.

GRD can be thought of as a large partition table, and each instance is a partition in the partition table.

GRD Master: every object that is loaded into memory, including tables, indexes, clusters, and so on, is assigned a master instance.

The GRD master information can itself be viewed as a table (see V$GCSHVMASTER_INFO, V$GCSPFMASTER_INFO, V$HVMASTER_INFO):

Object    master_instance_id
T1        1
T2        2
T3        1
Idx_t1    2
...

Each instance maintains GRD records only for the resources of which it is the master.

For example, the GRD held by instance 1 contains the records for T1, T3, and so on:

Obj#    file#    block#    instance
T1      100      2000      2
T1      100      1456      1
T2      ....

Each instance has an identical copy of the GRD Master table.

This master table records database objects, not some row or block of the database.
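A minimal sketch of the two structures described above, assuming a plain dictionary for the master table and tuples for the GRD rows. The names `MASTER_TABLE`, `ALL_RECORDS`, and `partial_grd` are hypothetical, and the T2 and T3 rows are invented purely to fill out the example.

```python
# Hypothetical in-memory model; these are not Oracle structures.
# Master table: object -> master instance. Every instance holds an identical copy.
MASTER_TABLE = {"T1": 1, "T2": 2, "T3": 1, "IDX_T1": 2}

# GRD rows: (object, file#, block#, instance currently caching the block).
# The T2 and T3 rows are invented for illustration.
ALL_RECORDS = [
    ("T1", 100, 2000, 2),
    ("T1", 100, 1456, 1),
    ("T2", 101, 300, 1),
    ("T3", 102, 77, 2),
]


def partial_grd(instance_id, records):
    """Return the GRD rows kept by one instance: only the objects it masters."""
    return [row for row in records if MASTER_TABLE[row[0]] == instance_id]


grd_1 = partial_grd(1, ALL_RECORDS)
grd_2 = partial_grd(2, ALL_RECORDS)
print("instance 1 GRD:", grd_1)
print("instance 2 GRD:", grd_2)
```

The master table is keyed by object, while the GRD rows are keyed by block, and the partial GRDs of all instances together make up the complete GRD.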

Illustration of RAC instance access, case 1: the master table has no record for table 1.

Instance 1 accesses the block corresponding to a row of table 1 and finds no entry for table 1 in the master table, which means table 1 has never been accessed. The database therefore records instance 1 as the master of table 1 in the master table.

Illustration of RAC instance access, case 2: the master table records instance 2 as the master of table 1.

Instance 1 accesses the block corresponding to a row of table 1, so it contacts instance 2. There are three possible outcomes:

(1) Instance 2 finds that the block is not in the GRD. It tells instance 1 that the block is not in any SGA and lets instance 1 read it from disk.

(2) Instance 2 finds that the block is in the GRD and in its own SGA. Instance 2 sends a copy of the block to instance 1.

(3) Instance 2 finds that the block is in the GRD and cached on instance 3. Instance 2 tells instance 1 that the block is on instance 3 and asks instance 3 to send a copy of the block to instance 1.
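The access cases above can be sketched as a single lookup routine. This is a simplified illustration under stated assumptions: `request_block`, the dictionary layouts, and the returned strings are hypothetical, and the sketch does not model lock modes or what happens to the holder's copy after a transfer.

```python
def request_block(requester, obj, block, master_table, grd):
    """Resolve a block request and describe the path taken.

    master_table: object -> master instance (identical copy on every instance)
    grd:          (object, block) -> instance currently caching the block
    """
    # Case 1: the object has never been accessed, so the requester becomes its master.
    master = master_table.setdefault(obj, requester)
    holder = grd.get((obj, block))
    if holder is None:
        grd[(obj, block)] = requester  # no instance caches it: read from disk
        return "disk read"
    if holder == requester:
        return "local cache hit"
    if holder == master:
        return f"block shipped by master instance {master}"
    return f"master instance {master} asks instance {holder} to ship the block"


master_table = {"T1": 2}
grd = {("T1", 10): 2, ("T1", 11): 3}
for blk in (12, 10, 11):
    print(blk, "->", request_block(1, "T1", blk, master_table, grd))
```

Keeping the master lookup separate from the holder lookup mirrors the text: the master table answers "who do I ask", and the GRD on the master answers "who has the block".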

2-way and 3-way refer to how many nodes take part in the transfer.

A RAC with only two nodes can never show gc current 3-way: with two nodes, a data block is either on this node or on the other one.

In both cases, the requesting node first contacts the resource's master node.

2-way: the resource master and the node caching the block are the same node.

3-way: one more node is involved. The resource master and the node caching the block are not the same node.
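As a rough check of the two-node claim, the sketch below classifies a transfer by the number of distinct nodes involved and enumerates every combination for a cluster of a given size. The function names are hypothetical, and the model only counts nodes; it says nothing about real wait-event accounting.

```python
from itertools import product


def transfer_kind(requester, master, holder):
    """Classify a block transfer by how many distinct nodes take part."""
    if holder == requester:
        return "local"
    return "2-way" if len({requester, master, holder}) == 2 else "3-way"


def possible_kinds(node_count):
    """Enumerate every (requester, master, holder) combination in a cluster."""
    nodes = range(1, node_count + 1)
    return {transfer_kind(r, m, h) for r, m, h in product(nodes, repeat=3)}


print(sorted(possible_kinds(2)))
print(sorted(possible_kinds(3)))
```

With two nodes the requester, master, and holder can never be three different nodes, so only local and 2-way transfers appear.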

Understanding RAC performance improvement: when SQL runs slowly because a single server cannot carry the load, multiple instances can share that load (CPU, memory). When load is not the performance bottleneck, RAC cannot improve the execution efficiency of a specific SQL statement. On the contrary, the more instances there are, the worse the performance of a single SQL statement.

Why more instances mean worse performance: take a cluster of 10 nodes. Instance A needs to access 100 blocks, of which 10 are cached on node 1, 10 on node 2, and so on up to 10 on node 10. Instance A visits the master once; the master then notifies each node that holds the blocks, and those nodes push the blocks to instance A. That makes 1 access from the instance to the master, plus 10 accesses from the master to the holding nodes, plus 10 block pushes from the holding nodes to instance A: a total of 11 visits and 10 GC block transfers.
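The arithmetic of this example can be written out as a short function. It follows the article's simplified counting only (one request to the master, one message per holding node, one transfer per holding node); the name `remote_access_cost` and the returned keys are hypothetical.

```python
def remote_access_cost(holding_nodes):
    """Count messages and block transfers when blocks are spread over `holding_nodes` nodes."""
    to_master = 1                      # instance A -> master
    master_to_holders = holding_nodes  # master -> each node caching some of the blocks
    block_pushes = holding_nodes       # each holding node -> instance A
    return {
        "visits": to_master + master_to_holders,
        "gc_transfers": block_pushes,
    }


for n in (1, 2, 10):
    print(n, "holding node(s):", remote_access_cost(n))
```

In this model the cost grows linearly with the number of nodes that hold part of the data, which is the point of the example: spreading the same blocks across more instances adds coordination work for a single statement.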

In essence, RAC is a single database that runs on multiple computers. It solves the concurrency problem through a Distributed Lock Manager (DLM). Because the resources of RAC are shared, DLM is needed to coordinate competing access to resources between instances and so keep the data consistent. RAC's DLM is called Cache Fusion.

Cache Fusion transfers data blocks between instances over the high-speed private interconnect. It is the core working mechanism of RAC. It virtualizes the SGAs of all instances into one large SGA, so that the SGAs of the individual nodes are transparent to users. Whenever different instances request the same data block, the block is passed between them over the private interconnect, which avoids the inefficient approach of first writing the block to disk and then re-reading it into the cache of the other instance.

When a block is read into the cache of an instance in a RAC environment, the block is assigned a lock resource (not the same thing as a row-level lock) so that other instances know the block is in use. If another instance later requests a copy of the block and the block is already in the cache of the first instance, it is passed directly to the SGA of the other instance over the private interconnect. If the block in memory has been changed but the change has not been committed, a CR (consistent read) copy is passed instead. This means that, whenever possible, blocks move between instance caches without being written back to disk, which avoids the additional I/O cost of synchronizing the caches of multiple instances. The data cached by different instances can of course differ: if an instance has never accessed a particular block, it either obtains the block from another instance through Cache Fusion or reads it from disk.

Cache Fusion as a whole consists of two services: GCS and GES. GCS is responsible for transferring data blocks between instances, and GES is responsible for lock management. The first problem Cache Fusion has to solve is tracking the state and distribution of block copies across the cluster nodes, which is achieved through the GRD.

For Cache Fusion to be worthwhile, one prerequisite must hold: the interconnect must be faster than disk access. Otherwise, there is no point in introducing Cache Fusion.

GCS/GES

Global Cache Service (GCS): best understood together with Cache Fusion. The global cache deals with data blocks. GCS is responsible for maintaining cache coherency in the global cache: it ensures that an instance can obtain a global lock resource whenever it wants to modify a block, so that no other instance can modify the block at the same time. The instance that modified the block holds the current version of the block (containing both committed and uncommitted changes) as well as a past image of the block. If another instance also requests the block, GCS is responsible for tracking which instance holds the block, which version of the block it holds, and which mode the block is in. The corresponding background process is LMSn (it processes global Cache Fusion requests).

Global Enqueue Service (GES): mainly responsible for keeping the dictionary cache and the library cache consistent. The dictionary cache holds data dictionary information in the SGA of an instance for fast access. Because the dictionary information is kept in memory, a change made to the dictionary on one node, such as a DDL statement, must be propagated immediately to the dictionary caches on all nodes. GES handles this and eliminates differences between instances. Similarly, library cache locks are taken on database objects when SQL statements that affect those objects are parsed. These locks must be maintained across instances, and GES must ensure that deadlocks do not occur between multiple instances requesting access to the same object. The corresponding background process is LMON (it issues heartbeats and performs recovery).

Some wait events for RAC

gc buffer busy

That is, global cache buffer busy. The cause is similar to buffer busy waits on a single instance: it is the wait of an instance on node A that is requesting a block from node B at a given moment. It is caused mainly by modifications, not by reads.

From 11g on, gc buffer busy is split into gc buffer busy acquire and gc buffer busy release.

Causes: hot blocks; inefficient SQL (the more blocks a statement requests into the buffer cache, the more likely it is to make other sessions wait); and cross-instance data access (the same data being requested on different instances of the RAC database). For this reason, the recommendation for RAC is to direct different application functions to different database instances.

gc buffer busy acquire: session#1 tries to access a buffer held by a remote instance, but session#2 on the same instance as session#1 requested the same buffer earlier and has not completed, so session#1 waits on gc buffer busy acquire.

gc buffer busy release: session#2 on a remote instance requested the same buffer before session#1 and has not completed, so session#1 waits on gc buffer busy release.
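The distinction between the two events comes down to where the earlier, unfinished request originated. A minimal sketch, assuming instances are identified by integers; the function name is hypothetical and the sketch ignores every other condition that influences these waits.

```python
def gc_buffer_busy_event(waiter_instance, earlier_request_instance):
    """Name the wait for a session queued behind an unfinished request for the same buffer."""
    if earlier_request_instance == waiter_instance:
        # Blocked by a session on the same instance.
        return "gc buffer busy acquire"
    # Blocked by a session on a remote instance.
    return "gc buffer busy release"


print(gc_buffer_busy_event(waiter_instance=1, earlier_request_instance=1))
print(gc_buffer_busy_event(waiter_instance=1, earlier_request_instance=2))
```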

gcs log flush sync

GCS log flush synchronization.

The flush is a mechanism by which Oracle guarantees instance recovery. After the local instance modifies a current block, the redo related to that block must be written to the log file (LGWR must complete the write before returning) before the LMS process can transfer the block to another node.

The main cause of the 'gcs log flush sync' wait event is redo log I/O performance.
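The ordering constraint behind this wait can be sketched with a toy instance that refuses to ship a modified block until its redo is on disk. The class and attribute names are hypothetical, and the "log write" here is just a list copy standing in for LGWR.

```python
class ToyInstance:
    """Toy model of the flush-before-ship rule; not Oracle's implementation."""

    def __init__(self):
        self.redo_buffer = []  # redo generated but not yet written to the log file
        self.redo_log = []     # redo already written (the role LGWR plays)
        self.events = []       # what the instance did, in order

    def modify(self, block, change):
        self.redo_buffer.append((block, change))

    def ship_current_block(self, block):
        """Ship a current block to another node, flushing its redo first if needed."""
        if any(b == block for b, _ in self.redo_buffer):
            self.events.append("gcs log flush sync")  # the sender waits for the log write
            self.redo_log.extend(self.redo_buffer)
            self.redo_buffer.clear()
        self.events.append(f"ship {block}")


inst = ToyInstance()
inst.modify("B1", "x=1")
inst.ship_current_block("B1")
print(inst.events)
```

Because the transfer sits behind the log write, slow redo log I/O shows up directly as time spent in this wait.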

RAC uses the distributed lock management (DLM) mechanism to control concurrency. The following example illustrates the role of DLM.

(1) Take a 2-node RAC.

(2) Node 1 wants to modify data 1.

(3) Node 1 sends a request to DLM. DLM finds that data 1 is not used by any node, grants it to node 1, and registers node 1's use of data 1.

(4) Node 2 also wants to modify data 1.

(5) Node 2 sends a request to DLM. DLM finds that data 1 is in use by node 1 and asks node 1 to hand it over to node 2 first. On receiving the request, node 1 releases its hold on data 1, and node 2 can then operate on data 1.

(6) DLM records this process.

It should be emphasized that DLM is responsible for coordination between nodes, not for coordination within a node. To continue the example above:

(1) Process 1 on node 2 is now modifying data 1.

(2) Process 2 on node 2 also wants to modify data 1.

(3) Node 2 still sends a request to DLM. DLM finds that node 2 already holds the permission, so no new grant is needed.

(4) Process 2's request passes DLM, but whether process 2 can actually modify data 1 requires a further check.

(5) Through the traditional locking mechanism, such as a row-level lock, process 2 finds that data 1 is being modified by process 1, so process 2 can only wait.
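The two walkthroughs above can be combined into one sketch with two layers: a toy DLM that only tracks which node holds a resource, and a per-node lock table standing in for ordinary row-level locks. The class names, method names, and log strings are hypothetical, and the sketch does not clear the previous node's local locks when it releases a resource.

```python
class ToyDLM:
    """Tracks which node currently holds each resource (inter-node coordination only)."""

    def __init__(self):
        self.owner = {}
        self.log = []

    def request(self, node, resource):
        current = self.owner.get(resource)
        if current is None:
            self.log.append(f"grant {resource} to node {node}")
        elif current != node:
            self.log.append(f"node {current} releases {resource} to node {node}")
        else:
            self.log.append(f"node {node} already holds {resource}")
        self.owner[resource] = node


class ToyNode:
    """One node with a local lock table standing in for row-level locks."""

    def __init__(self, node_id, dlm):
        self.node_id = node_id
        self.dlm = dlm
        self.local_locks = {}

    def modify(self, process, resource):
        self.dlm.request(self.node_id, resource)                 # inter-node check via DLM
        holder = self.local_locks.setdefault(resource, process)  # intra-node lock check
        return "ok" if holder == process else f"wait for process {holder}"


dlm = ToyDLM()
node1, node2 = ToyNode(1, dlm), ToyNode(2, dlm)
print(node1.modify("p1", "data1"))
print(node2.modify("p1", "data1"))
print(node2.modify("p2", "data1"))
print(dlm.log)
```

Splitting the check into two steps reflects the point of the example: passing DLM only establishes that the node may work on the data, while conflicts between processes on the same node are still resolved by the local lock.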

So learning RAC means learning DLM, that is, Cache Fusion.

(Figure not available: the process by which a RAC cluster implements its concurrency mechanism.)

