
How to optimize Oracle Database with large memory pages in Linux


This article describes how to use large memory pages (HugePages) to optimize an Oracle database running on Linux. It walks through a real troubleshooting case in detail and should be a useful reference.

I. Case introduction

One of a customer's systems had serious performance problems. When the problem occurred, the system was essentially unusable and all business operations on the application were completely unresponsive. The database was Oracle 10.2.0.4 running on RHEL 5.2 (Red Hat Enterprise Linux Server release 5 (Tikanga)). The server had four 4-core Intel(R) Xeon(R) E7430 @ 2.13GHz processors, i.e. 16 logical CPUs, and 32GB of memory. During the failure, CPU utilization on the database server stayed at 100% for a long time. Even after all of the application's WebLogic Servers were shut down, CPU utilization remained at 100% for several minutes and then decreased gradually, taking about 20 minutes to fall back to a normal idle level; since all applications had been shut down by then, only a very low CPU utilization would have been normal. According to the database maintenance staff, this situation had occurred many times; even after restarting the database, the same failure would recur within a day or two. At the same time, no significant changes had been made to the system recently.

After receiving the fault report, the author found that connecting to the database server over SSH was very slow, taking about a minute to establish a connection. A quick look at the server showed that I/O was very low, there was still a fair amount of free memory (at least more than 1GB), and there was no page in / page out activity. The most obvious symptom was that CPU utilization was extremely high, staying at 100%, with the SYS portion accounting for more than 95% of it. The operating system run queue also stayed above 200. The server's memory usage was as follows:

cat /proc/meminfo
MemTotal:     32999792 kB
MemFree:       1438672 kB
Buffers:        112304 kB
Cached:       23471680 kB
SwapCached:       1296 kB
Active:       19571024 kB
Inactive:      6085396 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     32999792 kB
LowFree:       1438672 kB
SwapTotal:    38371320 kB
SwapFree:     38260796 kB
Dirty:             280 kB
Writeback:           0 kB
AnonPages:     2071192 kB
Mapped:       12455324 kB
Slab:           340140 kB
PageTables:    4749076 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  54871216 kB
Committed_AS: 17226744 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     22016 kB
VmallocChunk: 34359716303 kB

Judging from these symptoms, the high SYS CPU is an important clue for analyzing the problem.

After taking a quick look at performance at the operating system level, the next step was to connect to the database through sqlplus and look at the performance information inside the database:

(Note: SQL text, server names, database names and other identifying information in the data below have been masked.)

SQL> select sid, serial#, program, machine, sql_id, event
     from v$session where type='USER' and status='ACTIVE';

SID  SERIAL#  PROGRAM                       MACHINE    SQL_ID         EVENT
---- -------- ----------------------------- ---------- -------------- -----------------------------
     4304     xxx_app1                                 0gc4uvt2pqvpu  latch: cache buffers chains
459  12806    xxx_app1                                 0gc4uvt2pqvpu  latch: cache buffers chains
454  5518     xxx_app1                                 15hq76k17h5ta  latch: cache buffers chains
529  7708     xxx_app1                                 0gc4uvt2pqvpu  latch: cache buffers chains
420  40948    xxx_app1                                 0gc4uvt2pqvpu  latch: cache buffers chains
353  56222    xxx_app1                                 f7fxxczffp5rx  latch: cache buffers chains
243  42611    xxx_app1                                 2zqg4sbrq7zay  latch: cache buffers chains
458  63221    xxxTimer.exe                  APPSERVER  9t1ujakwt6fnf  local write wait
... (some rows omitted to save space)
409  4951     xxx_app1                                 7d4c6m3ytcx87  read by other session
239  51959    xxx_app1                                 7d4c6m3ytcx87  read by other session
525  3815     xxxTimer.exe                  APPSERVER  0ftnnr7pfw7r6  enq: RO - fast object reuse
518  7845     xxx_app1                                                log file sync
473  1972     xxxTimer.exe                  APPSERVER  5017jsr7kdk3b  log file sync
197  37462    xxx_app1                                 cbvbzbfdxn2w7  db file sequential read
319  4939     xxxTimer.exe                  APPSERVER  6vmk5uzu1p45m  db file sequential read
434  2939     xxx_app1                                 gw921z764rmkc  latch: shared pool
220  50017    xxx_app1                                 2zqg4sbrq7zay  latch: library cache
301  36418    xxx_app1                                 02dw161xqmrgf  latch: library cache
193  25003    oracle@xxx_db1 (J001)         xxx_db1                   jobq slave wait
368  64846    oracle@xxx_db1 (J000)         xxx_db1                   jobq slave wait
218  13307    sqlplus@xxx_db1 (TNS V1-V3)   xxx_db1    5rby2rfcgs6b7  SQL*Net message to client
435  1883     xxx_app1                                 fd7369jwkuvty  SQL*Net message from client
448  3001     xxxTimer.exe                  APPSERVER  bsk0kpawwztnw  SQL*Net message from dblink

SQL> @waitevent

SID  EVENT                                                      SECONDS_IN_WAIT  STATE
---- ---------------------------------------------------------- ---------------  -----------------
556  latch: cache buffers chains                                 35               WAITED KNOWN TIME
464  latch: cache buffers chains                                 2                WAITING
427  latch: cache buffers chains                                 34               WAITED SHORT TIME
458  local write wait                                            63               WAITING
403  write complete waits                                        40               WAITING
502  write complete waits                                        41               WAITING
525  enq: RO - fast object reuse                                 40               WAITING
368  enq: RO - fast object reuse                                 23               WAITING
282  db file sequential read                                     0                WAITING
501  db file sequential read                                     2                WAITED SHORT TIME
478  db file sequential read                                     0                WAITING
281  db file sequential read                                     6                WAITED KNOWN TIME
195  db file sequential read                                     4                WAITED KNOWN TIME
450  db file sequential read                                     2                WAITED KNOWN TIME
529  db file sequential read                                     1                WAITING
310  db file sequential read                                     0                WAITED KNOWN TIME
316  db file sequential read                                     89               WAITED SHORT TIME
370  db file sequential read                                     1                WAITING
380  db file sequential read                                     1                WAITED SHORT TIME
326  jobq slave wait                                             122              WAITING
378  jobq slave wait                                             2                WAITING
425  jobq slave wait                                             108              WAITING
208  SQL*Net more data from dblink                               11               WAITED SHORT TIME
537  Streams AQ: waiting for time management or cleanup tasks    7042             WAITING
549  Streams AQ: qmn coordinator idle wait                       1585854          WAITING
507  Streams AQ: qmn slave idle wait                             1585854          WAITING
430  latch free                                                  2                WAITED KNOWN TIME
565  latch: cache buffers lru chain                              136              WAITED SHORT TIME

Judging from the activity and wait events inside the database, there is nothing particularly abnormal. It is worth noting that when the database server's CPU utilization stays at 100% for a long time, or when physical memory is exhausted and accompanied by heavy swapping, the performance symptoms inside the database need to be diagnosed carefully: is a particular wait event merely the result of the CPU or memory shortage, or is something inside the database the root cause of the excessive CPU or memory consumption?

Judging from the data above, there are not very many active sessions, fewer than 50; even counting background processes, this is far from the run queue of more than 200 seen in the operating system. There are mainly three types of non-idle wait events in the database: I/O-related waits such as db file sequential read, database-link-related waits such as SQL*Net more data from dblink, and latch-related waits. Of these three categories, usually only latch-type waits drive CPU utilization up.

Comparing AWR reports, there is no particularly significant difference in database activity between the failure period and a normal period, but there is a big difference in the operating system statistics:

Statistic Name            1st Value    2nd Value
------------------------  -----------  -----------
BUSY_TIME                 3475776      1611753
IDLE_TIME                 2266224      4065506
IOWAIT_TIME               520453       886345
LOAD                      67           3
NICE_TIME                 0            0
NUM_CPU_SOCKETS           0            0
PHYSICAL_MEMORY_BYTES     0            0
RSRC_MGR_CPU_WAIT_TIME    0            0
SYS_TIME                  1802025      205644
USER_TIME                 1645837      1381719

The data above compares a one-hour AWR report from the failure period (1st) with one from a normal period (2nd). For fault analysis, especially when the fault duration is short, a one-hour AWR report will not reflect the performance during the fault precisely, but the first task in troubleshooting is to determine a direction from all the available data. As mentioned earlier, the high SYS portion of CPU utilization is an important clue; since the other performance data inside the database do not differ much, we can start from the CPU.
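As a rough cross-check of that clue, the share of busy CPU time spent in kernel mode can be computed directly from the statistics in the table above; a minimal shell sketch (the figures are taken from the table, with 1st treated as the failure period and 2nd as the normal period):

# Share of busy CPU time spent in kernel (SYS) mode, failure vs. normal period
awk 'BEGIN {
    printf "failure: SYS/BUSY = %.0f%%\n", 100 * 1802025 / 3475776   # about 52%
    printf "normal : SYS/BUSY = %.0f%%\n", 100 * 205644  / 1611753   # about 13%
}'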

II. Analysis of CPU usage in the operating system

So what do the SYS and USER portions of CPU utilization represent in the operating system, and what is the difference between them?

To put it simply, the SYS part of CPU utilization is the CPU used by the operating system kernel, that is, the CPU consumed by code running in kernel mode, most commonly by system calls. The USER part is the CPU used by the application's own code, that is, the CPU consumed by code running in user mode. For example, when Oracle executes SQL and reads data from disk into the db buffer cache, it issues a read call; this read call is mostly executed by operating system kernel code, including the device driver, so the CPU it consumes is counted in the SYS part. When Oracle then parses the data read from disk, only Oracle's own code is running, so that CPU is counted in the USER part.
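On Linux this split can be observed directly with standard tools; for illustration (generic commands, not output captured from the failed system):

# vmstat: "us" = USER CPU, "sy" = SYS CPU, "r" = run-queue length; sample 3 times every 5 seconds
vmstat 5 3
# top in batch mode also shows the %us / %sy split in its summary lines
top -b -n 1 | head -5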

So which operations or system calls mainly generate SYS CPU?

1. I/O operations, such as reading and writing files, accessing peripherals, and transferring data over the network. These operations generally do not consume very much CPU, because most of the time is spent on the device being operated on. For example, when reading a file from disk, most of the time is spent inside the disk itself, and the CPU time consumed is only a small part of the I/O response time. Only when I/O concurrency is very high is it likely to noticeably increase SYS CPU.

2. Memory management, such as a process requesting memory from the operating system, the operating system maintaining available memory, page swapping to the swap space, and so on. In fact, much as with Oracle, the larger the memory and the more frequent the memory management operations, the more CPU is consumed.

3. Process scheduling. The CPU used here depends on the length of the run queue in the operating system: the longer the run queue, the more processes need to be scheduled and the heavier the burden on the kernel.

4. Others include inter-process communication, semaphore processing, some activities within the device driver, and so on.

From the performance data during the failure, memory management and process scheduling look like possible causes of the high SYS CPU. However, the long run queue is more likely a consequence of the high CPU utilization than its cause, and from the database side the number of active sessions is not very high. So the next thing to check is whether the high CPU utilization is caused by problems in the system's memory management.

Looking back at the /proc/meminfo data collected at the beginning of this article, one important piece of data stands out:

PageTables: 4749076 kB

As the data shows, the memory used by PageTables reaches 4637MB. PageTables literally means "page tables"; to put it simply, they are the structures the operating system kernel uses to maintain the correspondence between a process's virtual (linear) addresses and actual physical memory addresses.

Modern computers usually manage and allocate physical memory in units of pages (page frames). On the x86 architecture the page size is 4KB. For a process running on the operating system, the address space it can access is called its virtual address space, and its size is determined by the word size of the processor: for a 32-bit x86 processor, a process's accessible address space is 4GB. Each process running in the operating system has its own independent virtual (linear) address space, and this address space is also managed in pages, usually 4KB in Linux. When a process accesses memory, the operating system and hardware cooperate to translate the process's virtual address into a physical address. Two different processes may have the same virtual linear address pointing to the same physical memory (for example, shared memory) or to different physical memory (for example, each process's private memory).

The following is a schematic diagram of the correspondence between virtual addresses and physical memory:

Suppose there are two processes A and B, each with a memory pointer pointing to address 0x12345 (0x indicates a hexadecimal number), for example because one process was forked or cloned from the other; then both processes have pointers to the same virtual memory address. When each process accesses the memory at address 0x12345, the operating system translates this address into a physical address, for example 0x23456 for process A and 0x34567 for process B, without either process affecting the other. So when is this physical address obtained? For process-private memory, which is the most common case, it is obtained when the process requests a memory allocation from the operating system. When a process asks the operating system for memory, the operating system allocates free physical memory to the process in units of pages, generates a virtual linear address for the process, establishes the mapping between this virtual address and the physical memory address, and returns the virtual address to the process as the result.

The Page Table is the data structure used by the operating system to maintain the correspondence between a process's virtual addresses and physical memory. The following figure is a schematic diagram of a Page Table in a relatively simple case:

The following briefly describes how the operating system translates a process's virtual address into an actual physical address on a 32-bit system with 4KB pages.

1. The page directory is a data structure used to index the page tables. Each directory entry occupies 32 bits, i.e. 4 bytes, and stores the location of one page table. The page directory occupies exactly one page of memory, i.e. 4KB, and can hold 1024 directory entries, i.e. the locations of 1024 page tables.

2. A page table entry (Page Table Entry) is 4 bytes in size and stores the start address of one physical memory page. Each page table also occupies 4KB of memory and can store the start addresses of 1024 physical memory pages. Because physical memory pages are aligned on 4KB boundaries, only 20 of the 32 bits are needed to represent the address; the other 12 bits are used for other purposes, such as indicating whether the page is read-only or writable, and so on.

3. There are at most 1024 page tables, and each page table holds 1024 physical page start addresses, for a total of 1M addresses. Each address points to a 4KB physical memory page, for a total of 4GB.

4. When the operating system and hardware translate a virtual address into a physical address, bits 31-22 of the virtual address (10 bits) index into the page directory to locate one of the 1024 page tables; bits 21-12 (10 bits) index into that page table to locate one of its 1024 page table entries. The physical page start address is read from that entry, and bits 11-0 (12 bits) are used as the offset within the 4KB page. The physical page start address plus the offset is the physical memory address the process needs to access. A small worked example of this bit arithmetic follows.
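To make the translation concrete, here is a minimal shell sketch that splits an arbitrary 32-bit virtual address (0x12345678 is just an example value) into its directory index, page table index and page offset:

# Decompose a 32-bit virtual address into its paging components
vaddr=$((0x12345678))
dir_index=$(( (vaddr >> 22) & 0x3FF ))   # bits 31-22: index into the page directory
pt_index=$((  (vaddr >> 12) & 0x3FF ))   # bits 21-12: index into the page table
offset=$((    vaddr & 0xFFF ))           # bits 11-0 : offset within the 4KB page
printf "dir=%d table=%d offset=%d\n" "$dir_index" "$pt_index" "$offset"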

Now consider how much space these two data structures, the page directory and the page tables, occupy. The page directory is fixed at 4KB. What about the page tables? Since there are at most 1024 page tables and each occupies 4KB, the page tables consume at most 4MB of memory.

In fact, a process in 32-bit Linux usually does not have such large page tables. A process is unlikely to use up the entire 4GB address space, and 1GB of the virtual address space is reserved for the kernel. In addition, Linux does not build the full page tables for a process all at once; the operating system establishes the corresponding address mappings only when the process allocates and accesses memory.

Only the simplest paging scheme is described here. In practice the page directory and page tables can have up to four levels, and when PAE is enabled on a 32-bit system, or on a 64-bit system, the page table structure is more complex than in the diagram above. But in any case, the structure of the last level, the page table itself, is the same.

On a 64-bit system, a page table entry grows from 32 bits to 64 bits compared with a 32-bit system. How much impact does this have? If a process accesses 1GB of physical memory, i.e. 262144 memory pages, then on a 32-bit system its page tables need 262144*4/1024/1024 = 1MB, while on a 64-bit system the space occupied by the page tables doubles, to 2MB.

Now consider an Oracle database running on Linux. In this case the SGA of the database was 12GB; if an Oracle process accesses all of the SGA, its page tables alone will be 24MB, an astonishing number. PGA is ignored here because the average PGA per process was no more than 2MB, which is negligible compared with the SGA. The AWR report showed about 300 sessions, so the page tables of these 300 connections could reach 7200MB, although of course not every process accesses all of the SGA. From meminfo we saw that PageTables actually reached 4637MB; such a large page table space is exactly the result of 300 sessions combined with a 12GB SGA.
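The back-of-the-envelope arithmetic behind these numbers can be written out as a quick sketch (it assumes 4KB pages and 8-byte page table entries, as discussed above; the 300 sessions figure comes from the AWR report):

awk 'BEGIN {
    sga_gb = 12; sessions = 300
    pages  = sga_gb * 1024 * 1024 / 4            # number of 4KB pages in the SGA
    pt_mb  = pages * 8 / 1024 / 1024             # page table size per process, in MB
    printf "page table per process : %d MB\n", pt_mb                       # 24 MB
    printf "worst case, %d sessions: %d MB\n", sessions, pt_mb * sessions  # 7200 MB
}'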

Of course, the Page Table is not the only data structure in the system used to manage memory; there are others as well. Such oversized memory management structures undoubtedly add greatly to the burden on the operating system kernel and the consumption of CPU. Large swings in memory demand caused by load changes or other reasons, for example many processes requesting large amounts of memory at the same time, can then drive CPU usage to a peak in a short time and trigger the problem.

III. Using large memory pages to solve the problem

There is no hard evidence, and there was not enough time to gather it, since proving that the oversized Page Tables caused the problem would have required reproducing a failure with more than half an hour of system unavailability. For now, though, this is the biggest suspect. It was therefore decided to first use large memory pages to tune the system's memory usage.

Large memory pages are called Large Pages in older versions of Linux and Huge Pages in current mainstream Linux versions. Below, Huge Pages are used as the example to explain their advantages and how to use them.

What are the benefits of using large memory pages:

1. The page table (Page Table) size is reduced. Each Huge Page corresponds to 2MB of contiguous physical memory, so 12GB of physical memory needs only 48KB of page table, much less than the original 24MB (see the quick calculation after this list).

2. Huge Page memory is locked in physical memory and cannot be swapped out to the swap area. This avoids the performance impact of swapping.

3. Because the number of page table entries is reduced, the hit rate of the TLB in the CPU (which can be thought of as the CPU's cache of the page tables) is greatly improved.

4. The page tables for Huge Pages can be shared between processes, which reduces the Page Table size even further. This actually exposes a shortcoming of Linux's paging mechanism: other operating systems, such as AIX, avoid the problem by sharing one page table for memory such as shared memory segments. For example, in one system maintained by the author the number of connections is usually more than 5000 and the instance's SGA is about 60GB; with the regular Linux paging mechanism, most of the memory in the system would be consumed by page tables.
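The saving claimed in point 1 follows directly from the entry counts; a minimal sketch of the arithmetic (again assuming 8-byte page table entries, as on 64-bit Linux):

awk 'BEGIN {
    sga = 12 * 1024   # SGA size in MB
    printf "4KB pages: %d entries -> %d KB of page table\n", sga*1024/4, sga*1024/4*8/1024  # 24576 KB
    printf "2MB pages: %d entries -> %d KB of page table\n", sga/2,      sga/2*8/1024       # 48 KB
}'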

So how do you enable large memory pages (Huge Pages) for Oracle? The implementation steps are as follows. Since the SGA of the database in this case was adjusted to 18G some time later, 18G is used as the example:

1. Check /proc/meminfo to confirm that the system supports HugePages:

HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

HugePages_Total is the number of large memory pages configured in the system. HugePages_Free is the number of large memory pages that have not yet been touched; "free" here is easily misunderstood and is explained further below. HugePages_Rsvd is the number of pages that have been reserved but not yet used. Hugepagesize is the size of a large memory page, here 2MB; note that on some kernel configurations it may be 4MB.

For example, suppose the system is configured with 11GB of HugePages, the database's SGA_MAX_SIZE is 10GB and SGA_TARGET is 8GB. After the database starts, HugePage memory is reserved according to SGA_MAX_SIZE, here 10GB, so the truly free HugePage memory is 11-10 = 1GB. But since SGA_TARGET is only 8GB, 2GB of the reserved memory will not be touched, so HugePages_Free shows 2+1 = 3GB and HugePages_Rsvd shows 2GB. The amount that can actually be used by other instances is only 1GB, i.e. the real free memory is only 1GB.

2. Decide the number of large memory pages to set. So far, large memory pages can only be used for a few types of memory such as shared memory segments. Once physical memory is set aside as large memory pages, it cannot be used for other purposes, such as a process's private memory, so too much memory must not be configured as large memory pages. We usually use large memory pages only for the SGA of the Oracle database, so the number of large memory pages is:

HugePages_Total = ceil(SGA_MAX_SIZE / Hugepagesize) + N

For example, if the SGA_MAX_SIZE of the database is set to 18GB, the number of pages can be ceil(18*1024/2) + 2 = 9218.

N is added here because the HugePage memory space needs to be slightly larger than SGA_MAX_SIZE, usually by 1-2 pages. Looking at the shared memory segment size with the ipcs -m command shows that the shared memory segment is in fact slightly larger than SGA_MAX_SIZE. If there are multiple Oracle instances on the server, the extra shared memory segments of each instance need to be taken into account, i.e. N needs to be larger. In addition, an Oracle database either uses large memory pages entirely or not at all, so an inappropriate HugePages_Total simply wastes memory.

In addition to calculating from SGA_MAX_SIZE, a more accurate HugePages_Total can be calculated from the shared memory segment sizes obtained with ipcs -m:

HugePages_Total = sum(ceil(share_segment_size / Hugepagesize))
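A small shell sketch of this calculation (it assumes the usual ipcs -m output format, in which the fifth column is the segment size in bytes, and a 2MB Hugepagesize):

# Sum ceil(segment_size / 2MB) over all shared memory segments
ipcs -m | awk '
    $5 ~ /^[0-9]+$/ {                                  # rows whose 5th column is a byte count
        total += int(($5 + 2097152 - 1) / 2097152)     # ceil(bytes / 2MB)
    }
    END { print "suggested vm.nr_hugepages =", total }
'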

3. Modify the /etc/sysctl.conf file by adding the following line:

vm.nr_hugepages=9218

Then execute the sysctl -p command to make the configuration take effect.

Here the parameter vm.nr_hugepages is the number of large memory pages calculated in step 2. After that, check /proc/meminfo; if HugePages_Total is less than the number you set, it means there is not enough contiguous physical memory to allocate these large memory pages, and the server needs to be restarted.
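These actions can be scripted roughly as follows (run as root; the value 9218 is the number calculated in step 2 for this case):

# Persist the setting and apply it without a reboot
echo "vm.nr_hugepages=9218" >> /etc/sysctl.conf
sysctl -p

# Verify how many huge pages the kernel actually managed to reserve
grep -i hugepages_total /proc/meminfo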

4. Add the following lines to the /etc/security/limits.conf file:

oracle soft memlock 18878464
oracle hard memlock 18878464

This sets the amount of memory, in KB, that the oracle user is allowed to lock.

Then reconnect to the database server as the oracle user and run the ulimit -a command; you should see:

max locked memory       (kbytes, -l) 18878464

It is also possible to set memlock to unlimited here.
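A quick way to double-check the limit actually seen by a new oracle session (18878464 KB is the value used above):

# Run from a fresh login shell of the oracle user
ulimit -l        # prints the "max locked memory" soft limit in KB
ulimit -H -l     # prints the hard limit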

5. If the database manages the SGA manually (MANUAL), it needs to be changed to AUTO, i.e. SGA_TARGET set to a value greater than 0. For 11g, because HugePages can only be used for shared memory and not for the PGA, AMM cannot be used, i.e. MEMORY_TARGET cannot be set to a value greater than 0; SGA and PGA must be managed separately, and the SGA can only be managed in AUTO mode.
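A minimal sqlplus sketch of such a change (illustrative only: it assumes an spfile is in use, that a restart is acceptable, and an 18G SGA as in this case; the commented-out line applies only to 11g):

sqlplus -S / as sysdba <<'EOF'
-- enable automatic SGA management (ASMM) for this case's 18G SGA
alter system set sga_max_size = 18G scope=spfile;
alter system set sga_target   = 18G scope=spfile;
-- on 11g only: disable AMM so that HugePages can be used
-- alter system set memory_target = 0 scope=spfile;
shutdown immediate
startup
EOF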

6. Finally, start the database and check /proc/meminfo to see whether HugePages_Free has decreased. If it has, HugePage memory is being used.
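For example, the relevant counters can be compared before and after instance startup with a generic check such as:

# HugePages_Free should drop (and HugePages_Rsvd rise) once the SGA is allocated from HugePages
grep -E 'HugePages_(Total|Free|Rsvd)' /proc/meminfo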

However, when looking at /proc/meminfo on the failed database server, no HugePage-related information was found, and sysctl -a did not show the vm.nr_hugepages parameter either. This is because HugePage support was not compiled into the running Linux kernel; a different kernel has to be used to enable HugePages.

View /boot/grub/grub.conf:

# grub.conf generated by anaconda
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
#         all kernel and initrd paths are relative to /boot/, eg.
#         root (hd0,0)
#         kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#         initrd /initrd-version.img
# boot=/dev/cciss/c0d0
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-8.el5xen) with RDAC
    root (hd0,0)
    kernel /xen.gz-2.6.18-8.el5
    module /vmlinuz-2.6.18-8.el5xen ro root=/dev/VolGroup00/LogVol00 rhgb quiet
    module /mpp-2.6.18-8.el5xen.img
title Red Hat Enterprise Linux Server (2.6.18-8.el5xen)
    root (hd0,0)
    kernel /xen.gz-2.6.18-8.el5
    module /vmlinuz-2.6.18-8.el5xen ro root=/dev/VolGroup00/LogVol00 rhgb quiet
    module /initrd-2.6.18-8.el5xen.img
title Red Hat Enterprise Linux Server-base (2.6.18-8.el5)
    root (hd0,0)
    kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 rhgb quiet
    initrd /initrd-2.6.18-8.el5.img

It turns out that the kernel the system was running had "xen" in its name. We modified this file, changing default=0 to default=2 (or, equivalently, commenting out the first two kernel entries with #), then restarted the database server and found that the new kernel supported HugePages.
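After the reboot, the running kernel and its HugePage support can be confirmed with generic commands such as:

uname -r                       # should now show the non-xen kernel
grep -i huge /proc/meminfo     # HugePages_* lines appear if support is present
sysctl vm.nr_hugepages         # the parameter now exists and can be set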

After large memory pages were enabled for the database, the performance problem described in this article did not occur again, even after the SGA was increased. Looking at /proc/meminfo, the memory consumed by PageTables stayed below 120MB, a reduction of about 4500MB. CPU utilization was also observed to be lower than before HugePages were used, and the system has been running quite stably; at least no bugs caused by the use of HugePages have appeared.

Tests show that for OLTP systems, enabling HugePages on Linux for an Oracle database improves database throughput and response time to varying degrees, by up to more than 10%.

That is all of the content of "How to optimize Oracle Database with large memory pages in Linux". Thank you for reading! I hope it has been helpful.
