
Oracle Memory Management and HugePages (Part 2)


MSMM

Before 10g, the SGA was managed by manually setting a series of parameters, the most important of which are:

● db_cache_size

● shared_pool_size

● large_pool_size

● streams_pool_size

● log_buffer

The SGA is the collective name for this large chunk of memory, which is made up of the components listed above. In versions before 10g, the DBA had to size each SGA component by hand, which set a fairly high bar: you needed to understand the business, the amount of hot data, how SQL was used, whether there were large numbers of SQL statements with differing literal text, and so on, in order to decide how much memory each area should get. Almost every DBA who ran those versions has hit the classic error ORA-04031, which usually means the shared pool ran out of memory; that in turn can have many causes, such as severe shared pool fragmentation, heavy hard parsing, or enormous PL/SQL units. Let's look at how to size the two most important components, the buffer cache and the shared pool, under manual SGA management.

Buffer Cache

The Buffer Cache, commonly known as the block cache, caches copies of data blocks; Oracle tries to keep frequently reused blocks in memory through a modified LRU algorithm. Blocks that a server process reads from disk land in this area, and when data is modified, it is modified here as well. There is no golden rule for sizing the buffer cache; you have to weigh the operating system's memory size, the amount of hot business data, the type of workload, and other factors. Let's first look at how to size the SGA as a whole. The official recommendation is roughly as follows:
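The formula itself was lost from this copy of the article; a commonly cited version of Oracle's sizing guideline, reconstructed here and therefore only an approximation, is:

    usable memory = physical memory × 80%     (leave ~20% to the OS)
    OLTP: SGA ≈ 80% of usable memory, PGA ≈ 20%
    DSS:  SGA ≈ 50% of usable memory or less, PGA ≈ 50% or more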

The shape of this formula is much the same as the one for the PGA: after setting aside 20% of memory for the operating system, the split depends on the system type. In an OLTP system most of the memory goes to the buffer cache; in a DSS system the buffer cache matters less and can be given 50% or less. How much of the SGA to give the buffer cache depends on the actual situation. A DBA can get a rough sense of whether the buffer cache is big enough by looking at its hit ratio. Readers who know me know I am a practitioner of the OWI methodology, and hit-ratio worship should no longer exist in the Oracle world, but as an auxiliary diagnostic the hit ratio still has value: if the system's SQL is well tuned and the hit ratio is still low, it suggests, to some extent, that the buffer cache is too small, or that the data the business touches is very scattered and there is too little hot data.

The tuning approach the author recommends is top-down; a low hit ratio is only a symptom. Does the application issue large numbers of SQL calls it doesn't need to make? Are those SQL statements missing indexes? Once the SQL is tuned, should the DBA consider enlarging the buffer cache? If enlarging it doesn't help, should system memory be added so the buffer cache can grow further? And finally, does the database need a few more disks, or SSDs, to improve performance? Top-down analysis helps you find the real problem and solve it at the lowest cost. If a single SQL statement missing an index drags down the whole database, and the DBA responds by adding disks to expand I/O capacity, the eventual discovery is that a great deal of money treated the symptom rather than the disease, and the problem remains unsolved.

The formula above does not fit every case. In an OLTP system with a very large number of processes in particular, the SGA's share of memory must be reduced further in order to reserve more memory for the PGA.

Shared Pool

Along with the buffer cache, the shared pool is the other of the two largest consumers of Oracle memory. Its structure is very complex, and because it caches SQL text of irregular sizes it tends to accumulate a great deal of memory fragmentation. Broadly speaking, the shared pool has two major parts. The first is the library cache, which caches SQL and PL/SQL code and saves their execution plans, improving parsing efficiency. If your application code never uses bind variables, this area becomes a heavy burden, and the library cache cannot be turned off in Oracle, so for OLTP systems make sure your SQL uses bind variables. The second largest area is the row cache, which caches the database's data dictionary; it is called the row cache because the information in it is stored as rows. Its size depends on the amount of metadata in the database: with hundreds of thousands of objects, this area grows large, and histograms on many table columns also inflate its memory footprint. In a stable system this area is essentially static; few operations in Oracle modify the row cache frequently. One exception is a sequence without the cache attribute: if such a sequence is called heavily, its attribute values are updated constantly, which can lead to row cache lock waits. The fix is to give every sequence a sufficiently large cache value.

> If the application does not use bind variables and is hard to change, you can try setting cursor_sharing to force as a workaround.
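A minimal sketch of the two fixes just mentioned; seq_orders is a hypothetical sequence name:

    -- Workaround for an application that cannot be changed to use bind variables
    ALTER SYSTEM SET cursor_sharing = FORCE;

    -- Give a heavily used sequence a large cache to avoid row cache lock waits
    ALTER SEQUENCE seq_orders CACHE 1000;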

Here is a case from personal experience; an old DBA may not have many tricks, but there is always a story. I once helped a customer with a free buffer waits problem. This wait event generally indicates that the buffer cache is too small, that full table scans are flooding it, or that dirty blocks are being written out too slowly. In the AWR report, free buffer waits ranked first among the top events, taking 70 to 80 percent of DB time, and the same report showed the customer's shared pool occupying nearly 50 GB, whereas an AWR report from a month earlier showed only about 3 GB. So the diagnosis was essentially that an oversized shared pool had squeezed the buffer cache, producing the free buffer waits events. Talking with the customer, I learned the system had been stable before, and the one recent major change was enabling Oracle's flash cache for the database. Searching MOS for the flashcache keyword finally turned up the explanation: in 11.2.0.4, Oracle RAC with flash cache allocates additional memory in the shared pool to store GCS resources, amounting to 208 bytes × the total number of (flash) buffer blocks. So let me remind DBA friends: if you plan to use Oracle Flash Cache, reserve enough shared pool memory for it.
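To get a feel for that overhead, here is a quick calculation under assumed numbers (a hypothetical 200 GB flash cache with an 8 KB block size):

    200 GB / 8 KB = 26,214,400 flash buffer blocks
    26,214,400 × 208 bytes ≈ 5.1 GB of additional shared pool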

ASMM

Oracle 10g introduced ASMM, automatic SGA management, which to some extent relieves the DBA of managing the SGA by hand. ASMM is enabled by setting the parameter sga_target to a non-zero value. In 10gR1 and early 10gR2, however, ASMM was not mature; it had enough bugs to cause problems of its own, so at the time many DBAs stuck with 9i-style manual SGA management on core production systems. With automatic SGA management you no longer need to set a value for each memory component; of course, if you set sga_target and also set parameters such as db_cache_size, those values are treated as minimums.
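A minimal sketch of enabling ASMM, assuming sga_max_size already accommodates an 8 GB target (the sizes are placeholders):

    -- A non-zero sga_target enables ASMM
    ALTER SYSTEM SET sga_target = 8G SCOPE=BOTH;
    -- If also set, db_cache_size becomes a floor, not a fixed size
    ALTER SYSTEM SET db_cache_size = 2G SCOPE=BOTH;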

If sga_target is set to 0 (and memory_target is also 0), management reverts to traditional MSMM. The ORA-04031 errors mentioned above are largely resolved by using ASMM. With ASMM in use, you will find double-underscore parameters such as __db_cache_size in the spfile. These are generated dynamically while the Oracle instance runs and persisted into the spfile. If the instance runs long enough, their values stabilize, which effectively gives you a set of best-practice values tuned to your own environment; after the instance restarts, the values persisted in the spfile are used.

> In 11g you may observe granules moving between SGA pools even when sga_target is 0 and memory_target is 0. This is a new 11g feature, controlled by the parameter _memory_imm_mode_without_autosga.

Granule

ASMM works internally by moving granules between memory components, so a word about granules is in order. The automatic shared memory management introduced in Oracle 10g was designed to answer the questions "How big should I set db_cache_size?" and "How big should I set shared_pool_size?" From 10g onward, Oracle's answer is: "You no longer need to think about this; I will do it for you algorithmically." Simply put, Oracle compares the disk I/O time saved by enlarging the block cache with the parse time saved by enlarging the shared pool to decide whether memory needs to move between the two. To make memory move efficiently between db_cache_size and shared_pool_size, Oracle restructured the SGA in 9i into fixed-size memory blocks called granules. The granule size varies with the operating system, the Oracle version, and the SGA size. You can view the granule size of the current instance with the following SQL statement:
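The original statement was lost in extraction; one way to see it (a query that exists from 10g onward) is:

    SELECT name, bytes
      FROM v$sgainfo
     WHERE name = 'Granule Size';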

AMM

With PGA management automated and SGA management automated, the completionist in Oracle finally delivered AMM, automatic memory management, in 11g. Because large pages cannot be used under AMM, the feature has never been widely adopted. With AMM, the DBA only needs to set the memory_target parameter to configure the memory of the entire database instance. Just as ASMM spawns double-underscore hidden parameters, so does AMM, such as __pga_aggregate_target and __sga_target. Before 11.2.0.2, DBCA even created databases with AMM by default (sometimes Oracle really is bold); from 11.2.0.3 on, DBCA checks the operating system memory and defaults to ASMM above 4 GB and AMM below 4 GB. Likewise, if you enable AMM and also set parameters such as SGA_TARGET, those values are treated as minimums.
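A minimal sketch of switching an instance to AMM; the 6G/8G values are placeholders:

    -- memory_target covers both SGA and PGA; memory_max_target is its ceiling
    ALTER SYSTEM SET memory_max_target = 8G SCOPE=SPFILE;
    ALTER SYSTEM SET memory_target = 6G SCOPE=SPFILE;
    -- memory_max_target is static, so a restart is required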

The biggest problem with AMM is that it cannot use large memory pages; the importance of large pages is covered in detail later in this article.

> See MOS Doc 749851.1 and Doc 1453227.1.

With the arrival of AMM in 11g, if you have been a DBA long enough, you have almost certainly run into the following error:
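The exact output was lost from this copy of the article; the error the text goes on to describe is:

    SQL> startup
    ORA-00845: MEMORY_TARGET not supported on this system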

The message feels baffling: "the MEMORY_TARGET feature is not supported"? In fact, the feature is supported. The error appears because AMM is backed by the operating system's shared memory filesystem, mounted at /dev/shm; if that filesystem is smaller than the value of memory_target, instance startup fails with this error. On mainstream Linux operating systems /dev/shm defaults to half of physical memory. If you hit this problem, either lower memory_target or enlarge the shared memory filesystem:
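A sketch of the fix, with size=8g as a placeholder value that must exceed memory_target:

    # Check the current size of the shared memory filesystem
    df -h /dev/shm

    # Enlarge it on the fly
    mount -o remount,size=8g /dev/shm

    # Make the change persistent via /etc/fstab
    tmpfs  /dev/shm  tmpfs  defaults,size=8g  0 0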

Note that 11g ASM instances use AMM by default, and Oracle strongly recommends not changing that default behavior. The minimum MEMORY_TARGET for ASM is 256 MB; if you set a lower value, Oracle automatically raises it to 256 MB. For ASM releases 11.2.0.3 and 11.2.0.4, Oracle strongly recommends setting memory_target to 1536 MB, a value that has proved sufficient in most environments.

Once a database starts using ASM, the shared_pool of the database instance needs to be re-evaluated. The following rules give the additional shared pool memory required on top of the normal size (see the worked example after the list):

● If the ASM disk group uses external redundancy: start from 2 MB and add 1 MB for every 100 GB of space.

● If the ASM disk group uses normal redundancy: start from 4 MB and add 1 MB for every 50 GB of space.

● If the ASM disk group uses high redundancy: start from 6 MB and add 1 MB for every 33 GB of space.

ASM & Shared Pool (ORA-4031) (document ID 437924.1)

Large Pages

On Linux-like systems, the CPU must translate virtual addresses into physical memory addresses before it can actually access memory. To speed up this translation, the CPU caches the most recently used virtual-to-physical mappings in a table it maintains, the TLB. The full mapping, the page table, is held per process in Linux; to keep memory access as fast as possible, you want as many mappings as possible to stay cached, and an oversized page table sharply reduces the CPU's TLB hit rate. On mainstream Linux operating systems the default page size is 4 KB, so large memory produces an enormous number of page table entries. As noted, Linux page tables are not shared: every process has its own.

Today any host comes configured with tens or hundreds of gigabytes of memory, even terabytes, and running Oracle on it without large pages is asking for disaster, because Oracle is a multi-process architecture in which every connection is a dedicated process. Large memory + many processes + no large pages = catastrophe. In eight years as a DBA, the author has helped at least five customers through system failures caused by not using large pages, all within the past three or four years. Why was the phrase "large pages" rarely heard a few years ago, while today it is a hot topic at every technical conference and in every best-practice guide? Because in recent years Linux has become ubiquitous, big memory has sprung up everywhere, and a database system now easily carries hundreds of connections; together these have pushed large pages into the spotlight.

The benefits of large pages

Let's take a look at the benefits of using large pages:

● Far fewer page table entries, so much less memory spent on page tables

● The SGA is pinned in memory and can never be paged out

● A higher TLB hit rate, reducing kernel-mode CPU consumption

On systems that do not use large pages, page tables of tens or even hundreds of gigabytes are a common sight, and in severe cases the sys portion of system CPU usage is very high. Both are symptoms of running without large pages.

The shortcomings of large pages

● They must be allocated in advance

● They are inflexible; resizing may even require a host reboot

● Over-allocated large pages cannot be used by other programs and are simply wasted

The method of allocating large pages

Configure a reasonable number of 2 MB large pages by adding the vm.nr_hugepages parameter to /etc/sysctl.conf, or set them temporarily by echoing a value into /proc/sys/vm/nr_hugepages. How many to set depends on the SGA configuration of the system; the general recommendation is that the total large-page allocation be greater than the sum of all SGAs on the system plus 2 GB.
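A sketch, assuming a single instance with a 60 GB SGA (60 GB / 2 MB = 30720 pages, padded here to a placeholder value of 31000):

    # Persistent: add to /etc/sysctl.conf, then reload
    echo "vm.nr_hugepages = 31000" >> /etc/sysctl.conf
    sysctl -p

    # Temporary (may not reserve all pages if memory is already fragmented)
    echo 31000 > /proc/sys/vm/nr_hugepages

    # Verify the allocation
    grep Huge /proc/meminfo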

See HugePages on Oracle Linux 64-bit (Doc ID 361468.1). On AIX, page tables are shared between processes, so large pages generally do not need to be configured.

The principle of large pages

The following calculation assumes a 32-bit system with a 4 KB page size.

1) The page directory stores the locations of the page tables. It holds 1024 directory entries, each pointing to one page table; each entry is 4 bytes, so the directory occupies 4 bytes × 1024 = 4 KB.

2) A page table stores the starting physical addresses of memory pages. Each page table entry is also 4 bytes, and each page table holds 1024 entries, so a page table is likewise 4 KB. With up to 1024 page tables, the page tables occupy at most 1024 × 4 KB = 4 MB.

3) Each page table entry points to a 4 KB physical page, so the total physical memory addressable through the page tables is 1024 (page tables) × 1024 (entries per table) × 4 KB (page size per entry) = 4 GB.

4) To translate a virtual address into a physical one, the operating system uses bits 31-22 of the virtual address (10 bits) to index the directory and select one of the 1024 page tables, then bits 21-12 (10 bits) to index that table and select one of its 1024 entries. The entry yields the starting address of a physical page, and bits 11-0 of the virtual address serve as the offset within that 4 KB page; the page address plus the offset is the physical memory address to access.
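A worked example of the split, using the arbitrary virtual address 0x12345678:

    0x12345678 = 0001001000 1101000101 011001111000 (binary)
    directory index  = bits 31-22 = 0x048 (72)
    page table index = bits 21-12 = 0x345 (837)
    page offset      = bits 11-0  = 0x678 (1656)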

In practice a full 4 MB of page tables never appears on a 32-bit system: a process cannot use the entire 4 GB address space, since part of it is reserved, for example for the operating system kernel, and a page table entry is only created when the process first touches a piece of memory, not up front.

Page table memory calculation

On a 32-bit system, a process that touches 1 GB of memory generates about 1 MB of page tables; on a 64-bit system, with 8-byte entries, this grows to 2 MB. It is then easy to calculate that with a 60 GB SGA and 1500 Oracle user processes on 64-bit Linux, the page tables can occupy at most 60 GB × 2 MB per GB × 1500 processes ÷ 1024 ≈ 175 GB. Yes, you read that right: 175 GB! In practice the footprint is usually smaller, perhaps 40% to 50% of that, because a process only adds entries to its own page table for the regions of the SGA it has actually touched.

Version 11.2.0.3

Before version 11.2.0.3, if the allocated large pages were insufficient, Oracle would not use large pages at all at startup. From 11.2.0.3 on, Oracle still tries to use large pages at startup: it consumes all of the configured large pages and takes the shortfall from small pages. This behavior is controlled by the Oracle parameter USE_LARGE_PAGES, explained in more detail later. You can see it clearly in the instance's alert log:

Total Shared Global Region in Large Pages = 1024 MB (85%) means that only 85% of the SGA was placed in large pages, and the RECOMMENDATION section advises adding at least 89 more large pages so the entire SGA can reside in them.
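Those two numbers are roughly consistent, as a quick check shows:

    1024 MB / 0.85 ≈ 1205 MB of total SGA
    (1205 MB − 1024 MB) / 2 MB per page ≈ 90 pages, matching the ~89 recommended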

USE_LARGE_PAGES

This parameter controls how the instance uses large page memory at startup. It had three possible values before version 11.2.0.3; 11.2.0.3 added a fourth, auto:

● true: the default; use large pages as much as possible. Part of the SGA may end up in large pages and part in small pages, in which case the operating system command ipcs -ma may show the shared memory split into multiple segments (such splitting has other possible causes too; enabling NUMA, for example, can also produce it).

● false: do not use large pages.

● only: force the use of large pages; if not enough are available, the instance will not start.

● auto (11.2.0.3): at instance startup, use large pages as much as possible by having the background dism process perform, in effect, echo xxx > /proc/sys/vm/nr_hugepages.
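A minimal sketch of setting the parameter; it is static, so it takes effect at the next restart:

    -- Refuse to start unless the whole SGA fits in large pages
    ALTER SYSTEM SET use_large_pages = 'only' SCOPE=SPFILE;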

The following is the alert log output with the parameter set to auto:

You can see that during instance startup the DISM process is started first, and the large page configuration is completed automatically through it. $ORACLE_HOME/bin/oradism is owned by root, because the grid user's own privileges are not sufficient to perform the large page configuration, i.e. echo xxx > /proc/sys/vm/nr_hugepages.
