2025-01-30 Update
Original title: What must be known and done in the process of Oracle RAC construction. Author: Zhao Hai, a city commercial bank system architect who specializes in banking data center solution planning and design.
Database construction lacks practical standards that can simply be followed; knowledge and experience are scattered everywhere and hard to assemble into a logical reference, so it is easy to be at a loss when a project arrives. Based on the planning, implementation, and configuration optimization of data-center construction, this paper digests a large number of documents and summarizes, from project practice, the matters that deserve attention at every level of database construction. It is a long, in-depth article; consider bookmarking it to read later.
1. Background description
Database construction is a very important link in the building of every enterprise data center and bears directly on business continuity and stability. Yet there are few practical standards to follow during database construction: when faced with the planning, design, and configuration optimization of a whole project, we often do not know where to start, and the knowledge and experience scattered across official documentation does not quickly form a logically organized reference. Through the summary and analysis below, this paper hopes to give practical guidance at every level and so provide a reference for future database construction projects.
2. Key points of storage planning and design
2.1 Reasonable planning of OCR/VOTE disks
What is an OCR/VOTE disk and what role does it play in a cluster?
Under ORACLE RAC ASM management there are usually three disk groups (+DATA, +FRA, +OCR). The data on the disks of the OCR disk group consists of two parts: the Vote Files and the OCR (Oracle Cluster Registry). A Vote File records the disk heartbeat information of the cluster nodes, while the OCR holds the cluster configuration information. A Vote File is stored on an OCR disk as a whole file, without any striping. The figure below illustrates its record structure:
The above is a schematic of the Vote File matrix of a three-node ORACLE RAC cluster. Each row is the information written by one node. In the first row, for example, Instance1 writes the results of its private-network probes of the three cluster members (1, 2, 3) into the arbitration file; Instance2 and Instance3 likewise write their probe results, and together they form the three-node arbitration matrix. When the private network fails and the cluster splits into several isolated island subsets, the cluster uses this file's information to decide which nodes survive. The algorithm follows two very important rules:
1. The subset with the largest number of nodes among the isolated cluster subsets survives.
2. When the isolated subsets obtain equal numbers of arbitration votes, the subset containing the lowest instance number survives.
For the Vote Files themselves, Oracle has another very important rule: a node that can access fewer than a majority of the Vote Files is evicted from the cluster. That is, every surviving node must be able to access at least N/2 + 1 Vote Files, where N is the total number of Vote Files.
From these rules it follows that an even number of voting disks inevitably wastes one disk, so there is no reason to choose an even number. Considering disk fault tolerance next: to survive the loss of at least one disk, we need at least three voting disks. In other words, an OCR disk group should be planned with at least three failure groups, each containing one voting disk.
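The majority rule above can be sketched as a quick calculation. min_votes below is a hypothetical helper, not an Oracle tool; it simply shows why even counts of voting disks waste a disk:

```shell
#!/bin/sh
# Minimum number of vote disks a node must access to stay in the
# cluster: a strict majority of the N configured vote files.
min_votes() {
  n=$1
  echo $(( n / 2 + 1 ))
}

min_votes 3   # prints 2: one disk may be lost
min_votes 4   # prints 3: still only one disk may be lost
min_votes 5   # prints 3: two disks may be lost
```

Going from 3 to 4 disks does not increase the number of losable disks, which is why odd counts (3, 5, ...) are planned.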
The OCR holds the cluster's resource registry information and is a prerequisite for cluster operation, so its high availability must be guaranteed. As configuration data, it follows ORACLE ASM's disk redundancy policies (External, Normal, High); in other words, OCR can keep one, two, or three mirrors within the OCR disk group, with each mirror's data stripes falling into a separate failure group.
To sum up, when planning the OCR disk group, to guarantee fault tolerance of at least one voting disk and a redundant strategy for the OCR data, we should plan at least three disks in the group, each falling into an independent failure group. A disk size above 1 GB is recommended (although the OCR device itself only occupies about 300 MB).
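As a sketch under these rules, the OCR/VOTE disk group could be created with three failure groups of one disk each. The disk paths and group name below are hypothetical placeholders:

```shell
# Run as the grid owner on one node; NORMAL redundancy with three
# failure groups places one vote file in each failure group.
sqlplus -S / as sysasm <<'EOF'
CREATE DISKGROUP OCR NORMAL REDUNDANCY
  FAILGROUP fg1 DISK '/dev/asm-ocr1'
  FAILGROUP fg2 DISK '/dev/asm-ocr2'
  FAILGROUP fg3 DISK '/dev/asm-ocr3'
  ATTRIBUTE 'compatible.asm' = '11.2';
EOF
```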
2.2 Design of storage external redundancy architecture
First, multipath software must be used to manage the Lun paths, with the link failover policy set to load-balancing mode. Eight links is the ideal number, and these eight links must be redundant at five levels: fiber port, fiber HBA, access switch, core switch, and storage controller. For example, two dual-port fiber HBAs connect through their own access switches to two different core switches, and the core switches connect to the front-end ports of the two storage controllers. In the fiber switch zone configuration, each host fiber port WWN is placed in a zone with storage front-end port WWNs at a 1:2 port ratio, for a total of 8 zones. Schematically it looks as follows:
2.3 Storage configuration parameters for NFS architecture
The ASM disks of Oracle RAC can be Luns provided by a networked storage architecture; a file system or raw devices can of course also serve as the database's storage resource. When mounting NFS volumes, however, several parameters deserve our attention.
1) Hard/soft: when an application process issues a request, a Hard mount does not report errors to the application immediately; the client retries in the background until the operation succeeds, which blocks the application process. With a Soft mount, the client returns the error to the application once its retries are exhausted. In this sense Soft is more responsive than Hard, but on an unstable network Soft can corrupt application data. This is why Oracle recommends setting this parameter to Hard.
2) Rsize/Wsize: the maximum number of bytes the client reads from / writes to the server per request. If not set, it is negotiated between client and server. A fixed 32768 is generally recommended.
3) Timeo: the retransmission timeout in tenths of a second; 600 (60 seconds) is recommended.
4) Intr/nointr: whether interrupt signals may abort a file operation; generally set to nointr.
5) Noac/ac: with ac, the client caches file attribute information, improving client read performance. With noac, the client does not cache attributes, so every read reflects the real-time state of the file on the NFS server (with ac, the client only periodically rescans the server and otherwise reads its own cache). As a database storage disk, an NFS volume only needs to reflect the true, current state of its files, and client-side caching is unnecessary because the database has its own caching mechanism. Oracle therefore generally recommends setting this parameter to Noac.
Of course, these parameters vary with the characteristics of different operating systems. Table 2.3, excerpted from the official NFS storage best-practices parameter table published by Oracle, provides a general reference.
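For Linux clients, a commonly cited fstab entry reflecting the options above looks as follows. The server name and mount point are placeholders, and actimeo=0 is the Linux spelling for disabling attribute caching (equivalent in effect to noac); always cross-check against Oracle's per-OS table:

```shell
# /etc/fstab -- NFS volume holding Oracle datafiles (one line per volume)
nfs-server:/export/oradata  /u02/oradata  nfs  rw,bg,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,timeo=600,actimeo=0  0 0
```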
2.4 ASM disk group planning
(1) Disk group planning
In addition to the OCR disk group, it is generally recommended to create no more than two disk groups: a data disk group (+DATA) for the data and a flash recovery area disk group (+FRA) for the logs. Assuming Normal redundancy is chosen, an even number of disks is recommended, at least four, all of the same size and performance. This provides two-way redundancy on the one hand and, on the other, spreads the stripes across the failure groups to secure the disk groups' read/write performance. For other redundancy strategies, choose the number of disks along the same lines. In addition, a Lun should not exceed 2 TB (larger LUNs can trigger ORA-15196 and ORA-15099).
(2) Disk allocation unit and file striping
AU is the allocation unit of disk space in an ASM disk group. A stripe is striping at the file level; to be exact, it operates on a file's extents and is the unit into which a file is cut. It has two modes (coarse & fine). In coarse mode, the extent size equals the AU size and the corresponding parameters are fixed (_asm_stripesize=AU, _asm_stripewidth=1). In fine mode, the stripe size can be adjusted to business needs. For example, if it is set to 256K, a 1 MB file that was previously written into one AU on a single disk can now be written in parallel across four AUs on different disks, exploiting the parallel read/write performance of small IOs. For database services with large IOs, the AU can be raised to 4 MB and the operating system's large-page read/write parameters enabled, with file extents kept in coarse mode. For typical OLTP services, data files and archive files are generally set to coarse, while redo logs, control files, and flashback logs are set to fine. From Oracle 11g onward these parameters rarely need active adjustment unless a related performance problem appears.
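As an illustration of the large-IO case, a data disk group could be created with a 4 MB AU as below. The disk paths are hypothetical, and the au_size attribute requires compatible.asm 11.1 or later:

```shell
# Run as the grid owner; coarse striping remains the default for datafiles.
sqlplus -S / as sysasm <<'EOF'
CREATE DISKGROUP DATA EXTERNAL REDUNDANCY
  DISK '/dev/asm-data1', '/dev/asm-data2', '/dev/asm-data3', '/dev/asm-data4'
  ATTRIBUTE 'au_size' = '4M', 'compatible.asm' = '11.2';
EOF
```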
2.5 ASM memory management parameters
(1) Memory parameters
db_cache_size: buffer cache used to store metadata blocks. The recommended value is 64 MB.
shared_pool: memory pool required to manage the ASM instance. The recommended value is 128 MB.
large_pool: used to store extent maps. The recommended value is 64 MB.
(2) Other parameters
In 11g, if multiple databases share one ASM instance, the processes setting is recommended to be calculated by the following rule:
ASM processes = 25 + (10 + max(possible concurrent datafile changes)) * number of databases. Of course, this number requires empirical assessment: it should be estimated from the cluster's database environment and a judgment of the business IO.
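The sizing rule can be sketched as simple arithmetic. asm_processes is a hypothetical helper, and the inputs are illustrative estimates, not measurements:

```shell
#!/bin/sh
# processes = 25 + (10 + max concurrent datafile changes) * number of DBs
asm_processes() {
  n_db=$1          # databases sharing the ASM instance
  max_changes=$2   # estimated max concurrent datafile changes
  echo $(( 25 + (10 + max_changes) * n_db ))
}

asm_processes 4 5   # prints 85 for 4 databases and ~5 concurrent changes
```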
2.6 Asynchronous IO configuration
Generally speaking, database deployments should enable asynchronous IO to improve the database's IO performance. Both the operating system's asynchronous IO parameter and the database's asynchronous IO parameters must be turned on. On Linux, set aio-max-nr=1048576 at the operating system level (4194304 on 11g); it limits the number of asynchronous IO requests that can be outstanding at the same time. Then set two parameters at the Oracle level: filesystemio_options=setall and disk_asynch_io=true. For AIX, three parameters (aix_maxservers, aix_minservers, aio_maxreqs) need to be set; for OLTP workloads, IBM's official recommendation is (800, 200, 16384).
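On Linux the settings above amount to one kernel parameter and two instance parameters. This is a sketch; verify the exact values against the notes for your release:

```shell
# /etc/sysctl.conf (apply with: sysctl -p)
#   fs.aio-max-nr = 1048576      # 4194304 on 11g

# Instance side, from SQL*Plus:
#   ALTER SYSTEM SET filesystemio_options = SETALL SCOPE = SPFILE;
#   -- disk_asynch_io defaults to TRUE; verify with:
#   SHOW PARAMETER disk_asynch_io
```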
The values above are only a general reference; the concrete settings still need to be evaluated against data from your own environment. For example, watch the IO wait in iostat and the aio-related indicators to judge whether the chosen values are appropriate.
2.7 ASMLib & Udev
For the Linux platform, there are three ways to manage ASM disks for Oracle RAC (ASMLib, DM, udev); ASMLib is the preferred way here. For RHEL 6 (starting with 6.4), Red Hat ships the kernel driver package 'kmod-oracleasm', which can be installed through the "RHEL Server Supplementary (v. 6 64-bit x86_64)" channel on Red Hat Network (RHN). Updates to this module are provided by Red Hat.
In the case of ASMlib, it creates an ASM disk with the following command:
# /usr/sbin/oracleasm createdisk disk_name device_partition_name
The ASM disk name (disk_name) created this way is bound only to the device_partition_name behind it, so we must ensure the operating system never changes that device name during future Lun changes. If third-party multipath management software is in use, we need it to associate the disk's device_partition_name with the disk's unique ID. For example, with EMC PowerPath, emcadm export/import can be used to keep Lun names consistent across the RAC nodes.
For the udev approach, we need to associate the scsi-id of the disk with the resulting asm disk name, rather than the device name that the disk displays in the operating system. For example:
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="36000c29a972d7d5fe0bf683b21046b34", NAME="asmgrid_disk1", OWNER="grid", GROUP="asmadmin", MODE="0660"
The PROGRAM field is very important: it defines how we tie the operating-system disk to the ASM disk. If device names change when disks are added or removed or the server is rebooted during later operation and maintenance, the ASM disk labels become disordered and ultimately the cluster database cannot start. Admittedly, this problem no longer exists from 11g onward, because ASM then matches its disks by the disk's unique ID and nodes read and write according to the disk's ASM label. From a management standpoint, however, best practice should still be followed so that no such risk exists.
2.8 Storage parameters of concern on the AIX platform
For AIX platforms, the system parameters for storage volumes must follow the rules below.
(1) reserve_lock, reserve_policy
Both parameters indicate whether the operating system holds a reservation lock on the storage volume. If the storage array type is SSA, FAStT, or another non-MPIO-capable disk, set the parameter as in A. If it is ESS, EMC, HDS, CLARiiON, or another MPIO-capable disk, set it as in B.
A. # chdev -l hdiskn -a reserve_lock=no
B. # chdev -l hdiskn -a reserve_policy=no_reserve
(2) Before a disk joins an ASM disk group, its header PVID information must be cleared; otherwise disk errors such as ORA-15063, ORA-15040, or ORA-15042 will result.
(3) fc_err_recov.
This parameter controls the failover time for read/write errors when a fiber link breaks on the AIX platform. By default, such a failover can leave database IO hanging for 10 minutes, and if a Vote disk is affected it causes a cluster restart. To avoid this, set the parameter to fast_fail so that failover happens quickly.
(4) max_transfer.
This parameter is recommended to set at least the IO size of the maximum request of Oracle, which is generally more than 1m.
(5) queue_depth.
This parameter represents the maximum IO queue depth of the Lun, and this parameter must be set to support the database concurrent read and write load.
(6) max_xfer_size.
This parameter represents the maximum transmission size of the optical fiber card. The setting of this parameter must maintain a multiple relationship with the disk throughput parameters, and must be greater than the disk setting parameters.
(7) num_cmd_elems.
This parameter represents the maximum number of IO requests accepted by the optical fiber card. This parameter also has a multiple relationship with the queue_depth of the disk. The specific value depends on the number of Lun contained by the optical fiber card in the environment.
3. Key points of network planning and design
3.1 Hardware and parameters
Officially, Oracle recommends 10 Gigabit Ethernet first, and Gigabit Ethernet at a minimum; if the load is very heavy, InfiniBand can be used for the private network. Of course, this depends on the specific business volume and load of the customer's production environment; it is only a reference, and the recommended configuration should be adopted where possible. A switch is required for the private network: Oracle cluster installation does not support a directly cabled private interconnect. The duplex and rate settings of the NICs and the switch must be correct and consistent.
3.2 Network card binding
Each platform has its own NIC bonding tools, offering both load-balancing and active/standby modes. First, to improve the availability of the public and private networks, the NICs need to be bonded. On Linux, the bonding policy is controlled by the mode option in the configuration file /etc/modprobe.d/dist.conf:
mode=0 (balance-rr): round-robin policy.
mode=1 (active-backup): active/standby policy.
mode=2 (balance-xor): XOR policy.
mode=3 (broadcast): broadcast policy.
mode=4 (802.3ad): IEEE 802.3ad dynamic link aggregation.
mode=5 (balance-tlb): adaptive transmit load balancing.
mode=6 (balance-alb): adaptive load balancing.
For the private network, bonding modes 3 and 6 can lead to ORA-600, while mode 6 on the public network triggers bug 9081436. As to which mode to choose: when the platform version is low or the network architecture is complex, active/standby mode is recommended, because it is more stable and does not produce inconsistent packet paths. In load-balancing mode, if the network parameters are not set properly, a message can easily leave through one physical NIC while the reply returns through another; once the link passes through a firewall rule, this readily causes packet loss.
For the AIX platform, set the mode parameter to NIB or Standard. Standard chooses the physical NIC for each outgoing message by destination IP address; this IP-based load balancing does not suffer from the packet-loss problem above.
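On Linux, the active/standby binding recommended above might be configured as in this fragment. Interface names and the address are placeholders:

```shell
# /etc/modprobe.d/dist.conf -- active-backup (mode=1) for the interconnect
#   alias bond0 bonding
#   options bond0 mode=1 miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
#   DEVICE=bond0
#   IPADDR=192.168.10.11
#   NETMASK=255.255.255.0
#   BOOTPROTO=none
#   ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth2 (likewise for eth3)
#   DEVICE=eth2
#   MASTER=bond0
#   SLAVE=yes
#   BOOTPROTO=none
#   ONBOOT=yes
```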
3.3 SCAN
Oracle RAC added the SCAN (Single Client Access Name) feature in 11gR2.
SCAN is a domain name that resolves to at least 1 and at most 3 SCAN IPs, through which clients access the database. The SCAN IPs must be in the same subnet as the public IPs and VIPs. With SCAN enabled, a virtual service layer sits between the database and the clients: the SCAN IPs and SCAN IP listeners. A client only needs the tns entry for the SCAN and connects to the back-end cluster database through a SCAN IP listener. Adding or deleting cluster nodes therefore no longer affects clients, and no client configuration changes are needed. For SCAN-related setup, there are some considerations:
(1) The default gateway of the host must be on the same subnet as SCAN and VIP.
(2) It is recommended to resolve the SCAN name through DNS to at least 3 IP addresses (11gR2 and later), regardless of cluster size.
(3) To avoid name-resolution problems: if we configure three SCAN addresses, the SCAN record must not appear in the hosts file, because hosts-file records are resolved statically, which conflicts with dynamic DNS resolution.
3.4 Network parameters
The kernel parameters of the network on the operating system platform are very important, which directly determine the stability and performance of private and public network data transmission. However, for different operating systems, the relevant parameter settings are also different.
1.Linux
Among the Linux kernel network parameters, the socket buffer sizes (net.core.rmem_default, net.core.rmem_max, net.core.wmem_default, net.core.wmem_max) matter most. The relevant parameters are explained as follows:
net.ipv4.conf.eth#.rp_filter: reverse path filtering of packets on the interface.
net.ipv4.ip_local_port_range: represents the range of IPv4 ports that the application can use.
net.core.rmem_default: the default value that represents the size of the socket receive buffer.
net.core.rmem_max: represents the maximum size of the socket receive buffer.
net.core.wmem_default: the default value that represents the size of the socket send buffer.
net.core.wmem_max: represents the maximum size of the socket send buffer.
To achieve better network performance, the socket buffer parameters above need to be raised from their defaults to 2-3 times or even higher, according to the specific situation, and reverse path filtering should be disabled (0) or set to loose mode (2).
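Put together, a sysctl fragment following this guidance might look like the following. The sizes are those commonly seen in Oracle's Linux installation guides, not mandates, and the interface names are placeholders:

```shell
# /etc/sysctl.conf -- socket buffers raised well above the defaults
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
# Loose-mode reverse path filtering on the interconnect NICs
net.ipv4.conf.eth2.rp_filter = 2
net.ipv4.conf.eth3.rp_filter = 2
```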
2.AIX
For the kernel parameters of the AIX platform, the following settings are the best configuration extracted from the official Oracle documentation:
tcp_recvspace = 65536
tcp_sendspace = 65536
udp_sendspace = (db_block_size * db_multiblock_read_count) + 4096
udp_recvspace = 655360
rfc1323 = 1
sb_max = 4194304
ipqmaxlen = 512
The first and second parameters set the TCP window size, the third and fourth the UDP window size. rfc1323 enables the window scaling and timestamps specified by RFC 1323 (TCP extensions for high performance). Window scaling allows TCP window sizes (tcp_recvspace and tcp_sendspace) larger than 64 KB (65536) and is commonly used on large-MTU networks. The default is 0 (off); to set tcp_sendspace and tcp_recvspace above 64 KB, change this value to 1 first. ipqmaxlen specifies the number of received packets that may be queued in the IP protocol input queue. sb_max specifies the maximum buffer size allowed for a TCP or UDP socket.
3.5 Security configuration items
1. The firewall on the Linux platform needs to be turned off, otherwise it causes communication problems on the public or private network:
# service iptables stop
# chkconfig iptables off
2. The SELinux security configuration on the Linux platform needs to be disabled; the configuration file is /etc/selinux/config:
SELINUX=disabled
3. On the AIX platform under the PowerVM virtualization architecture of Power Systems hosts, if heavy packet loss is found between the Oracle RAC nodes, or the following events appear:
Cache Fusion "block lost"
IPC Send timeout
Instance Eviction
SKGXPSEGRCV: MESSAGE TRUNCATED user data nnnn bytes payload nnnn bytes
then check the patch level of the VIOS partition's operating system. If APAR IZ97457 is missing, apply it; the patch and its detailed description can be found on the IBM website.
3.6 General considerations
1. System hostnames, domain names, and similar configurations must not contain underscores.
2. NIC names must be identical on all nodes (for example, if the public network uses eth0 on node 1, it must also use eth0 on node 2, and likewise for the private network NICs).
3. NIC device names must not contain "." or other special characters.
4. The private network address should comply with RFC 1918 and use one of its Class A/B/C private ranges; otherwise bug 4437727 will occur. Class A: 10.0.0.0-10.255.255.255 (10/8 prefix); Class B: 172.16.0.0-172.31.255.255 (172.16/12 prefix); Class C: 192.168.0.0-192.168.255.255 (192.168/16 prefix). The mapping between the private network VLAN and the non-routable subnet above must be 1:1 to avoid bug 9761210.
5. From 11gR2, the private network segment must support multicast, because the private network communicates via multicast.
3.7 send (tx) / receive (rx)
During the transmission of a UDP packet, the receiving process reads the check value of the packet header. Any corruption of the check value will cause the packet to be discarded and cause a retransmission, which increases CPU usage and slows packet processing.
Checksum errors can be caused by checksum offloading enabled on the NIC. If such a problem occurs, check whether checksum offloading is in effect and, after testing, consider turning the feature off on the NIC. On Linux, run ethtool -K eth0 rx off tx off (substituting the interconnect interface for eth0) to disable it.
3.8 MTU
Mismatched MTU size settings can cause "packet too big" errors and packet loss during transmission, resulting in global cache block loss and a large number of retransmission requests. And inconsistent MTU values in the private network will cause nodes to be unable to join the cluster.
For Ethernet, the default on most UNIX platforms is 1500 bytes. All devices on the private network link should define the same MTU; confirm and monitor every device in the path. Specify a large, non-default size for the ICMP probe packets of the ping, tracepath, and traceroute commands to check for MTU inconsistencies. Set the appropriate MTU on the server NICs using ifconfig or a vendor-recommended tool.
Jumbo Frames are not an IEEE standard. A single Jumbo Frame is about 9000 bytes; the exact frame size depends on the network equipment vendor and may differ across devices. If the default MTU is not 9000 bytes, make sure every device in the communication path (switches, network devices, NICs) supports a uniform MTU value, and configure that Frame Size (MTU Size) on each of them. Inappropriate MTU settings, for example MTU=1500 on the switch but MTU=9000 on the server's private NIC, cause packet loss and packet fragmentation and reassembly errors, which can lead to severe performance problems and abnormal node outages. On most platforms, fragmentation and reassembly errors show up in the 'IP stats' section of netstat -s output, and the frame size can be found with ifconfig -a. For the switch side, consult the switch vendor's documentation.
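A quick way to probe for MTU mismatches is a non-fragmenting ICMP packet sized to the expected MTU. The host and interface names below are placeholders:

```shell
# 9000-byte jumbo MTU minus 28 bytes of IP/ICMP headers = 8972 payload.
# If any device in the path has a smaller MTU, this ping fails.
ping -c 2 -M do -s 8972 racnode2-priv

# Frame size currently configured on the interconnect NIC:
ifconfig eth2

# Look at the IP statistics section for fragment/reassembly errors:
netstat -s
```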
4. Key optimization items at the operating system layer
4.1 compatibility check
Before the construction and implementation of Oracle, the compatibility of the relevant database technologies to be adopted is checked according to the operating system platform. The following link is the official Matrix for Linux platform and Unix platform respectively:
http://www.oracle.com/technetwork/database/clustering/tech-generic-linux-new-086754.html
http://www.oracle.com/technetwork/database/clustering/tech-generic-unix-new-166583.html
4.2 platform version and patches
After we have selected the specific operating system platform and the specific database version, the next thing we need to do is to check whether our system patches and related software packages are complete and accurate according to the official documentation:
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=184026698780346&parent=DOCUMENT&sourceId=1526555.1&id=169706.1&_afrWindowMode=0&_adf.ctrl-state=bjsizj5t_240
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=184190072886338&parent=DOCUMENT&sourceId=1526555.1&id=1393041.1&_afrWindowMode=0&_adf.ctrl-state=bjsizj5t_338
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=184287678146090&parent=DOCUMENT&sourceId=1526555.1&id=282036.1&_afrWindowMode=0&_adf.ctrl-state=bjsizj5t_436
Oracle official list of recommended patches for all database clusters, RDBMS, etc.:
https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=202212137729856&parent=DOCUMENT&sourceId=1526083.1&id=756671.1&_afrWindowMode=0&_adf.ctrl-state=yi6z8ecqc_839#r11204
In addition, one thing must not be overlooked on the AIX platform: ensure that HACMP is not installed and that no residual traces of it remain.
4.3 time synchronization setting item
Before 11gR2, NTP is required to synchronize time between nodes. From 11gR2 onward, either NTP or CTSSD (Cluster Time Synchronization Service Daemon) can be used instead. If NTP is enabled, CTSSD automatically runs in observer mode alongside it. NTP, however, must run in slew mode (started with the -x option). The following configuration can serve as a reference:
1. / etc/sysconfig/ntpd
# Drop root to id 'ntp:ntp' by default.
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
# Set to 'yes' to sync hw clock after successful ntpdate
SYNC_HWCLOCK=no
# Additional options for ntpdate
NTPDATE_OPTIONS=""
4.4 ASLR (Address Space Layout Randomization)
ASLR is a feature enabled by default on RHEL 5 and later. It is a computer security technique used to mitigate buffer overflows: by randomly arranging the locations of a program's key data areas, including the executable, heap, stack, and shared libraries, it prevents attackers from reliably jumping to specific exploitable functions in memory and makes target addresses harder to predict. In Oracle's process management, multiple processes share the same shared-memory address. With ASLR enabled, Oracle cannot guarantee the availability of that shared memory, which causes ORA-00445 errors. To disable the feature on Linux, add or modify the following two parameters in /etc/sysctl.conf:
kernel.randomize_va_space=0
kernel.exec-shield=0
4.5 HugePage
Most operating systems manage memory in segments or pages. Segmentation is coarse-grained; paging is fine-grained and avoids wasting memory space. Correspondingly there are physical and virtual memory addresses, and the CPU must translate a virtual address into a physical one before it can actually access memory. To speed up this translation, the CPU caches the most recently used virtual-to-physical mappings in a mapping table it maintains. To keep memory access as fast as possible, as many mappings as possible should be held in this table.
The memory management of Linux adopts the paging access mechanism. In order to ensure that the physical memory can be fully utilized, the kernel will automatically swap the infrequently used pages in the physical memory to the virtual memory according to the LRU algorithm at the appropriate time, and keep the frequently used information to the physical memory.
By default Linux uses 4 KB pages, so on a machine with a lot of physical memory the mapping table holds a huge number of entries, which hurts CPU lookup efficiency. Since memory size is fixed, the only way to reduce the number of entries is to increase the page size. That is why HugePages exist: instead of the traditional small pages, large pages of 2 MB and more are used, dramatically reducing the number of mapping entries. Systems with a large amount of physical memory (more than 8 GB), whether running a 32-bit or a 64-bit operating system, should use HugePages.
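As a rough illustration of how much the mapping table shrinks, the following sketch (example value of 8 GB of RAM; the 2 MB hugepage size is the common x86_64 default and should be confirmed against Hugepagesize in /proc/meminfo) compares the entry counts:

```shell
# Illustrative only: page-table entries needed to map 8 GB of memory
# with 4 KB pages versus 2 MB hugepages.
mem_kb=$(( 8 * 1024 * 1024 ))    # 8 GB expressed in KB
small=$(( mem_kb / 4 ))          # entries with 4 KB pages
large=$(( mem_kb / 2048 ))       # entries with 2 MB hugepages
echo "4KB pages: $small entries, 2MB pages: $large entries"
```

With 4 KB pages, 8 GB needs 2097152 entries; with 2 MB hugepages, only 4096.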
So how do you make an Oracle database use HugePages? The main steps are:
1. Set the memlock value (in KB) in /etc/security/limits.conf to a value slightly below physical memory.
2. On 11g and later, AMM is enabled by default, but AMM is incompatible with HugePages, so AMM must be turned off first.
3. Edit /etc/sysctl.conf and set the vm.nr_hugepages parameter to the required value.
4. Stop the instance, restart the OS, and check whether the settings took effect with the following command:
# grep HugePages /proc/meminfo
HugePages_Total: 1496
HugePages_Free: 485
HugePages_Rsvd: 446
HugePages_Surp: 0
5. Use the following parameter (11gR2 and later) to control how the database uses HugePages:
use_large_pages = {true/only/false/auto}
The default is true: if HugePages are configured, the SGA uses as many of them as it can. false means the SGA will not use HugePages. only means the instance will refuse to start if there are not enough HugePages. auto triggers the oradism process to reconfigure the Linux kernel to increase the number of HugePages. true is the usual setting.
SQL > alter system set use_large_pages=true scope=spfile sid='*'
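Step 3 above requires choosing a value for vm.nr_hugepages. A minimal sizing sketch, assuming a 2 MB hugepage size (confirm Hugepagesize in /proc/meminfo) and an example 12 GB SGA:

```shell
# Hypothetical sizing sketch: estimate vm.nr_hugepages for a planned SGA.
sga_mb=12288            # planned SGA_MAX_SIZE in MB (example value)
hugepage_kb=2048        # typical x86_64 hugepage size in KB
# round up: pages = ceil(sga_kb / hugepage_kb)
pages=$(( (sga_mb * 1024 + hugepage_kb - 1) / hugepage_kb ))
echo "vm.nr_hugepages = $pages"
```

For a 12 GB SGA with 2 MB pages this yields 6144 hugepages; add a small margin in practice, since each instance's SGA on the host must fit.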
4.6 Transparent HugePage
Transparent HugePages and the standard HugePages described above are both features the operating system introduced to reduce the resources consumed by page-table translation. Although Oracle recommends HugePages to improve database performance, it also recommends disabling Transparent HugePages. The difference lies in the allocation mechanism: standard HugePages are pre-allocated, while transparent ones are allocated dynamically.
For a database, this dynamic allocation can cause severe performance problems under heavy system load, as detailed in Oracle support note 1557478.1.
So how do you disable the system's transparent hugepage management? Simply run:
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
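Note that this echo does not survive a reboot. One common approach, sketched below under the assumption of a RHEL-style system (on RHEL 6 the path is /sys/kernel/mm/redhat_transparent_hugepage/enabled instead), is to repeat it from /etc/rc.local:

```shell
# Persist the THP setting across reboots (path varies by distro/kernel).
cat >> /etc/rc.local <<'EOF'
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
EOF
```

Verify after reboot with `cat /sys/kernel/mm/transparent_hugepage/enabled` — the bracketed value is the active one.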
4.7 vm.min_free_kbytes
This Linux kernel parameter controls the minimum amount of free memory the kernel keeps in reserve. Oracle recommends raising it to 512 MB: this helps the kernel reclaim memory faster and reduces pressure when memory is tight.
4.8 AIX virtual memory parameters
This section concerns AIX memory management, which differs from Linux's mechanism by dividing memory into computational and non-computational (file cache) memory. IBM's recommended values for Oracle are the following:
minperm% = 3
maxperm% = 90
maxclient% = 90
lru_file_repage = 0
lru_poll_interval = 10
strict_maxperm = 0
strict_maxclient = 1
page_steal_method = 1
minperm% and maxperm% set the lower and upper bounds for file pages in non-computational memory; maxclient% caps client pages in non-computational memory. lru_file_repage=0 means the page-replacement daemon no longer consults its internal repage table to decide which type of page to steal. strict_maxperm and strict_maxclient control whether the file-page and client-page caps are hard limits regardless of free memory. page_steal_method selects the page-stealing policy: 0 scans all pages, 1 scans only non-computational persistent pages. With the values above, persistent pages are bounded between 3% and 90%; client pages are capped at 90%; the persistent-page cap is non-strict while the client-page cap is strictly enforced; and the combination of lru_file_repage=0 and page_steal_method=1 means that when lrud looks for free pages it steals only persistent file pages rather than expensive computational memory.
vmm_klock_mode=2
This parameter controls kernel page locking: 0 means no locking, so kernel pages may be wrongly swapped out and cause page faults; 1 locks some kernel pages; 2 locks all kernel pages. IBM strongly recommends setting it to 2 in an Oracle RAC environment, or when EMC storage serves as the system's swap device.
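As a sketch, the AIX VMM tunables listed above could be applied persistently with vmo (these are the standard AIX commands, but verify each tunable with `vmo -L` first — some, such as lru_file_repage, are restricted or removed on newer AIX levels):

```shell
# Hypothetical sketch: applying the recommended VMM tunables on AIX.
# -p makes each change persistent across reboots.
vmo -p -o minperm%=3 -o maxperm%=90 -o maxclient%=90
vmo -p -o lru_file_repage=0 -o lru_poll_interval=10
vmo -p -o strict_maxperm=0 -o strict_maxclient=1 -o page_steal_method=1
vmo -p -o vmm_klock_mode=2
```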
4.9 Parameter adjustment in PowerVM environment
In a PowerVM environment, the Hypervisor provides a great deal of virtualization function, most of it designed for flexibility and scalability. When running Oracle RAC on such a platform, several key points deserve attention.
1. CPU folding
Virtual processor folding: when system load is low, AIX automatically puts some virtual processors to sleep to reduce Hypervisor overhead and improve overall PowerVM platform performance. But when database load changes very quickly, the speed at which CPUs are folded and unfolded can hurt database or even system performance, in severe cases hanging the system. Below is IBM's fix for this bug, mainly for AIX 5.3 and 6.1:
https://www-01.ibm.com/support/docview.wss?uid=isg1fixinfo105201
4.10 Linux kernel parameters
The following are Oracle's officially recommended baseline kernel parameters for the Linux platform:
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 4194304
net.core.rmem_max = 4194304
net.core.wmem_default = 4194304
net.core.wmem_max = 4194304
kernel.shmmax: one of the most important kernel parameters; it defines the maximum size of a single shared memory segment. It should be large enough to hold the entire SGA in one segment; too low a value can force multiple shared memory segments to be created, which may degrade system performance.
The degradation mainly occurs at instance startup and server-process creation: with multiple small shared memory segments, several virtual address ranges must be created at startup, and each new process pays a small cost to "recognize" the multiple segments. Other operations are unaffected.
On 64-bit Linux the preferred maximum is physical memory minus 1 byte, and a value above half of physical memory is recommended; in any case it should exceed SGA_MAX_SIZE. For example, with 12 GB of physical memory, set shmmax to 12*1024*1024*1024-1 = 12884901887, and the SGA is guaranteed to fit in a single shared memory segment.
kernel.shmall: this parameter controls the total number of shared memory pages that may be used. The Linux shared memory page size is 4 KB, and segment sizes are integral multiples of the page size. To allow 16 GB of shared memory on a 64-bit system, the page count must be 16 GB / 4 KB = 16777216 KB / 4 KB per page = 4194304 pages, so set kernel.shmall = 4194304 (about double the baseline of 2097152). You can then raise shmmax to 16 GB and set SGA_MAX_SIZE and SGA_TARGET to 12 GB (or another value such as 2-14 GB), coordinating with the PGA parameters and other memory consumers such as the OS — do not size the SGA too close to the full 16 GB.
kernel.shmmni: the maximum number of shared memory segments. The default of 4096 is generally sufficient.
fs.file-max: the maximum number of file handles the system allows, i.e. how many files can be open on the Linux system.
fs.aio-max-nr: caps the number of concurrent outstanding asynchronous I/O requests; it should be set high enough to avoid I/O subsystem failures.
kernel.sem: taking kernel.sem = 250 32000 100 128 as an example: 250 is semmsl, the maximum number of semaphores per semaphore set; 32000 is semmns, the maximum number of semaphores system-wide; 100 is semopm, the maximum number of operations a single semop() call may perform on a semaphore set; 128 is semmni, the maximum number of semaphore sets in the system.
net.ipv4.ip_local_port_range: the range of IPv4 ports available to applications.
net.core.rmem_default: the default socket receive buffer size.
net.core.rmem_max: the maximum socket receive buffer size.
net.core.wmem_default: the default socket send buffer size.
net.core.wmem_max: the maximum socket send buffer size.
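The shmmax/shmall arithmetic described above can be sanity-checked with a short sketch (illustrative values; set mem_gb to the host's actual RAM, and note the assumption of a 4 KB page size):

```shell
# Hypothetical sketch: derive kernel.shmmax / kernel.shmall for a 16 GB host,
# assuming 4 KB pages and that the whole SGA should fit in one segment.
mem_gb=16
page_kb=4
shmmax=$(( mem_gb * 1024 * 1024 * 1024 - 1 ))   # bytes: physical RAM - 1
shmall=$(( mem_gb * 1024 * 1024 / page_kb ))    # pages covering all RAM
echo "kernel.shmmax = $shmmax"
echo "kernel.shmall = $shmall"
```

For 16 GB this gives shmmax = 17179869183 and shmall = 4194304, matching the worked example in the text.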
5. Key points of cluster-layer configuration
5.1 diagwait
When the cluster evicts a node and that node reboots, the operating system needs time to write a crash dump; diagwait is how long the cluster waits for that dump. For 10gR2 and 11gR1, best practice on all platforms is to set the CSS diagwait parameter to 13. From 11gR2 onward this setting is no longer required.
5.2 ORA_CRS_HOME
Oracle's official advice for this variable is simple: do not set the ORA_CRS_HOME environment variable (on any platform). Setting it causes problems for several Oracle components, and the CRS programs do not need it at all — they all have wrapper scripts.
5.3 About multicast
The Oracle 11gR2 private network uses multicast communication on the 230.0.1.0 or 224.0.0.251 address segments. To keep private-network communication normal and stable, the network for that segment must support multicast, and no other network configuration on the hosts may conflict with those two segments. We once hit a case where the out-of-band management address of an x86 server used the same segment, causing an Oracle RAC node to restart unexpectedly. This class of problem generally shows up as private-network communication errors, but it is intermittent — in that case nothing went wrong as long as the out-of-band log collection tool was not running. So treat this class of problem with great care, rather than waiting for a failure and then spending a great deal of time investigating.
6. Key points when installing a cluster
6.1 cluvfy
Anyone who has installed a 10g-or-later RAC environment knows this tool. Before installing the Clusterware and the database, you normally run the runcluvfy.sh script to check that the system meets the installation prerequisites. runcluvfy.sh wraps the functionality of the cluvfy tool in shell, so users can take advantage of it before the database and Clusterware are installed. Its main job is to verify that the system meets the installation conditions — especially the network and name resolution, since misconfiguration there usually makes the installation fail. Basic usage:
# ./runcluvfy.sh stage -list
# ./runcluvfy.sh comp -list
6.2 Upgrade sequence issue
After installing the Clusterware or RDBMS software, patching is mandatory. But should you install all components first and then patch everything, or patch each component in install order? For pre-11gR2 installations, Oracle recommends patching the Clusterware home up to the desired level before performing any installation into the RDBMS or ASM home directories.
6.3 root.sh & rootupgrade.sh
The final step of an installation or upgrade is to execute the scripts above, which, per the installation documentation, must be run as the superuser root. Some people assume that root privileges alone are enough and run them via su root -c or sudo — and the execution fails, with a crsd.bin crash in the log. The scripts depend not only on the privileges of root but also on the root user's environment configuration.
6.4 /etc/init.cssd
On AIX systems running 11.2, OPROCD does not run on the AIX global run queue by default, which may cause OPROCD to restart the node incorrectly (Bug 13623902). The fix is to modify the /etc/init.cssd file, adding the following:
RT_GRQ=ON
export RT_GRQ
However, AIX 6.1 TL4 and later already ship with a fix for this problem, so this configuration file needs special attention mainly on lower AIX versions.
7. Key optimizations of the database tier
7.1 pre_page_sga & lock_sga
Both parameters protect the database SGA. lock_sga keeps the SGA from being swapped out to swap space, guaranteeing that database memory pages stay resident in physical memory and thereby improving performance. pre_page_sga forces the entire SGA to be read into physical memory when the instance starts; startup is slower, but subsequent performance is better.
But pre_page_sga has a downside: when it is true, every new process touches each page in the SGA as it is created. The larger the SGA, the longer this touching takes, which consumes significant resources — especially for applications using short-lived or frequently reconnecting sessions. If clients perceive connections as slow, this parameter must be set to false, which is also Oracle's recommendation.
SQL > alter system set pre_page_sga=false scope=spfile sid='*'
SQL > alter system set lock_sga=true scope=spfile sid='*'
7.2 about the redo log
Anyone who has touched a database knows this concept. When the database executes a SQL update, the transaction is first recorded in the redo log, and that log is flushed to disk before the data change itself is persisted: the criterion for a successful commit is that the redo has reached disk, not the data blocks. The log configuration (size and number) therefore directly determines database read/write performance. If the log files are very large, archive switches become very infrequent, and an unrecoverable disaster at the wrong moment means a large amount of lost data — a large RPO when recovering from backup. If the log files are very small, log switches become frequent, AWR fills with log-switch wait events, and database performance suffers badly. Use the log-switch wait events and switch frequency in performance-test AWR reports to decide whether the size and count need adjusting; a common OLTP starting point is 10 groups of 500 MB each.
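To see how size translates into switch frequency, a back-of-the-envelope sketch (the redo rate here is an example value — measure yours from AWR's "redo size" per second):

```shell
# Hypothetical sizing sketch: log-switch interval for a given redo rate.
redo_mb_per_sec=5        # measured redo generation rate (example value)
logfile_mb=500           # proposed online redo log size
switch_interval_sec=$(( logfile_mb / redo_mb_per_sec ))
echo "one log switch roughly every ${switch_interval_sec}s"
```

At 5 MB/s of redo, a 500 MB log switches about every 100 seconds — far too often for comfort, which is why the measured rate should drive the sizing.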
Next, we also need to consider the parameter settings associated with it.
1. _use_adaptive_log_file_sync
This parameter determines how sessions are notified of redo writes. For flushing the log buffer, 11g adds a new polling mode alongside the traditional post/wait mode, and the Oracle layer automatically decides when and how the lgwr write is acknowledged. With post/wait, after a client commits it waits for the event to complete; as soon as lgwr finishes, it posts the user process, which notices immediately — but this post consumes significant CPU. With polling, the foreground process periodically checks whether the background lgwr write has completed; this costs less CPU, but the user process may not notice completion immediately.
Each mode has its advantages, but switching between them in the background itself costs performance; on a busy system, frequent switching can degrade the database, showing up as heavy log file sync waits in AWR (Bug 13707904).
SQL > alter system set "_use_adaptive_log_file_sync"=false scope=spfile sid='*'
2. archive_lag_target — determines whether forced log switching is enabled. To reduce data loss after a failure, set archive_lag_target to force a log switch at a fixed interval. The default of 0 means the feature is disabled; 1800 (seconds) is the recommended value.
SQL > alter system set archive_lag_target=1800 scope=spfile sid='*'
7.3 AMM
Oracle has two broad memory-management modes: AMM and ASMM. From 11g onward, AMM (Automatic Memory Management) is the default: the database fully manages SGA and PGA sizes, the administrator sets only a total (memory_target), and the database dynamically resizes the SGA, the PGA, and individual components such as the database buffer cache and shared pool. The feature is designed to avoid the performance problems caused by badly chosen SGA and PGA settings.
But is this feature always a win in practice? Under AMM, at database startup the SGA and PGA are split in a fixed ratio:
sga_target = memory_target * 60%
pga_aggregate_target = memory_target * 40%
However, under high concurrency, when the database is very busy, automatic memory resizing may not keep pace with memory requests from a large number of sessions. Worse, as session counts grow and PGA demand spikes, the PGA will encroach on the SGA, causing database performance failures. AMM is therefore not recommended for highly concurrent database scenarios: use the more mature 10g-style automatic shared memory management (ASMM) plus automatic PGA management, and set the memory parameters manually, as follows:
1. Turn off automatic memory management
SQL > alter system set memory_target=0 scope=spfile sid='*'
SQL > alter system set memory_max_target=0 scope=spfile sid='*'
2. Set the SGA to fixed values, based on the recommendations in the performance-test AWR report
SQL > alter system set sga_max_size=XG scope=spfile sid='*'
SQL > alter system set sga_target=XG scope=spfile sid='*'
3. Set parameters such as PGA
SQL > alter system set pga_aggregate_target=XG scope=spfile sid='*'
SQL > alter system set large_pool_size=256M scope=spfile sid='*'
Another very important parameter is "_shared_pool_reserved_pct": if it is set too small, ORA-04031 errors are likely, so it needs a reasonable value.
SQL > alter system set "_shared_pool_reserved_pct"=10 scope=spfile sid='*'
7.4 SQL parsing
1. Bind variable peeking
In Oracle, every SQL statement must be parsed before execution, either as a hard parse or a soft parse. There are two kinds of statements: DDL (data definition language), which is never shared — every execution is a hard parse — and DML (data manipulation language), which is hard- or soft-parsed depending on circumstances. In an AWR report we generally want few hard parses and many soft parses, because hard parsing is very expensive. To cut the parse time for SQL using bind variables, Oracle 9i introduced bind variable peeking: the same cursor is reused when the same SQL is executed with different bind values, saving parse time. Have you actually compared the soft/hard parse counts in AWR with the feature on versus off? In truth, peeking and the later adaptive cursor sharing features are all attempts by Oracle to find the optimal execution plan, but in practice, across different data volumes and business profiles, they have produced many bugs (e.g. Bug 20370037, Bug 13456573).
Based on your own business system's characteristics, run thorough performance and business tests, compare the soft/hard parse ratios and execution-plan data shown in AWR with the parameters off and on, and decide accordingly whether to enable the corresponding features. The parameters are:
"_optim_peek_user_binds"
"_optimizer_adaptive_cursor_sharing"
"_optimizer_extended_cursor_sharing"
"_optimizer_extended_cursor_sharing_rel"
"_optimizer_use_feedback"
2. open_cursors & session_cached_cursors
Two related parameters, open_cursors and session_cached_cursors, determine how many cursors an application session can open and cache. If they are too low, SQL parsing performance suffers. Tune both according to the values in the v$resource_limit view, to avoid performance problems from under-sized resource settings.
SQL > alter system set open_cursors=N scope=spfile sid='*'
SQL > alter system set session_cached_cursors=M scope=spfile sid='*'
3. _b_tree_bitmap_plans
Among the parameters affecting execution plans is _b_tree_bitmap_plans: SQL execution sometimes performs a BITMAP conversion on B-tree indexes, often generating extremely bad execution plans and 100% CPU. Per "Select Fails With ORA-600 [20022]" (Doc ID 1202646.1), it is advisable to turn it off.
SQL > alter system set "_b_tree_bitmap_plans"=false scope=spfile sid='*'
7.5 processes & sessions
processes limits the number of operating-system user processes that can attach to the SGA; the total must be large enough to cover the background processes, all dedicated server processes, and any shared server and dispatcher processes. A session is the context maintained between the two parties from the start to the end of a communication — a piece of server-side memory recording which client machine connected, through which application, which user logged in, and so on.
The number of Oracle sessions is tied to the processes parameter in the parameter file by the formula: sessions = 1.1 * processes + 5
The setting of these two parameters needs to determine the specific setting scheme according to the specific concurrent requirements of the application.
SQL > alter system set processes=N scope=spfile sid='*'
SQL > alter system set sessions=M scope=spfile sid='*'    (where M = 1.1*N + 5)
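The formula sessions = 1.1 * processes + 5 can be sanity-checked with a quick sketch (the processes value is an example; size it from your actual concurrency requirements):

```shell
# Derive sessions from processes per the formula sessions = 1.1*processes + 5.
processes=1000                           # example value
sessions=$(( processes * 11 / 10 + 5 ))  # integer arithmetic for 1.1x + 5
echo "processes=$processes -> sessions=$sessions"
```

For processes=1000 this yields sessions=1105.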
7.6 DRM
Many forms of contention exist between database nodes, including lock contention (at various granularities) and data shipping. Eliminating contention entirely would defeat the purpose of RAC, which is precisely to run work in parallel on both nodes; but taken to the extreme, parallelism causes serious performance problems, while banning it outright is impossible and loses the original point of a cluster. All we can do is strike a balance:
First, DRM: in theory, Oracle's DRM feature exists to avoid shipping data between nodes and to reduce frequent inter-node lock waits, the ultimate goal being to make the requesting node and the Master node coincide. In practice, however, the feature has caused many bugs and performance failures under inter-node contention — e.g. Bug 6018125, "Instance crash during dynamic remastering or instance reconfiguration" (Doc ID 6018125.8). So it is recommended to disable it.
SQL > alter system set "_gc_policy_time"=0 scope=spfile sid='*'
SQL > alter system set "_gc_undo_affinity"=false scope=spfile sid='*'
7.7 parallel_force_local
Regarding the parameter parallel_force_local: cross-node parallel processing in Oracle RAC is expensive. Suppose a three-node cluster: for a given data block there is a Master, a requester, and a holder. The requester asks the Master for the latest version of the block, the Master forwards the request to the holder, and the holder ships the block to the requester, with read/write locking on top. This involves heavy data shipping and contention; once it becomes the common case, inter-node communication load balloons and lock waits pile up, seriously hurting overall database performance — especially in cross-data-center high-availability deployments. What we need is concurrency at the business level, not absolute cross-node parallelism for a single SQL — going to extremes backfires. So enable this parameter, forcing process-level parallelism to stay local to one node rather than spanning nodes. This is one of the mandatory optimizations in official Doc ID 1536272.1.
SQL > alter system set parallel_force_local=true scope=spfile sid='*'
7.8 about automatic tasks
The Oracle 11g database has three predefined automatic maintenance tasks:
1. Automatic Optimizer Statistics Collection: gathers optimizer statistics for all schema objects in the database that have no statistics or only stale statistics. The SQL query optimizer uses the statistics collected by this task to improve SQL execution performance.
2. Automatic Segment Advisor: identifies segments with reclaimable space and recommends how to defragment them. You can also run the Segment Advisor manually for more up-to-date recommendations.
3. Automatic SQL Tuning Advisor (automatic SQL Optimization Guide): check the performance of high-load SQL statements and make recommendations on how to optimize them. You can configure this guide to automatically apply the recommended SQL profile.
For statistics collection, the database has a default maintenance window — in 11g, 22:00 to 02:00. If this conflicts with your batch window, change the task's execution time, but the task itself must be kept. As for the other two advisors, it depends on how often you would actually use their recommendations and whether their optimization guidance is valuable to you; if not, they need not be kept.
7.9 configuration optimization in security
First: should auditing be kept in the database, and how?
With auditing off, there is no trail to investigate future security incidents; with it on, audit logs may consume large amounts of storage and even hurt database I/O performance. In general, basic login auditing should still be kept, but the log destination can be moved to the operating-system level to relieve pressure on the database layer, and the files should be dumped and purged regularly so a flood of small files does not exhaust the filesystem's inodes. This is achieved by adjusting the audit_trail parameter and the adump destination.
Next, the control parameters for the alert log and trace files.
max_dump_file_size limits the size of these files; the default is unlimited. A very large file can hit the OS's file-size limits and cause write failures, so cap it:
SQL > alter system set max_dump_file_size='100m' scope=spfile sid='*'
7.10 parallel_min_servers & parallel_max_servers
parallel_min_servers sets the number of parallel execution server processes started with the instance. Too small, and queries lack parallel capacity when they need it; too large, and memory pressure can hurt overall system performance. Be sure to monitor the number of active parallel server processes over time and set parallel_min_servers to the average. This can be checked with:
SQL > select * from v$pq_sysstat
and check the value of the statistic "Servers Highwater".
Size parallel_max_servers to the hardware: a starting point is 2 * 2 (threads per core) * CPU_COUNT, i.e. roughly 4 x CPU, then iterate upward with test data. For OLTP systems it should generally be capped at 128: Oracle's default formula has a bug — on machines with more than 128 CPU cores the default comes out far too high, which can prevent the instance from starting; and even when it starts, an oversized parallel process pool may leave the database unable to withstand the concurrency pressure.
SQL > alter system set parallel_max_servers=128 scope=spfile sid='*'
7.11 fast_start_mttr_target & fast_start_parallel_rollback
fast_start_mttr_target = {0-3600}: once a specific value is set, crash recovery is expected to complete within that many seconds. fast_start_parallel_rollback = {high/low/false}: high starts parallel recovery processes at 4x the CPU count, low at 2x, and false disables parallel recovery. Both parameters accelerate recovery after a failure; the underlying mechanism is to shorten the distance between the latest checkpoint and the online redo log by actively triggering checkpoints, which undeniably carries some performance risk. Set both values according to the specific business situation and verify under stress testing, so an aggressive policy does not itself become a performance problem.
A related parameter, log_checkpoints_to_alert, defaults to off. Enabling it records detailed checkpoint information in the alert log, which is an essential feature for database diagnostics; it is recommended to enable it.
SQL > alter system set fast_start_mttr_target=120 scope=spfile sid='*'
SQL > alter system set fast_start_parallel_rollback=low scope=spfile sid='*'
SQL > alter system set log_checkpoints_to_alert=true scope=spfile sid='*'
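After setting fast_start_mttr_target, the effective target and Oracle's current estimate can be compared in V$INSTANCE_RECOVERY; a minimal sketch:

```sql
-- Sketch: compare the configured MTTR target with the instance's estimate
SQL> select target_mttr, estimated_mttr, recovery_estimated_ios
       from v$instance_recovery;
```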
7.12 listener.ora
For listeners prior to 11.2, you must ensure that the IPC entry exists and is listed first in the address list of every RAC listener. Otherwise, the time it takes for the VIP to fail over after a public network interface failure may be adversely affected.
LISTENER_n1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST = (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC)))
      (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = n1vip)(PORT = 1521)(IP = FIRST)))
      (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.8.21.121)(PORT = 1521)(IP = FIRST)))
    )
  )
If the TCP entry were listed first, the listener would have to wait for a TCP timeout. That timeout is controlled by operating system parameters and can take a long time, so failover might not happen in a timely manner.
7.13 sqlnet.ora
When a database client disconnects abnormally, the resources it held are not released: v$session still shows the session as INACTIVE, the corresponding server process is not cleaned up, and resources stay occupied for a long time. How should this be handled? sqlnet.expire_time addresses exactly this situation; it is designed to clean up connections broken abnormally by, for example, a network outage, a client power failure, or an abnormal restart. Set the parameter sqlnet.expire_time=5 in the sqlnet.ora file.
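A minimal server-side sqlnet.ora fragment for the setting recommended above:

```
# sqlnet.ora (database server side)
# Send a dead-connection probe every 5 minutes; sessions whose
# clients no longer respond are detected and cleaned up.
SQLNET.EXPIRE_TIME = 5
```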
7.14 Recycle Bin
Starting with Oracle 10g, a concept called the Recycle Bin was introduced; its full name is Tablespace Recycle Bin. The Recycle Bin is in fact a logical container: it lives in the space already allocated within a tablespace rather than in a fixed area physically carved out of it. This means the Recycle Bin and the objects in the tablespace share a storage area, and the system reserves no space for the Recycle Bin. Therefore, when a table is dropped, as long as there is enough free space and the Recycle Bin has not been purged, the dropped object remains in the Recycle Bin; when available space becomes tight, the database overwrites objects in the Recycle Bin in first-in-first-out order. The main advantage of the Recycle Bin is that it provides a recovery mechanism for mistakenly dropped tables without restoring from backup, guarding against a large class of manual errors.
However, for workloads that drop many objects without ever purging them, keeping the Recycle Bin enabled inevitably risks massive consumption of tablespace and degraded database performance, especially in batch-processing environments for financial business. In a real production environment, the feature is therefore less useful than it appears in theory, and based on this risk assessment it is recommended to turn it off.
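For reference, the recovery mechanism the Recycle Bin enables looks like this (the table name is purely illustrative):

```sql
-- Sketch: recover a mistakenly dropped table while the Recycle Bin is on
SQL> drop table t_orders;
SQL> flashback table t_orders to before drop;
```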
SQL > show parameter recyclebin
SQL > alter system set recyclebin=off scope=spfile sid='*';
SQL > purge recyclebin;
SQL > purge dba_recyclebin;
7.15 result caching
The result cache is a feature added in 11g. There are two types, server and client. Server-side means a piece of memory is set aside in the SGA's Shared Pool to store the results of SQL or PL/SQL queries, shared across processes. Client-side means SQL query results are kept in the memory of the OCI client process, for use by all sessions in that process. The intent is to improve query efficiency, but in practice it has caused many problems, such as waits on the RC latch and "reliable message" events (Doc ID 1951729.1 - Bug 19557279; fixed in 12c).
Therefore, in order to avoid more serious performance problems in the above situations, it is recommended that the result cache be turned off.
SQL > alter system set "_optimizer_ads_use_result_cache"=false scope=spfile sid='*';
SQL > alter system set "_result_cache_max_size"=900 scope=spfile sid='*';
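To verify after the instance restarts, a quick check (a sketch):

```sql
-- Sketch: confirm the result cache state after restart
SQL> show parameter result_cache_max_size
SQL> select dbms_result_cache.status() from dual;
```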
7.16 undo
After a transaction commits or rolls back, its undo data must be retained in the undo tablespace for a period of time to support flashback and consistent reads; that period is set by undo_retention. Relative to undo_retention, inactive undo falls into two categories: EXPIRED and UNEXPIRED. Inactive undo segments older than undo_retention are expired; those still within undo_retention are unexpired. undo_retention is only an expectation, not a guarantee that undo survives that long before being overwritten. For example, with a non-autoextending undo tablespace, when a large transaction has used up the unused and expired undo areas and the undo space cannot grow, keeping the transaction running takes priority: unexpired undo will also be overwritten (oldest first). In other words, unexpired undo data within the retention window carries no guarantee.
Starting with 10g, a new parameter _undo_autotune lets Oracle tune undo automatically: with a non-autoextending undo tablespace, Oracle adjusts the effective undo retention to match the tablespace size. Automatically tuned retention maximizes the use of the available undo space and retains as much undo data as possible, minimizing errors such as ORA-01555.
_undo_autotune is enabled by default, which benefits long-running queries but is not suitable for typical OLTP systems. When the system is busy, undo shortage is common, and reclaiming undo with a simple resize will then raise an error. It is therefore recommended to turn this parameter off and set a reasonable retention time.
SQL > alter system set "_undo_autotune"=false scope=spfile sid='*';
SQL > alter system set undo_retention=900 scope=spfile sid='*';
(1. OLTP system: 15 minutes / 2. Mixed: 1 hour / 3. DSS system: 3 hours)
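A sketch of how to check the current breakdown of undo space before settling on a retention value:

```sql
-- Sketch: how much undo is ACTIVE, EXPIRED, or UNEXPIRED right now
SQL> select status, round(sum(bytes)/1024/1024) mb
       from dba_undo_extents
      group by status;
```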
7.17 perform sorting in the plan
Normally we assume that a SQL execution plan that uses an index to avoid a sort is the better choice for performance. In actual production, however, the ordered plan is not always optimal and is sometimes worse. Oracle therefore uses the parameter "_sort_elimination_cost_ratio" to control whether sort elimination should be chosen when looking for the optimal execution plan.
When the ratio of unsorted cost to sorted cost is below the value of "_sort_elimination_cost_ratio", Oracle chooses the plan that avoids the sort. The default value is 0, which means Oracle by default chooses the sort-eliminating plan, even if its cost is effectively infinite.
Practice has shown that this default is often wrong and can steer queries into a bad execution plan, so it is recommended to set the parameter to 5.
SQL > alter system set "_sort_elimination_cost_ratio"=5 scope=spfile sid='*';
7.18 Control File
The control file's record retention parameter control_file_record_keep_time defaults to 7, meaning the control file keeps only 7 days of records; backup records older than that are considered obsolete. This value must be coordinated with two other settings:
1. The definition of expiration time in the backup policy set by the centralized backup software.
2. Parameter retention policy of Rman.
For example, if our backup policy defines the expiration time of a system's backup sets as 30 days, then we set the control file parameter and the RMAN policy as follows (note that a long RMAN retention window makes the list backup command take a few minutes longer to execute):
SQL > alter system set control_file_record_keep_time=30 scope=spfile sid='*';
RMAN > configure retention policy to recovery window of 30 days;
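A sketch to check how the control file's backup-related record sections are currently sized and used:

```sql
-- Sketch: inspect backup-related record sections in the control file
SQL> select type, records_total, records_used
       from v$controlfile_record_section
      where type like '%BACKUP%';
```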
7.19 job_queue_processes
If a large number of jobs run at the same time, too small a value for job_queue_processes forces jobs to wait; conversely, too large a value consumes excessive system resources. A reasonable value should be set to prevent both.
Oracle's default is 1000, which is rather large for a typical OLTP system and should be lowered to a sensible value.
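For example (the value 50 is purely illustrative; size it to the actual job workload):

```sql
-- Sketch: cap concurrent job slaves at an illustrative value
SQL> alter system set job_queue_processes=50 scope=spfile sid='*';
```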
7.20 rman
1. To improve RMAN backup efficiency, assuming the tape backup device supports asynchronous I/O, it is recommended to set this parameter to true, and to create a large pool so RMAN achieves good performance.
SQL > alter system set backup_tape_io_slaves=true scope=spfile sid='*';
2. Regarding the RMAN configuration on an ADG standby: RMAN > configure archivelog deletion policy to applied on standby; protects archived logs that have not yet been applied from being deleted. On later 11g versions this setting is no longer necessary, but lower versions need to pay attention to it.
7.21 other
1. Are the tablespace data files set to autoextend?
2. Are all tablespace data files on ASM?
3. Is the ASM redundancy level consistent?
4. Has the 180-day limit in the default user password policy been lifted, and so on?
5. Are the database monitoring metrics fully covered (cluster, services, listeners, ASM, tablespaces, performance, etc.)?
6. Is OS-level monitoring enabled, especially private-network traffic, CPU, and memory? Is it nmon or OSWatcher, and do their logs rotate on a schedule or grow without bound?
7. Is the database inspection regime complete? Are the daily and monthly inspection contents carefully designed, and have they been automated? It is strongly recommended that daily inspections be automated as scheduled tasks, with logs consolidated onto a shared file system and, where possible, loaded into a database for retrieval as needed, according to your own inspection mechanism and system.
8. Summary and prospect
Based on the distillation of a large number of documents and on project practice, this paper has summarized, across the planning, implementation, and configuration-optimization stages of database construction, the matters that deserve attention at every level. I hope it serves as a reference for those engaged in this work now or in the future, and that more people will build on it and share their improvements and optimizations.
[Main references]
[1]. RAC and Oracle Clusterware best practices and beginner's guide (platform independent section) (Doc ID 1526083.1)
[2]. RAC and Oracle Clusterware Best Practices and Starter Kit (Linux) (Doc ID 811306.1)
[3]. RAC and Oracle Clusterware Best Practices and Starter Kit (AIX) (Doc ID 811293.1)
[4]. Top 11 Things to do NOW to Stabilize your RAC Cluster Environment (Doc ID 1344678.1)
[5]. IBM System p Advanced POWER Virtualization Best Practices
[6]. Best practice: proactively avoid database and query-related performance issues (Doc ID 1549184.1)
[7]. 11gR2 Clusterware and Grid Home - What You Need to Know (Doc ID 1053147.1)
[8]. Oracle Databases on VMware Best Practices Guide
[9]. Deploying Oracle Database 11g R2 on Red Hat Enterprise Linux 6 Best Practices
[10]. 10g and 11gR1 ASM technology best practices (Doc ID 1602417.1)
[11]. Oracle Database High Availability Best Practices 11g Release 1
[12]. Oracle Database High Availability Best Practices 11g Release 2
[13]. Oracle on AIX - Configuration & Tuning, R. Ballough (ppt)
[14]. Oracle Database 11g R2 RAC on IBM AIX Tips and Considerations
[15]. Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip
This article is reprinted from Sohu.