
Analysis of why online redo logs stay in the ACTIVE state

This article explains in detail why online redo logs can remain in the ACTIVE state. I hope that after reading it you will have a working understanding of the problem and how to diagnose it.

Platform: SunOS 5.8 Generic_108528-23 sun4u sparc SUNW,Ultra-Enterprise

Database: 8.1.5.0.0

Symptom: slow response; application requests do not return.

First, check whether the redo log size and the number of log groups are reasonable.

Logging in to the database, we find that all redo log groups except the CURRENT one are in the ACTIVE state.

oracle:/oracle/oracle8 > sqlplus "/ as sysdba"

SQL*Plus: Release 8.1.5.0.0-Production on Thu Jun 23 18:56:06 2005

(C) Copyright 1999 Oracle Corporation. All rights reserved.

Connected to:

Oracle8i Enterprise Edition Release 8.1.5.0.0-Production

With the Partitioning and Java options

PL/SQL Release 8.1.5.0.0-Production

SQL> select * from v$log;

GROUP# THREAD# SEQUENCE#     BYTES MEMBERS ARC STATUS  FIRST_CHANGE# FIRST_TIM
------ ------- --------- --------- ------- --- ------- ------------- ---------
     1       1    520403  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     2       1    520404  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     3       1    520405  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     4       1    520406  31457280       1 NO  CURRENT    1.3861E+10 23-JUN-05
     5       1    520398  31457280       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     6       1    520399  31457280       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     7       1    520400 104857600       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     8       1    520401 104857600       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     9       1    520402 104857600       1 NO  ACTIVE     1.3861E+10 23-JUN-05

9 rows selected.

Re-running the query a moment later shows no change:

SQL> /

GROUP# THREAD# SEQUENCE#     BYTES MEMBERS ARC STATUS  FIRST_CHANGE# FIRST_TIM
------ ------- --------- --------- ------- --- ------- ------------- ---------
     1       1    520403  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     2       1    520404  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     3       1    520405  31457280       1 NO  ACTIVE     1.3861E+10 23-JUN-05
     4       1    520406  31457280       1 NO  CURRENT    1.3861E+10 23-JUN-05
     5       1    520398  31457280       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     6       1    520399  31457280       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     7       1    520400 104857600       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     8       1    520401 104857600       1 NO  ACTIVE     1.3860E+10 23-JUN-05
     9       1    520402 104857600       1 NO  ACTIVE     1.3861E+10 23-JUN-05

9 rows selected.

A redo log group stays ACTIVE until the checkpoint covering it completes, that is, until DBWR has flushed the corresponding dirty buffers to disk. With every group except CURRENT still ACTIVE, DBWR is clearly unable to keep up with the checkpoints triggered by log switches.
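
A quick way to corroborate this from inside the database is the wait interface. The query below is a sketch of that check rather than part of the original transcript; v$system_event and the "log file switch" wait events are standard Oracle names:

SQL> select event, total_waits, time_waited
  2    from v$system_event      -- cumulative waits since instance startup
  3   where event like 'log file switch%'
  4  /

A large and still-growing time_waited for 'log file switch (checkpoint incomplete)' confirms that sessions are stalling on log switches while DBWR has not yet finished the checkpoint.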

Next, let's check how busy DBWR is:

SQL> !

oracle:/oracle/oracle8 > ps -ef | grep ora_
  oracle  2273     1  0   Mar 31 ?       57:40 ora_smon_hysms02
  oracle  2266     1  0   Mar 31 ?      811:42 ora_dbw0_hysms02
  oracle  2264     1 16   Mar 31 ?    16999:57 ora_pmon_hysms02
  oracle  2268     1  0   Mar 31 ?     1649:07 ora_lgwr_hysms02
  oracle  2279     1  0   Mar 31 ?        8:09 ora_snp1_hysms02
  oracle  2281     1  0   Mar 31 ?        4:22 ora_snp2_hysms02
  oracle  2285     1  0   Mar 31 ?        9:40 ora_snp4_hysms02
  oracle  2271     1  0   Mar 31 ?       15:57 ora_ckpt_hysms02
  oracle  2283     1  0   Mar 31 ?        5:37 ora_snp3_hysms02
  oracle  2277     1  0   Mar 31 ?        5:58 ora_snp0_hysms02
  oracle  2289     1  0   Mar 31 ?        0:00 ora_d000_hysms02
  oracle  2287     1  0   Mar 31 ?        0:00 ora_s000_hysms02
  oracle  2275     1  0   Mar 31 ?        0:04 ora_reco_hysms02
  oracle 21023 21012  0 18:52:59 pts/65   0:00 grep ora_

The process number of DBWR is 2266.
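
On a box with this many processes, the DBWR process id can also be pulled out directly; a small sketch, assuming the same ora_dbw0_<SID> naming convention seen above:

oracle:/oracle/oracle8 > ps -ef | grep '[o]ra_dbw' | awk '{print $2}'

The bracketed pattern keeps the grep command itself out of the result; on this system the command prints 2266.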

Use the top command to observe:

oracle:/oracle/oracle8 > top

last pid: 21145;  load averages: 3.38, 3.45, 3.67    18:53:38
725 processes: 711 sleeping, 1 running, 10 zombie, 3 on cpu
CPU states: 35.2% idle, 40.1% user, 9.4% kernel, 15.4% iowait, 0.0% swap
Memory: 3072M real, 286M free, 3120M swap in use, 1146M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 11855 smspf      1  59    0 1355M 1321M cpu/0   19:32 16.52% oracle
  2264 oracle     1   0    0 1358M 1316M run    283.3H 16.36% oracle
 11280 oracle     1  13    0 1356M 1321M sleep   79.8H  0.77% oracle
  6957 smspf     15  29   10   63M   14M sleep  107.7H  0.76% java
 17393 smspf      1  30    0 1356M 1322M cpu/1  833:05  0.58% oracle
 29299 smspf      5  58    0 8688K 5088K sleep   18.5H  0.38% fee_ftp_get
 21043 oracle     1  43    0 3264K 2056K cpu/9    0:01  0.31% top
 20919 smspf     17  29   10   63M   17M sleep  247:02  0.29% java
 25124 smspf      1  58    0   16M 4688K sleep    0:35  0.25% smif_status_rec
  8086 smspf      5  23    0   21M   13M sleep   41.1H  0.24% fee_file_in
 16009 root       1  35    0 4920K 3160K sleep    0:03  0.21% sshd2
 25126 smspf      1  58    0 1355M 1321M sleep    0:26  0.20% oracle
  2266 oracle     1  60    0 1357M 1317M sleep  811:42  0.18% oracle
 11628 smspf      7  59    0 3440K 2088K sleep    0:39  0.16% sgip_client_ltz
 26257 smspf     82  59    0  447M  178M sleep  533:04  0.15% java

We note that process 2266 consumes only 0.18% of the CPU, so DBWR is clearly not CPU-bound; the bottleneck is most likely I/O.
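
Before turning to the disks, the same wait interface can show whether DBWR is stuck on its own writes and whether foreground sessions are starving for clean buffers. Again, this is only a sketch using standard event names, not part of the original session:

SQL> select event, total_waits, time_waited
  2    from v$system_event      -- 'db file parallel write' is DBWR's own write wait
  3   where event in ('db file parallel write', 'free buffer waits')
  4  /

High waits here point to slow physical writes rather than a busy DBWR process, which is consistent with what top shows.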

Use iostat to check the I/O situation.

gqgai:/home/gqgai > iostat -xn 3

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    1.8   38.4   32.4  281.0  0.0  0.7    0.0   16.4   0  29 c0t10d0
    1.8   38.4   32.4  281.0  0.0  0.5    0.0   13.5   0  27 c0t11d0
   24.8   61.3 1432.4  880.1  0.0  0.5    0.0    5.4   0  26 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    9.1   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3    8.3    0.3   47.0  0.0  0.1    0.0    9.2   0   8 c0t10d0
    0.0    8.3    0.0   47.0  0.0  0.1    0.0    8.0   0   7 c0t11d0
   11.7   65.3  197.2  522.2  0.0  1.6    0.0   20.5   0 100 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3   13.7    2.7   68.2  0.0  0.2    0.0   10.9   0  12 c0t10d0
    0.0   13.7    0.0   68.2  0.0  0.1    0.0    9.6   0  11 c0t11d0
   11.3   65.3   90.7  522.7  0.0  1.5    0.0   19.5   0  99 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.0    8.0    0.0   42.7  0.0  0.1    0.0    9.3   0   7 c0t10d0
    0.0    8.0    0.0   42.7  0.0  0.1    0.0    9.1   0   7 c0t11d0
   11.0   65.7  978.7  525.3  0.0  1.4    0.0   17.7   0  99 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3   87.7    2.7  433.7  0.0  2.2    0.0   24.9   0  90 c0t10d0
    0.0   88.3    0.0  436.5  0.0  1.8    0.0   19.9   0  81 c0t11d0
   89.0   54.0  725.4  432.0  0.0  2.0    0.0   14.8   0 100 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)

We notice that c1t1d0, the main database volume, stays 99-100% busy while writing only about 500 KB/s, which is extremely slow.

(%b: percent of time the disk is busy, i.e. has transactions in progress; kw/s: kilobytes written per second)
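
To keep watching for this condition, the busy column can be filtered on the fly. A small sketch, assuming the iostat -xn column layout shown above (kw/s in field 4, %b in field 10, the device name in field 11):

gqgai:/home/gqgai > iostat -xn 3 | awk '$10+0 >= 90 {print $11, $10 "% busy,", $4, "kw/s"}'

Header lines drop out automatically because their %b field is not numeric; only devices at 90% busy or more are printed.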

From experience, a T3 disk array normally sustains a per-character (char) write rate of around 10 MB/s; we had previously measured some TPC-C-style figures, see: Use bonnie to Test system IO speed.
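
If bonnie is not available during the incident, a very rough sequential-write check can be improvised with dd. This is only a sketch, and /oracle/ddtest is a hypothetical scratch file on the affected volume:

oracle:/oracle/oracle8 > time dd if=/dev/zero of=/oracle/ddtest bs=1048576 count=100
oracle:/oracle/oracle8 > rm /oracle/ddtest

Dividing the 100 MB written by the elapsed time gives a ballpark write rate to compare with the roughly 10 MB/s the array should sustain.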

Even allowing for the fact that database I/O is mostly small random writes, throughput this low is clearly abnormal. The hardware problem was then confirmed: one disk in the RAID 5 group had failed.
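
Besides the array's own monitoring, the OS-level error counters can hint at a failing member disk; a sketch using the standard Solaris error report of iostat:

oracle:/oracle/oracle8 > iostat -En

Climbing Hard Errors or Transport Errors in the section for c1t1d0 would back up the hardware diagnosis.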

After replacement, the system gradually returned to normal.

This concludes the analysis of why the online redo logs remained in the ACTIVE state. I hope it proves helpful if you run into the same symptoms.
