在oracle11g中,dump file的目录已经有所改变,bdump和udump整合到trace中,cdump独立出一个。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
E:\ora11g\app\Administrator\diag\rdbms>tree 卷 DOCUMENT 的文件夹 PATH 列表 卷序列号为 5A21-A80E E:. └─ora11g └─ora11g ├─alert ├─cdump ├─hm ├─incident ├─incpkg ├─ir ├─lck ├─metadata ├─stage ├─sweep └─trace E:\ora11g\app\Administrator\diag\rdbms> |
而oracle也提供了一个ADRCI(Automatic Diagnostic Repository Command Interpreter)的工具来提供这些路径下各个trace、incident、problem的管理。下面我们来看看该命令的使用。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
C:\Documents and Settings\Administrator>adrci ADRCI: Release 11.2.0.1.0 - Production on Fri Jun 22 23:05:42 2012 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. ADR base = "e:\ora11g\app\administrator" adrci> help HELP [topic] Available Topics: CREATE REPORT ECHO EXIT HELP HOST IPS PURGE RUN SET BASE SET BROWSER SET CONTROL SET ECHO SET EDITOR SET HOMES | HOME | HOMEPATH SET TERMOUT SHOW ALERT SHOW BASE SHOW CONTROL SHOW HM_RUN SHOW HOMES | HOME | HOMEPATH SHOW INCDIR SHOW INCIDENT SHOW PROBLEM SHOW REPORT SHOW TRACEFILE SPOOL There are other commands intended to be used directly by Oracle, type "HELP EXTENDED" to see the list adrci> |
在使用前,我们要先设定一下adrci的主目录:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
adrci> show home ADR Homes: diag\clients\user_administrator\host_2188829984_76 diag\clients\user_unknown\host_2188829984_76 diag\diagtool\user_administrator\host_2188829984_76 diag\rdbms\ora11g\ora11g diag\tnslsnr\hejianmin\listener adrci> adrci> adrci> set home diag\rdbms\ora11g\ora11g adrci> adrci> show home ADR Homes: diag\rdbms\ora11g\ora11g adrci> |
好了,设置完主目录后,我们先来检查一下我们的alertlog:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
adrci> show alert -tail -f 2012-06-22 23:51:45.466000 +08:00 Thread 1 cannot allocate new log, sequence 48 Checkpoint not complete Current log# 2 seq# 47 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO02.LOG 2012-06-22 23:51:46.981000 +08:00 Thread 1 advanced to log sequence 48 (LGWR switch) Current log# 3 seq# 48 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO03.LOG Thread 1 cannot allocate new log, sequence 49 Checkpoint not complete Current log# 3 seq# 48 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO03.LOG 2012-06-22 23:51:49.934000 +08:00 Thread 1 advanced to log sequence 49 (LGWR switch) Current log# 1 seq# 49 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO01.LOG 2012-06-22 23:52:05.653000 +08:00 Thread 1 advanced to log sequence 50 (LGWR switch) Current log# 2 seq# 50 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO02.LOG Thread 1 advanced to log sequence 51 (LGWR switch) Current log# 3 seq# 51 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO03.LOG 2012-06-22 23:52:07.122000 +08:00 Thread 1 advanced to log sequence 52 (LGWR switch) Current log# 1 seq# 52 mem# 0: E:\ORA11G\APP\ADMINISTRATOR\ORADATA\ORA11G\REDO01.LOG |
类似的,我们可以
1 2 3 4 5 |
show alert -tail (默认是10条条目) show alert -tail -20(显示最后20个条目) show alert -tail -f(类似tail -f alertlog一样) show alert(windows中会弹出txt文本窗口显示,如已经装了MKS Toolkit,希望用vi显示,则需要先set editor vi,再show alert。) show alert -p "MESSAGE_TEXT like '%ORA-%'"(显示alert中关于ORA-报错的内容。其他更多的show alert的选项,可见 help show alert) |
注意,show alert的起始点是在control中设置的保留时间之后的日志,或者说是purge之后的日志,之前的日志无法显示。
1 2 3 4 5 6 7 8 9 10 |
adrci> show control ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* ADRID SHORTP_POLICY LONGP_POLICY LAST_MOD_TIME LAST_AUTOPRG_TIME LAST_MANUPRG_TIME ADRDIR_VERSION ADRSCHM_VERSION ADRSCHMV_SUMMARY ADRALERT_VERSION CREATE_TIME -------------------- -------------------- -------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- -------------------- -------------------- -------------------- -------------------- ---------------------------------------- 799124850 720 8760 2011-05-08 17:12:01.015000 +08:00 2011-07-08 23:00:22.380000 +08:00 2012-06-22 23:36:14.731000 +08:00 1 2 76 1 2011-05-08 17:12:01.015000 +08:00 1 rows fetched adrci> |
也就是max(LAST_AUTOPRG_TIME,LAST_MANUPRG_TIME)之后的日志。
关于control的设置,除了默认的SHORTP_POLICY为720小时,LONGP_POLICY为8760小时外,我们可以修改这2个时间:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
adrci> set control (SHORTP_POLICY = 360) adrci> set control (LONGP_POLICY = 2160) adrci> adrci> show control ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* ADRID SHORTP_POLICY LONGP_POLICY LAST_MOD_TIME LAST_AUTOPRG_TIME LAST_MANUPRG_TIME ADRDIR_VERSION ADRSCHM_VERSION ADRSCHMV_SUMMARY ADRALERT_VERSION CREATE_TIME -------------------- -------------------- -------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- -------------------- -------------------- -------------------- -------------------- ---------------------------------------- 799124850 360 2160 2012-06-23 00:11:47.106000 +08:00 2011-07-08 23:00:22.380000 +08:00 2012-06-22 23:36:14.731000 +08:00 1 2 76 1 2011-05-08 17:12:01.015000 +08:00 1 rows fetched adrci> |
这样就改成了tracefiles保留360小时,即15天,incident file保留2160小时,即90天。
上述就是oracle自己保留日志的期限,但是我们也可以通过purge这个命令来手工的清空。关于purge这个命令,我会在介绍incident和problem之后,再介绍。
下面我们来看看关于incident和proble的显示。我们人为的制造一个incident
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 |
adrci> show incident ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* 0 rows fetched adrci> adrci> adrci> adrci> show incident ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* 0 rows fetched adrci> sys@ORA11G(192.168.1.106)> select OBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID from dba_objects where owner='TEST' and object_name='TB_TEST'; OBJECT_NAME OBJECT_ID DATA_OBJECT_ID ------------------------------ ---------- -------------- TB_TEST 74270 74270 Elapsed: 00:00:00.00 sys@ORA11G(192.168.1.106)> sys@ORA11G(192.168.1.106)> sys@ORA11G(192.168.1.106)> update sys.tab$ set OBJ# = 999999 where OBJ# = 74270; 1 row updated. Elapsed: 00:00:00.00 sys@ORA11G(192.168.1.106)> commit; Commit complete. Elapsed: 00:00:00.03 sys@ORA11G(192.168.1.106)> alter system checkpoint; System altered. Elapsed: 00:00:00.26 sys@ORA11G(192.168.1.106)> sys@ORA11G(192.168.1.106)> sys@ORA11G(192.168.1.106)> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down. sys@ORA11G(192.168.1.106)> startup ORACLE instance started. Total System Global Area 778387456 bytes Fixed Size 1374808 bytes Variable Size 310379944 bytes Database Buffers 461373440 bytes Redo Buffers 5259264 bytes Database mounted. Database opened. sys@ORA11G(192.168.1.106)>conn test/test test@ORA11G(192.168.1.106)> desc tb_test ERROR: ORA-03113: end-of-file on communication channel Process ID: 640 Session ID: 137 Serial number: 2 test@ORA11G(192.168.1.106)> test@ORA11G(192.168.1.106)> |
我们来看看show incident和show problem:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
adrci> show incident ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* INCIDENT_ID PROBLEM_KEY CREATE_TIME -------------------- ----------------------------------------------------------- ---------------------------------------- 32620 ORA 7445 [kqldcdp()+78] 2012-06-24 22:13:56.109000 +08:00 1 rows fetched adrci> show problem ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* PROBLEM_ID PROBLEM_KEY LAST_INCIDENT LASTINC_TIME -------------------- ----------------------------------------------------------- -------------------- ---------------------------------------- 2 ORA 7445 [kqldcdp()+78] 32620 2012-06-24 22:13:56.109000 +08:00 1 rows fetched adrci> |
我们看到分别有一个incident和problem。
我们再次desc一次这个表:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
sys@ORA11G(192.168.1.106)> conn test/test Connected. test@ORA11G(192.168.1.106)> test@ORA11G(192.168.1.106)> test@ORA11G(192.168.1.106)> desc tb_test ERROR: ORA-03113: end-of-file on communication channel Process ID: 7244 Session ID: 16 Serial number: 24 test@ORA11G(192.168.1.106)> adrci> show incident ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* INCIDENT_ID PROBLEM_KEY CREATE_TIME -------------------- ----------------------------------------------------------- ---------------------------------------- 32620 ORA 7445 [kqldcdp()+78] 2012-06-24 22:13:56.109000 +08:00 32580 ORA 7445 [kqldcdp()+78] 2012-06-24 22:17:23.890000 +08:00 2 rows fetched adrci> show problem ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* PROBLEM_ID PROBLEM_KEY LAST_INCIDENT LASTINC_TIME -------------------- ----------------------------------------------------------- -------------------- ---------------------------------------- 2 ORA 7445 [kqldcdp()+78] 32580 2012-06-24 22:17:23.890000 +08:00 1 rows fetched adrci> |
我们看到有2个incident和一个problem。我们可以理解成problem是重复的incident。
注意在这里incident和problem还可以用如下的用法:
1 2 |
show incident -mode detail -p "incident_id=123" show problem -p "problem_id=2" |
更多信息见help show incident或help show problem。
OK,我们现在可以将相关的incident的文件打包给oracle,上传到SR上。我们这里用到一个IPS(incident package service)服务。
我们先创建一个逻辑的package:
比如我们针对上面的2个incident和1个problem,我们可以有各种选择:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
1.什么参数都不带,表示以默认的level typical, 里面无任何内容。 adrci> ips create package Created package 1 without any contents, correlation level typical 2.加incident参数,指明某个incident。默认level还是typical。 adrci> adrci> ips create package incident 32620 Created package 2 based on incident id 32620, correlation level typical adrci> 3.加incident参数,指定某个incident,设置level为all。 adrci> ips create package incident 32580 correlate all Created package 3 based on incident id 32580, correlation level all adrci> 4.加problem参数,指定某个problem。默认level还是typical。 adrci> ips create package problem 2 Created package 4 based on problem id 2, correlation level typical 5.加problem参数,指定某个problem,设置level为basic。 adrci> adrci> ips create package problem 2 correlate basic Created package 5 based on problem id 2, correlation level basic adrci> adrci> 6.加time参数,设置开始时间to结束时间,后面的+8:00表示东八区。 adrci> ips create package time '2012-06-23 00:00:00 +08:00' to '2012-06-24 23.00.00 +08:00' Created package 6 based on time range 2012-06-23 00:00:00.000000 +08:00 to 2012-06-24 23:00:00.000000 +08:00, correlation level typical adrci> |
我们看到,已经有6个逻辑上的package被建立,在incpkg也有了6个对于的目录。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
E:\ora11g\app\Administrator\diag\rdbms>tree 卷 DOCUMENT 的文件夹 PATH 列表 卷序列号为 5A21-A80E E:. └─ora11g └─ora11g ├─alert ├─cdump ├─hm ├─incident │ ├─incdir_32580 │ └─incdir_32620 ├─incpkg │ ├─pkg_1 │ │ └─seq_1 │ ├─pkg_2 │ │ └─seq_1 │ ├─pkg_3 │ │ └─seq_1 │ ├─pkg_4 │ │ └─seq_1 │ ├─pkg_5 │ │ └─seq_1 │ └─pkg_6 │ └─seq_1 ├─ir ├─lck ├─metadata ├─stage ├─sweep └─trace ├─cdmp_20120624221402 └─cdmp_20120624221725 E:\ora11g\app\Administrator\diag\rdbms> |
我们上传给oracle的话,还需要建立真正物理上的package。
1 2 3 |
adrci> ips GENERATE PACKAGE 2 Generated package 2 in file E:\ORA7445kq_20120624223659_COM_1.zip, mode complete adrci> |
如果我们要自定义路径,加in参数即可:
1 2 3 |
adrci> ips GENERATE PACKAGE 3 in E:\ora11g\app\Administrator\diag\rdbms\ora11g\ora11g\jimmy_ips Generated package 3 in file E:\ora11g\app\Administrator\diag\rdbms\ora11g\ora11g\jimmy_ips\ORA7445kq_20120624223726_COM_2.zip, mode complete adrci> |
注意ips的调用和perl有关。如果你的机器上装有多个版本的oracle,请注意环境变量的中的perl的路径。不然会有类似的报错:
1 2 3 4 5 6 7 8 |
adrci> ips GENERATE PACKAGE 5 in E:\ora11g\app\Administrator\diag\rdbms\ora11g\ora11g\jimmy_ips Perl 5.006 required--this is only version 5.00503, stopped at E:\ora10g\oracle\product\10.2.0\db_1\perl\5.8.3\lib/vars.pm line 3. BEGIN failed--compilation aborted at E:\ora10g\oracle\product\10.2.0\db_1\perl\5.8.3\lib/vars.pm line 3. BEGIN failed--compilation aborted at E:\ora10g\oracle\product\10.2.0\db_1\perl\site\5.8.3\lib/Win32/TieRegistry.pm line 14. Generated package 5 in file E:\ora11g\app\Administrator\diag\rdbms\ora11g\ora11g\jimmy_ips\ORA7445kq_20120624223801_COM_2.zip, mode complete DIA-49441: Warnings while finalizing package, details in file e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g\incpkg\pkg_5\seq_2\finalize.log adrci> |
这样就可以把zip发给oracle诊断了。
好了,最好,我们来打扫战场。
(1)清除incident,我们可以用purge命令:
1 2 3 4 5 6 7 8 |
adrci> purge -age 2 adrci> show incident ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* 0 rows fetched adrci> |
(2)清除problem,我们可以用delete命令:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
adrci> show problem ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* PROBLEM_ID PROBLEM_KEY LAST_INCIDENT LASTINC_TIME ——————– ———————————————————– ——————– ————————– 1 ORA 7445 [kqrprl()+92] 22970 2012-04-20 13:42:14.953000 1 rows fetched adrci> adrci> adrci> adrci> delete from problem where problem_id=1 1 Row Deleted adrci> adrci> adrci> show problem ADR Home = e:\ora11g\app\administrator\diag\rdbms\ora11g\ora11g: ************************************************************************* 0 rows fetched adrci> adrci> |
本来还想多写点,到后来猛然发现原来这货是为了给oracle开SR的时候,打包各类日志提供方便。这货的主要目的还是为他人做嫁衣裳了。
DBA的关键素质还是的自己能看trace,分析trace,不能把所有的希望寄托在metalink上。