最近在做一个省的调优时发现该省的数据库总是不定期的出现业务堵塞,而且晚上出现的情况更加严重些。为了分析该情况,部署了一个statspack,每隔10分钟收集一次数据。 发现在发生堵塞的时候,statspack中top 5 event的第一位是enqueue: [crayon-662edba5b9a5 […]
recover时报错ora-289,有意思
oracle的alter database recover语句实在很弱智,明明目录中有arch文件,却不去找这个文件。 看下面的2个例子:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
-- 例子1: sys@ORALOCAL(10.1.26.26)> alter database datafile 'e:\test02.dbf' offline; Database altered. Elapsed: 00:00:02.73 sys@ORALOCAL(10.1.26.26)> sys@ORALOCAL(10.1.26.26)> alter database recover datafile 'e:\test02.dbf'; Database altered. Elapsed: 00:01:00.25 sys@ORALOCAL(10.1.26.26)> alter database datafile 'e:\test02.dbf' online; Database altered. Elapsed: 00:00:00.43 sys@ORALOCAL(10.1.26.26)> |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 |
-- 例子2: sys@ORALOCAL(10.1.26.26)> alter database datafile 'e:\test02.dbf' offline; Database altered. Elapsed: 00:00:00.50 sys@ORALOCAL(10.1.26.26)> alter system switch logfile; System altered. Elapsed: 00:00:01.03 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:05.14 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:00.15 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:05.28 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:00.15 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:00.21 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:00.32 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:00.35 sys@ORALOCAL(10.1.26.26)> / System altered. Elapsed: 00:00:05.17 sys@ORALOCAL(10.1.26.26)> alter database recover datafile 'e:\test02.dbf'; alter database recover datafile 'e:\test02.dbf' * ERROR at line 1: ORA-00279: change 801109 generated at 12/08/2008 13:21:12 needed for thread 1 ORA-00289: suggestion : D:\ORACLE\PRODUCT\10.2.0\FLASH_RECOVERY_AREA\ORALOCAL\ARCHIVELOG\2008_12_08\O1_MF_1_55_%U_.ARC ORA-00280: change 801109 for thread 1 is in sequence #55 Elapsed: 00:00:00.59 sys@ORALOCAL(10.1.26.26)> sys@ORALOCAL(10.1.26.26)> sys@ORALOCAL(10.1.26.26)> alter database recover cancel; Database altered. Elapsed: 00:00:01.18 sys@ORALOCAL(10.1.26.26)> sys@ORALOCAL(10.1.26.26)> sys@ORALOCAL(10.1.26.26)> alter database recover automatic datafile 'e:\test02.dbf'; Database altered. Elapsed: 00:00:02.29 sys@ORALOCAL(10.1.26.26)> alter database datafile 'e:\test02.dbf' online; Database altered. Elapsed: 00:00:00.39 sys@ORALOCAL(10.1.26.26)> |
可见,不加automatic […]
drop user报错ora-942
今天在一个新建的测试库上drop user的时候,报错ora-942:
1 2 3 4 5 6 7 8 9 10 |
sys@ORALOCAL(10.1.26.26)> drop user HEJIANM cascade; drop user HEJIANM cascade * ERROR at line 1: ORA-00604: error occurred at recursive SQL level 1 ORA-00942: table or view does not exist ORA-06512: at line 11 Elapsed: 00:00:10.50 |
对于942的报错,我们常用的分析方法是设置一个942的errorstack或者10046的trace,在这里,我用了10046的方式来trace这个问题: [cr […]
一次cpu的user比例过高的调优
近期遇一个省的数据库说总是繁忙,idle很小,在凌晨稍微跑几个比较大的统计脚本,idle就出现0的情况,导致业务发生堵塞。 经过连续几天的观察,发现cpu曲线大致如下: 2008年11月18日: 和白鳝交流了一下,老白说cpu的user%比例过高可能是由于buffer get比较高的sql引起。可以 […]
手工安装数据库和创建实例
有时候,我们没有xwindows的界面,需要在silent模式下创建数据库,我们可以用如下的方式创建。 下面的例子是以windows xp,oracle 10201版本。 修改安装目录中的response目录下的custom.rsp文件,可以copy一个后修改。 注意下面的#need change! […]
AQ导致不能drop user cascade
由于工程割接,在新数据库上测试imp一个用户需要多少时间,在把imp的报错慢慢修正后,再一次drop user进行重新导入,却发现不能drop user了:
1 2 3 4 5 6 |
SQL> drop user newsims cascade; drop user newsims cascade * ERROR at line 1: ORA-00604: error occurred at recursive SQL level 1 ORA-24005: must use DBMS_AQADM.DROP_QUEUE_TABLE to drop queue tables |
但是在当前用户下查询user_query_tables会发现没有记录: [c […]
truncate异常中断后,发生DFS LOCK HANDLE
环境: oracle 9206 三节点rac。hp 11.11。 【10月29日,故障发生】 在一次truncate一个大表的过程中(以下操作均在节点1(gdrac1)上操作):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
SQL> select sum(bytes)/1024/1024/1024 size_g from user_segments where segment_name='ALERTBUFF'; SIZE_G ---------- 329.8125 1* select tablespace_name,PARTITIONED from user_tables where table_name='ALERTBUFF' SQL> / TABLESPACE_NAME PAR ------------------------------ --- DATA_DYNAMIC NO SQL> select EXTENT_MANAGEMENT,SEGMENT_SPACE_MANAGEMENT from user_tablespaces where tablespace_name='DATA_DYNAMIC'; EXTENT_MAN SEGMEN ---------- ------ LOCAL MANUAL SQL> select index_name,tablespace_name,PARTITIONED from user_indexes where table_name='ALERTBUFF'; INDEX_NAME TABLESPACE_NAME PAR ------------------------------ ------------------------------ --- IDX_ALERTBUFF_ALERTTIME DATA_DYNAMIC NO PK_ALERTBUFF DATA_DYNAMIC NO |
我们执行了truncate table xxx reu […]
遭遇ora-600[20022]
在近期进行十一数据库巡检的时候,某省的同事发现在数据库的alertlog中有关于ora-600的报错:
1 2 3 4 5 6 7 |
Sun Sep 21 19:20:52 2008 Errors in file /oracle/app/oracle/admin/hbmisc/udump/hbmisc_ora_13668.trc: ORA-00600: internal error code, arguments: [20022], [2], [], [], [], [], [], [] Sun Sep 21 19:21:15 2008 Errors in file /oracle/app/oracle/admin/hbmisc/udump/hbmisc_ora_13072.trc: ORA-00600: internal error code, arguments: [20022], [2], [], [], [], [], [], [] Sun Sep 21 19:38:09 2008 |
进一步查看了trace文件,发现是一个sql有关:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
*** SESSION ID:(172.1955) 2008-09-21 19:20:52.752 *** 2008-09-21 19:20:52.752 ksedmp: internal or fatal error ORA-00600: internal error code, arguments: [20022], [2], [], [], [], [], [], [] Current SQL statement for this session: select count(*) into :b0 from (select icpcode ,icpservid ,a.subsid subsid1 ,to_char(startdate,'YYYYMMDDHH24MISS') startdatetime ,to_char(enddate,'YYYYMMDDHH24MISS') enddatetime ,tab_a.status status1 ,b.subsid subsid2 ,decode(nvl(d.status,'A'),'A',0,2) status2 ,tab_a.visitcount visitcount from tab_a ,tab_b a ,icp ,service ,tab_b b ,tab_c d where (((((((nvl(tab_a.chargemid,tab_a.mid)=a.mid and tab_a.icpid=icp.icpid) and tab_a.servid=service.servid) and tab_a.status<>'C') and (tab_a.mid=:b1 or tab_a.chargemid=:b1)) and tab_a.mid=b.mid) and tab_a.mid=d.mid(+)) and tab_a.servid=d.servid(+)) union select icpcode ,icpservid ,a.subsid subsid1 ,to_char(startdate,'YYYYMMDDHH24MISS') startdatetime ,to_char(startdate,'YYYYMMDDHH24MISS') enddatetime ,'A' status1 ,a.subsid subsid2 ,1 status2 ,tab_a_freeuse.visitcount visitcount from tab_a_freeuse ,tab_b a ,icp ,service where (((tab_a_freeuse.icpid=icp.icpid and tab_a_freeuse.servid=service.servid) and tab_a_freeuse.mid=:b1) and tab_a_freeuse.mid=a.mid)) ----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ksedmp()+184 ? ksedst() 40000000017F5B37 ? 000000032 ? 800003FAC0006570 ? 000000000 ? …… |
为什么一个s […]
谨慎的使用shutdown abort
今天在重启一个库的时候,由于等了超过半小时,仍然没有完成数据库的close,于是就用shutdown abort命令关闭数据库。但是在起来的时候,发现在alertlog中有大量的SMON的报错,而且还在持续不断的报错出来。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 |
<--- 正常的启动信息 begin here ---> Mon Sep 1 16:32:02 2008 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 SCN scheme 1 Using log_archive_dest parameter default value LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up ORACLE RDBMS Version: 9.2.0.6.0. System parameters with non-default values: processes = 1200 timed_statistics = TRUE shared_pool_size = 419430400 sga_max_size = 2108652208 large_pool_size = 117440512 java_pool_size = 117440512 spfile = /dev/vg_ora01/rspfile_128m_01 control_files = /dev/vg_ora01/rctrl_128m_01, /dev/vg_ora02/rctrl_128m_02 db_block_size = 8192 db_cache_size = 1258291200 compatible = 9.2.0.0.0 log_archive_start = TRUE log_archive_dest_1 = location=/arch log_archive_format = arch_%t_%s.arc log_buffer = 10485760 db_files = 800 db_file_multiblock_read_count= 16 fast_start_mttr_target = 300 undo_management = AUTO undo_tablespace = UNDOTBS1 undo_suppress_errors = TRUE undo_retention = 10800 remote_login_passwordfile= EXCLUSIVE db_domain = instance_name = gdmocs job_queue_processes = 6 hash_join_enabled = TRUE background_dump_dest = /oracle/app/oracle/admin/gdmocs/bdump user_dump_dest = /oracle/app/oracle/admin/gdmocs/udump core_dump_dest = /oracle/app/oracle/admin/gdmocs/cdump sort_area_size = 524288 db_name = gdmocs open_cursors = 500 star_transformation_enabled= FALSE query_rewrite_enabled = TRUE pga_aggregate_target = 524288000 PMON started with pid=2 DBW0 started with pid=3 LGWR started with pid=4 CKPT started with pid=5 SMON started with pid=6 RECO started with pid=7 CJQ0 started with pid=8 Mon Sep 1 16:32:03 2008 ARCH: STARTING ARCH PROCESSES ARC0 started with pid=9 ARC0: Archival started ARC1 started with pid=10 ARC1: Archival started Mon Sep 1 16:32:03 2008 ARCH: STARTING ARCH PROCESSES COMPLETE Mon Sep 1 16:32:03 2008 ARC1: Thread not mounted Mon Sep 1 16:32:03 2008 ARC0: Thread not mounted Mon Sep 1 16:32:03 2008 ALTER DATABASE MOUNT Mon Sep 1 16:32:07 2008 Successful mount of redo thread 1, with mount id 2193310019 Mon Sep 1 16:32:07 2008 Database mounted in Exclusive Mode. Completed: ALTER DATABASE MOUNT Mon Sep 1 16:32:07 2008 <--- 正常的启动信息 end here ---> <--- 开始打开数据,发现需要做实例恢复 begin here ---> ALTER DATABASE OPEN Mon Sep 1 16:32:08 2008 Beginning crash recovery of 1 threads Mon Sep 1 16:32:08 2008 Started redo scan Mon Sep 1 16:32:09 2008 Completed redo scan 27274 redo blocks read, 46702 data blocks need recovery Mon Sep 1 16:35:20 2008 <--- 开始打开数据,发现需要做实例恢复 end ---> <--- 开始实例恢复 ---> Started recovery at Thread 1: logseq 24462, block 231548, scn 0.0 Mon Sep 1 16:35:20 2008 Recovery of Online Redo Log: Thread 1 Group 1 Seq 24462 Reading mem 0 Mem# 0 errs 0: /dev/vg_ora01/rredo_256m_01 Mem# 1 errs 0: /dev/vg_ora02/rredo_256m_11 Mon Sep 1 16:35:22 2008 Completed redo application Mon Sep 1 16:35:32 2008 Ended recovery at Thread 1: logseq 24462, block 258822, scn 13.3684259989 46702 data blocks read, 46201 data blocks written, 27274 redo blocks read Crash recovery completed successfully Mon Sep 1 16:35:33 2008 LGWR: Primary database is in CLUSTER CONSISTENT mode Thread 1 advanced to log sequence 24463 Thread 1 opened at log sequence 24463 Current log# 3 seq# 24463 mem# 0: /dev/vg_ora01/rredo_256m_03 Current log# 3 seq# 24463 mem# 1: /dev/vg_ora02/rredo_256m_13 Successful open of redo thread 1 Mon Sep 1 16:35:33 2008 <--- 开始前滚 ---> SMON: enabling cache recovery Mon Sep 1 16:35:33 2008 ARC0: Evaluating archive log 1 thread 1 sequence 24462 ARC0: Beginning to archive log 1 thread 1 sequence 24462 Creating archive destination LOG_ARCHIVE_DEST_1: '/arch/arch_1_24462.arc' Mon Sep 1 16:35:34 2008 Successfully onlined Undo Tablespace 1. Mon Sep 1 16:35:34 2008 <--- 开始回滚 ---> SMON: enabling tx recovery Mon Sep 1 16:35:34 2008 Database Characterset is ZHS16GBK Mon Sep 1 16:35:34 2008 <--- 开始出现大量的SMON报错 ---> SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available Mon Sep 1 16:35:35 2008 replication_dependency_tracking turned off (no async multimaster replication found) Mon Sep 1 16:35:35 2008 SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available <--- 实例恢复完成,数据库open ---> Mon Sep 1 16:35:35 2008 Completed: ALTER DATABASE OPEN Mon Sep 1 16:35:35 2008 SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available SMON: about to recover undo segment 60 SMON: mark undo segment 60 as available |
此时,数据 […]
不要删除你所有的归档日志
今天遇到一个问题,同事来说归档目录已经全部删除了,但是归档目录的大小还是100%,且执行shutdown immediate挂死。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
$ bdf Filesystem kbytes used avail %used Mounted on /dev/vg00/lvol3 262144 181520 80040 69% / /dev/vg00/lvol1 314736 84360 198896 30% /stand /dev/vg00/lvol8 4718592 1056280 3633752 23% /var /dev/vg00/lvol7 2654208 1435160 1209576 54% /usr /dev/vg00/lvol4 1540096 1028512 508264 67% /tmp /dev/vg00/lvol9 10485760 4948670 5365096 48% /oracle /dev/vg00/lvol6 94208000 7317328 86211936 8% /opt /dev/vg00/lvol5 163840 2480 160104 2% /home /dev/vg00/lv_arch 51216384 51112984 102632 100% /arch $ cd /arch $ ll total 0 drwxr-xr-x 2 oracle dba 96 Dec 18 2007 lost+found $ |
同时alertlog中也在不断的报错:
1 2 3 4 5 |
ORA-16014: log 2 sequence# 28801 not archived, no available destinations ORA-00312: online log 2 thread 1: '/dev/vg_audit01/rredo_128m_02' ORA-00312: online log 2 thread 1: '/dev/vg_audit01/rredo_128m_12' ORA-00312: online log 2 thread 1: '/dev/vg_audit01/rredo_128m_22' Mon Sep 1 10:28:55 2008 |
HP中无法使用@符号
在一次测试中,发现在oracle用户下无法使用@符号,测试用tnsnames登录的时候,无法识别@符号:
1 2 |
$ sqlplus user/pwd@hnadt sh: hnadt: not found. |
但是tnsping却能ping通的,检查tnsnames.ora文件也没问题,同时,用vi打开profile时,也不能添加使用@ […]
开启异步IO的相关风险
由于IO的请求不是每次等待完成指令后再发送下一个请求,而是存在于队列中,且遵循FIFO。因此如果遇到存储掉电的情况,就可能会出现数据的不一致。虽然这种情况出现的可能性不大,因为存储中有电池,能保证cache中的信息写到存储中。但是在这里还是提一下数据丢失的风险。 由于我们的异步IO的队列中是针对使用 […]