今天在建表空间的时候,数据库不知道被谁停了,在建的时候报oracle not avaliable,等再次确认没有“雷锋”帮我shutdown数据库之后,startup了数据库。启动时,没有报错,提示数据库已经mount、已经open。
可是,再次重建表空间的时候,却报错了数据文件已经存在:
1 2 3 4 5 6 |
SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse 2 extent management local segment space management auto uniform size 1024k; create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse * ERROR at line 1: ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database |
检查了一下这个数据文件,发现在v$datafile中能找到,且状态是offine状态,但是在dba_data_files中找不到:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- --------------------------------------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> select tablespace_name ,file_name from dba_data_files; TABLESPACE_NAME FILE_NAME ------------------------------ -------------------------------------------------------- SYSTEM /opt/oradata/dfile/system01.dbf UNDOTBS1 /opt/oradata/dfile/undotbs01.dbf INDX /opt/oradata/dfile/indx01.dbf TOOLS /opt/oradata/dfile/tools01.dbf USERS /opt/oradata/dfile/users01.dbf SQL> |
为了能新(重)建表空间,尝试把表空间drop,但是报错表空间已经不存在:
1 2 3 4 5 |
SQL> drop tablespace TS_DATA_R including CONTENTS and datafiles; drop tablespace TS_DATA_R including CONTENTS and datafiles * ERROR at line 1: ORA-00959: tablespace 'TS_DATA_R' does not exist |
那只能先offline drop 数据文件了(数据库在非归档模式):
1 2 3 4 5 |
SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop; Database altered. SQL> |
ok!!在此以为总大功告成了吧,来我们重建表空间!可是接下来的事情却让人那么的沮丧:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse 2 extent management local segment space management auto uniform size 1024k; create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse * ERROR at line 1: ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database SQL> drop tablespace TS_DATA_R including CONTENTS and datafiles; drop tablespace TS_DATA_R including CONTENTS and datafiles * ERROR at line 1: ORA-00959: tablespace 'TS_DATA_R' does not exist SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- ----------------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop; Database altered. SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- ------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 OFFLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> drop tablespace TS_DATA_R; drop tablespace TS_DATA_R * ERROR at line 1: ORA-00959: tablespace 'TS_DATA_R' does not exist |
由于ts_data_r_001g_001.dbf文件一直存在,无法offline drop,无论怎么操作,都能在v$datafile中看到它,那么我索性就干脆先把它online在尝试offline drop吧:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 |
SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' online; Database altered. SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- ---------------------------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 ONLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> alter database open; Database altered. SQL> select tablespace_name ,file_name from dba_data_files; TABLESPACE_NAME FILE_NAME ------------------------------ ------------------------------------ SYSTEM /opt/oradata/dfile/system01.dbf UNDOTBS1 /opt/oradata/dfile/undotbs01.dbf INDX /opt/oradata/dfile/indx01.dbf TOOLS /opt/oradata/dfile/tools01.dbf USERS /opt/oradata/dfile/users01.dbf SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 1646861168 bytes Fixed Size 453488 bytes Variable Size 687865856 bytes Database Buffers 956301312 bytes Redo Buffers 2240512 bytes Database mounted. Database opened. SQL> select tablespace_name ,file_name from dba_data_files; TABLESPACE_NAME FILE_NAME ------------------------------ ---------------------------------- SYSTEM /opt/oradata/dfile/system01.dbf UNDOTBS1 /opt/oradata/dfile/undotbs01.dbf INDX /opt/oradata/dfile/indx01.dbf TOOLS /opt/oradata/dfile/tools01.dbf USERS /opt/oradata/dfile/users01.dbf SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- --------------------------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 ONLINE /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> SQL> SQL> SQL> SQL> drop tablespace TS_DATA_R including contents and datafiles; drop tablespace TS_DATA_R including contents and datafiles * ERROR at line 1: ORA-00959: tablespace 'TS_DATA_R' does not exist SQL> alter database datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf'offline drop; Database altered. SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- --------------------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 RECOVER /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. SQL> |
这里,终于看到ts_data_r_001g_001.dbf再次执行offline drop之后,状态变成recover了……(但是也奇怪,如果offline drop了,这里应该也是看不到才对呀)。
好,我们现在再试试重建表空间:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse 2 extent management local segment space management auto uniform size 1024k; create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' reuse * ERROR at line 1: ORA-01537: cannot add data file '/opt/oradata/dfile/ts_data_r_001g_001.dbf' - file already part of database SQL> alter database datafile 6 offline drop; Database altered. SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- ------------------------------ 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf 6 RECOVER /opt/oradata/dfile/ts_data_r_001g_001.dbf 6 rows selected. |
不幸的是,这个该死的ts_data_r_001g_001.dbf始终无法去掉!!
开始想偏方了……我为何不建一个表空间ts_data_r,且用另外一个数据文件ts_data_r_001g_002.dbf,等建立之后,我再drop tablespace,这样不就可以把ts_data_r_001g_001.dbf一起干掉了吗?
1 2 3 4 5 6 |
SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_002.dbf' size 1024m 2 extent management local segment space management auto uniform size 1024k; create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_002.dbf' size 1024m * ERROR at line 1: ORA-00603: ORACLE server session terminated by fatal error |
orz……终于把oracle惹毛了,直接把我的session断掉了!!此时去看alertlog中,一堆600的报错:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
…… [oracle@localhost bdump]$ tail -400 alert_ora9i.log ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], [] Mon Mar 31 15:38:02 2008 Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc: ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], [] Mon Mar 31 15:38:02 2008 Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc: ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], [] Mon Mar 31 15:38:02 2008 Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc: ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], [] Mon Mar 31 15:38:02 2008 Errors in file /oracle/app/oracle/admin/ora9i/udump/ora9i_ora_7816.trc: ORA-00600: internal error code, arguments: [kccocx_01], [], [], [], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25015], [6], [7], [6], [], [], [], [] ORA-00600: internal error code, arguments: [25013], [0], [6], [TS_DATA_R], [TS_DATA_R], [6], [7], [] …… |
仔细看了一下报600错之前的alertlog,偶滴神!!原来alertlog中在意外宕机再次启动的时候,已经有报错信息和相关处理建议了。本来觉得v$datafile是mount时候能查到的视图,dba_data_files是open时候能查到的视图,2者不一致,能用绝招就是重建控制文件,但是……死活想用别的办法解决,就是不想重建控制文件!!-_-!!偶TMD还真固执:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
…… SMON: enabling tx recovery Fri Mar 28 11:51:03 2008 Database Characterset is ZHS16GBK replication_dependency_tracking turned off (no async multimaster replication found) Completed: ALTER DATABASE OPEN Mon Mar 31 14:52:03 2008 create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' size 1024m extent management local segment space management auto uniform size 1024k Mon Mar 31 14:52:12 2008 WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING... WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING... WARNING! CONTROLFILE SEQUENCE NUMBER TOO OLD, RE-READING... ********************* ATTENTION: ******************** The controlfile header block returned by the OS has a sequence number that is too old. The controlfile might be corrupted. PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE without following the steps below. RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE TO THE DATABASE, if the controlfile is truly corrupted. In order to re-start the instance safely, please do the following: (1) Save all copies of the controlfile for later analysis and contact your OS vendor and Oracle support. (2) Mount the instance and issue: ALTER DATABASE BACKUP CONTROLFILE TO TRACE; (3) Unmount the instance. (4) Use the script in the trace file to RE-CREATE THE CONTROLFILE and open the database. ***************************************************** Instance terminated by DBW0, pid = 4437 …… |
最后,shutdown immedate,挂死,pmon切出大量600报错的日志,再尝试shutdown abort,终于关闭。再次startup,smon切出大量日志,提示数据库mount、数据库open,open之后,用alter database backup controlfile to trace as ‘/oracle/111.txt’获取重建控制文件的脚本,修改脚本中ts_data_r_001g_001.dbf存在的这个文件。再尝试shutdown immediate,正常关闭,再次startup 到nomount后重建控制文件,open resetlogs,终于……搞定了!:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
SQL> select TS#,status,name from v$datafile; TS# STATUS NAME ---------- ------- ---------------------- 0 SYSTEM /opt/oradata/dfile/system01.dbf 1 ONLINE /opt/oradata/dfile/undotbs01.dbf 3 ONLINE /opt/oradata/dfile/indx01.dbf 4 ONLINE /opt/oradata/dfile/tools01.dbf 5 ONLINE /opt/oradata/dfile/users01.dbf SQL> select tablespace_name ,file_name from dba_data_files; TABLESPACE_NAME FILE_NAME ------------------------------ --------------------- SYSTEM /opt/oradata/dfile/system01.dbf UNDOTBS1 /opt/oradata/dfile/undotbs01.dbf INDX /opt/oradata/dfile/indx01.dbf TOOLS /opt/oradata/dfile/tools01.dbf USERS /opt/oradata/dfile/users01.dbf SQL> SQL> SQL> create tablespace ts_data_r datafile '/opt/oradata/dfile/ts_data_r_001g_001.dbf' size 1024m reuse 2 extent management local segment space management auto uniform size 1024k; Tablespace created. |
本次故障解决的过程其实难度并不大,但是给了我们一个教训:对于异常宕机,再次启动,虽然在启动的时候不提示什么,但是只要是异常宕机,就一定要看alertlog!!
6条评论
alert应当实时监控,有错误就发告警短信和邮件
明显没有看我的blog
这个错误和提示,我们已经出现过一次了,哈哈
re NinGoo:你们alertlog是怎么做监控的呢,是grep ORA的关键字吗?
re David.Guo:你的网站偶也常去的,不过,不可能每篇文章都记得住呀~~
to NinGoo,俺们的db是电话通知,呵呵
没有24*7monitor scripts?真可怜的数据库…
其实,DBA不仅是个技术职位,管理和沟通也很重要,如果从管理制度上杜绝大于一个技术人员在同一时间对数据库进行操作就不会出现这个问题;同时,如果是很好的沟通,这个问题应该也可以避免。其实,绝大部分的问题都是人为操作不当,而混乱的管理和不畅的沟通扮演着重要角色。技术、沟通能力和项目管理能力是DBA,特别是高级DBA的必备技能。