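Today a colleague from one of the provinces came to me saying there was a problem with their backup. The log suggested the backup had succeeded, but the Legato console still showed a red cross. So I logged in to the Legato console and ran the backup again; the monitor showed the following log: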
    01/05/10 16:46:23 nsrd: savegroup info: starting OracleArch (with 1 client(s))
    01/05/10 16:46:42 savegrp: jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch will retry 1 more time(s)
    01/05/10 16:46:56 savegrp: jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch will retry 0 more time(s)
    01/05/10 16:46:56 nsrd: media info: suggest mounting 000700L2 on jx_bak01 for writing to pool 'index'
    01/05/10 16:46:56 nsrd: media waiting event: Waiting for 1 writable volumes to backup pool 'index' tape(s) on jx_bak01
    01/05/10 16:46:56 nsrd: media info: loading volume 000700L2 into \\.\Tape0
    01/05/10 16:47:06 nsrmmd #13: Start nsrmmd #13, with PID 8020, at HOST jx_bak01
    01/05/10 16:47:13 nsrd: \\.\Tape0 1:Verify label operation in progress
    01/05/10 16:47:34 nsrd: \\.\Tape0 1:Mount operation in progress
    01/05/10 16:48:00 nsrd: media event cleared: Waiting for 1 writable volumes to backup pool 'index' tape(s) on jx_bak01
    01/05/10 16:48:00 nsrd: jx_bak01:index:jx_db saving to pool 'index' (000700L2)
    01/05/10 16:48:01 nsrd: jx_bak01:index:jx_db done saving to pool 'index' (000700L2) 93 KB
    01/05/10 16:48:06 nsrd: jx_bak01:bootstrap saving to pool 'index' (000700L2)
    01/05/10 16:48:07 nsrmmdbd: media db is saving its data. This may take a while.
    01/05/10 16:48:07 nsrmmdbd: media db is open for business.
    01/05/10 16:48:08 nsrd: jx_bak01:bootstrap done saving to pool 'index' (000700L2) 2645 KB
    01/05/10 16:48:15 nsrd: savegroup info: Added 'jx_bak01' to the group 'OracleArch' for bootstrap backup.
    01/05/10 16:48:15 nsrd: savegroup alert: OracleArch completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (jx_db Failed)
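Judging from the log, Legato went straight to writing the index without first running the RMAN backup. Normally, Legato uses NMO to invoke the RMAN backup first, and writes the index only after that backup has finished.
I checked with the local colleague: the backup had indeed not succeeded, and the archived logs were still sitting in place.
Looking further at c:\program files\legato\logs\daemon.log on the backup server: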
    01/05/10 16:46:23 nsrd: savegroup info: starting OracleArch (with 1 client(s))
    01/05/10 16:46:42 savegrp: jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch will retry 1 more time(s)
    01/05/10 16:46:56 savegrp: jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch will retry 0 more time(s)
    01/05/10 16:46:56 nsrd: media info: suggest mounting 000700L2 on jx_bak01 for writing to pool 'index'
    01/05/10 16:46:56 nsrd: media waiting event: Waiting for 1 writable volumes to backup pool 'index' tape(s) on jx_bak01
    01/05/10 16:46:56 nsrd: media info: loading volume 000700L2 into \\.\Tape0
    01/05/10 16:47:06 nsrmmd #13: Start nsrmmd #13, with PID 8020, at HOST jx_bak01
    01/05/10 16:47:13 nsrd: \\.\Tape0 1:Verify label operation in progress
    01/05/10 16:47:34 nsrd: \\.\Tape0 1:Mount operation in progress
    01/05/10 16:48:00 nsrd: media event cleared: Waiting for 1 writable volumes to backup pool 'index' tape(s) on jx_bak01
    01/05/10 16:48:00 nsrd: jx_bak01:index:jx_db saving to pool 'index' (000700L2)
    01/05/10 16:48:01 nsrd: jx_bak01:index:jx_db done saving to pool 'index' (000700L2) 93 KB
    01/05/10 16:48:06 nsrd: jx_bak01:bootstrap saving to pool 'index' (000700L2)
    01/05/10 16:48:07 nsrmmdbd: media db is saving its data. This may take a while.
    01/05/10 16:48:07 nsrmmdbd: media db is open for business.
    01/05/10 16:48:08 nsrd: jx_bak01:bootstrap done saving to pool 'index' (000700L2) 2645 KB
    01/05/10 16:48:15 nsrd: savegroup info: Added 'jx_bak01' to the group 'OracleArch' for bootstrap backup.
    01/05/10 16:48:15 nsrd: savegroup alert: OracleArch completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (jx_db Failed)
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch nsrnmostart returned status of 255
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch /opt/networker/bin/nsrnmo1 exiting.
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch RMAN script execution is not successful.
    RMAN exited with return code '1'.
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch 1 retry attempted
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch nsrnmostart returned status of 255
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch /opt/networker/bin/nsrnmo1 exiting.
    * jx_db:/oracle/app/oracle/product/9.2.0/bin/oraclearch RMAN script execution is not successful. RMAN exited with return code '1'.
    01/05/10 16:48:16 nsrd: runq: NSR group OracleArch exited with return code 1.
    01/05/10 16:48:16 nsrmmdbd: Starting compression of media database
    01/05/10 16:48:16 nsrmmdbd: Finished compression of media database
    01/05/10 16:48:16 nsrd: index notice: nsrim has finished cross checking the media db
    Blat v2.4 w/GSS encryption (build : Jan 15 2005 08:32:11)
    Sending d:\1.txt to weidc@aspire-tech.com, mon724@aspire-tech.com, jxsupport@aspire-tech.com
    Subject: LEGATO_warnning!!
    Login name is ss_mon@aspire-tech.com
    Try number 1 of 3.
    01/05/10 16:48:54 nsrd: media warning: \\.\Tape0 reading: More data is available.
    01/05/10 16:48:54 nsrd: media warning: \\.\Tape0 reading: More data is available.
    01/05/10 16:48:54 nsrd: media warning: \\.\Tape0 reading: More data is available.
    01/05/10 16:48:54 nsrd: media warning: \\.\Tape0 reading: More data is available.
    01/05/10 16:49:09 nsrd: media info: verification of volume "000700L2", volid 3987777691 succeeded.
    01/05/10 16:49:09 nsrd: write completion notice: Writing to volume 000700L2 complete
In this log we see "RMAN script execution is not successful. RMAN exited with return code '1'."
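Spotting these lines by eye in a long daemon.log is easy to get wrong, so a small script can help. The following is a minimal sketch, assuming the log lines keep the wording shown above; the function names `summarize_savegroup` and `rman_return_codes` are hypothetical helpers written for this post, not part of Legato.

```python
import re

# Matches the "savegroup alert" summary line as it appears in the log above.
ALERT_RE = re.compile(
    r"savegroup alert: (?P<group>\S+) completed, total (?P<total>\d+) client\(s\), "
    r".*?(?P<failed>\d+) Failed, (?P<succeeded>\d+) Succeeded\."
)
# Matches the return code that nsrnmo reports for the RMAN script.
RMAN_RC_RE = re.compile(r"RMAN exited with return code '(\d+)'")


def summarize_savegroup(log_text):
    """Return one dict per 'savegroup alert' summary line in log_text."""
    summaries = []
    for line in log_text.splitlines():
        match = ALERT_RE.search(line)
        if match:
            summaries.append({
                "group": match.group("group"),
                "total": int(match.group("total")),
                "failed": int(match.group("failed")),
                "succeeded": int(match.group("succeeded")),
            })
    return summaries


def rman_return_codes(log_text):
    """Return every RMAN return code reported in log_text, in order."""
    return [int(code) for code in RMAN_RC_RE.findall(log_text)]
```

A group with a non-zero `failed` count together with a non-zero RMAN return code points at the RMAN script on the client rather than at the tape or the index write, which is what sent us to the client-side log next.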
So we went to the backup client, that is, the DB host, to look at the backup log, /nsr/applogs/msglog.log:
    connected to recovery catalog database
    connected to target database: JXMISC (DBID=2525447544)
    allocated channel: t1
    channel t1: sid=486 devtype=SBT_TAPE
    channel t1: NMO v4.1.0.0
    sent command to channel: t1
    allocated channel: t2
    channel t2: sid=434 devtype=SBT_TAPE
    channel t2: NMO v4.1.0.0
    sent command to channel: t2
    "msglog.log" [Incomplete last line] 658103 lines, 35000290 characters
    13> release channel t1;
    14> release channel t2;
    15> }
    16>
    connected to recovery catalog database
    connected to target database: JXMISC (DBID=2525447544)
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of allocate command at 01/05/2010 16:46:45
    RMAN-03014: implicit resync of recovery catalog failed
    RMAN-03009: failure of partial resync command on default channel at 01/05/2010 16:46:45
    ORA-01536: space quota exceeded for tablespace 'RMAN_DATA'
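Since msglog.log here is more than 650,000 lines long, it can be easier to pull out just the last RMAN error stack than to page through the file. Below is a minimal sketch, assuming each stack is introduced by the RMAN-00569 banner as in the excerpt above; `last_error_stack` and `ora_codes` are hypothetical helper names introduced for illustration.

```python
import re

STACK_START = "RMAN-00569"  # banner line that opens an RMAN error stack
BANNER_RULE = "RMAN-00571"  # the "=====" rule lines around the banner
ERROR_RE = re.compile(r"^(?:RMAN|ORA)-\d{5}: ")


def last_error_stack(lines):
    """Return the RMAN-/ORA- message lines of the last error stack in lines."""
    stack = []
    in_stack = False
    for raw in lines:
        line = raw.strip()
        if line.startswith(STACK_START):
            # A new stack begins: forget any earlier one.
            stack = []
            in_stack = True
        elif in_stack and ERROR_RE.match(line):
            if not line.startswith(BANNER_RULE):
                stack.append(line)
        elif in_stack and line:
            # Any other non-blank line ends the current stack.
            in_stack = False
    return stack


def ora_codes(stack):
    """Return just the ORA-nnnnn codes from an error stack."""
    return [line.split(":", 1)[0] for line in stack if line.startswith("ORA-")]
```

Because the function only iterates over its argument, an open file object can be passed in directly so that the whole log is never held in memory. On the excerpt above, the stack ends with ORA-01536, which is the line that matters.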
It turned out that, while resyncing the catalog, the rman user in the catalog database had hit its tablespace quota. We fixed it as follows:
    SQL> grant unlimited tablespace to rman;

    Grant succeeded.

    SQL>
    Status    Name       Type       Extent Man  Total Size (M  Used (M)  Free (M)  Used %
    --------- ---------- ---------- ----------  -------------  --------  --------  ------
    ONLINE    SYSTEM     PERMANENT  LOCAL             500.000   192.875   307.125  38. 8
    ONLINE    RMAN_DATA  PERMANENT  LOCAL             500.000    83.000   417.000  16. 0
    ONLINE    TEMP       TEMPORARY  LOCAL             200.000     6.000   194.000  3. 0
    ONLINE    UNDOTBS1   UNDO       LOCAL             200.000     1.500   198.500  0. 5
    ONLINE    TOOLS      PERMANENT  LOCAL              10.000      .063     9.938  0. 3
    ONLINE    INDX       PERMANENT  LOCAL              25.000      .063    24.938  0. 5
    ONLINE    USERS      PERMANENT  LOCAL              25.000      .063    24.938  0. 5

    7 rows selected.

    SQL>
We started the backup again, and this time it succeeded.