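Recently the backups for one province ran into trouble: when restoring a Legato backup, the index could not be found, and it still could not be found after a scanner pass. A case has been opened with EMC and is still being worked on. As a precaution, we ran a restore test in every province on the network that uses Legato for backups. Below is a summary of the problems we hit during those tests.
Most production machines currently run HP Serviceguard: the primary node runs production, while the standby node has the Oracle software installed but sits in standby. The test consisted of restoring the spfile on the standby node. The rough steps are: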
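1. In Legato, right-click the group and look at the save set to find the path of the backup script, usually under $ORACLE_HOME/bin.
2. In $ORACLE_HOME/dbs, mv the existing pfile out of the way.
3. rman target / catalog rman/rman@xxrman (log in with the credentials and connect method found in the script from step 1).
4. Run startup nomount.
5. Specify the incarnation.
6. Restore the spfile: adapt the script from step 1 so that it allocates only one channel, remove the backup part, and replace it with restore spfile. There are a few caveats here, covered further down.
7. Check and clean up.
7.1. Check that an spfile has been created under dbs.
7.2.1. Delete the restored spfile.
7.2.2. mv initgzmisc.ora.bak back to its original name.
7.2.3. Shut down the test database.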
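The file shuffling in steps 2, 7.1 and 7.2 is easy to get wrong by hand, so here is a minimal sketch of it in POSIX shell. The function names `stash_pfile`, `spfile_restored` and `unstash_pfile` are made up for illustration, and the `init<SID>.ora` / `spfile<SID>.ora` naming is assumed from the file names that appear in this post; adjust both to your environment.

```shell
#!/bin/sh
# Hypothetical helpers; the first argument is the dbs directory
# (normally $ORACLE_HOME/dbs), the second is the SID.

# Step 2: move the existing pfile aside before startup nomount.
stash_pfile() {
    dbs_dir=$1; sid=$2
    pfile="$dbs_dir/init$sid.ora"
    [ -f "$pfile" ] || { echo "no pfile at $pfile" >&2; return 1; }
    mv "$pfile" "$pfile.bak"
}

# Step 7.1: succeed only if the restore produced an spfile.
spfile_restored() {
    [ -f "$1/spfile$2.ora" ]
}

# Steps 7.2.1 and 7.2.2: delete the restored spfile, put the pfile back.
unstash_pfile() {
    dbs_dir=$1; sid=$2
    rm -f "$dbs_dir/spfile$sid.ora"
    mv "$dbs_dir/init$sid.ora.bak" "$dbs_dir/init$sid.ora"
}
```

For example, `stash_pfile "$ORACLE_HOME/dbs" gzmisc` before the test and `unstash_pfile "$ORACLE_HOME/dbs" gzmisc` afterwards. Shutting down the test instance (step 7.2.3) is deliberately left out of the sketch.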
Problem 1: Although the backup script only sets NSR_CLIENT and not NSR_SERVER, the restore test needs NSR_SERVER set as well. Otherwise it fails with:
    RMAN> run {
    2> allocate channel t1 type 'sbt_tape' parms 'ENV=(NSR_CLIENT=gx_db)';
    restore spfile;
    release channel t1;
    }3> 4> 5> 6>
    allocated channel: t1
    channel t1: sid=9 devtype=SBT_TAPE
    channel t1: NMO v4.1.0.0

    Starting restore at 30-DEC-09

    channel t1: starting datafile backupset restore
    channel t1: restoring SPFILE
    output filename=/oracle/app/oracle/product/9.2.0/dbs/spfilegxmisc.ora
    released channel: t1
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of restore command at 12/30/2009 14:04:00
    ORA-19507: failed to retrieve sequential file, handle="db_GXMISC_t706434758_s43456_p1", parms=""
    ORA-27029: skgfrtrv: sbtrestore returned error
    ORA-19511: Error received from media manager layer, error text:
    nwora_open_restore: Could not locate the NWORA save file 'db_GXMISC_t706434758_s43456_p1' on server 'GX_BAK01'.
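To avoid forgetting NSR_SERVER, the restore block can be generated rather than hand-edited. The following is only a sketch: `make_restore_cmd` is a hypothetical helper name, and the `ENV=(NSR_CLIENT=...,NSR_SERVER=...)` layout is copied from the run block shown under problem 4 in this post.

```shell
#!/bin/sh
# Print an RMAN restore block that names both the NetWorker client
# and the NetWorker server.
# Usage: make_restore_cmd CLIENT SERVER
make_restore_cmd() {
    client=$1; server=$2
    cat <<EOF
run {
  allocate channel t1 type 'sbt_tape'
    parms 'ENV=(NSR_CLIENT=$client,NSR_SERVER=$server)';
  restore spfile;
  release channel t1;
}
EOF
}
```

A possible use is `make_restore_cmd gx_db gx_bak01 > restore_spfile.rcv`, then pasting the result into the RMAN session from step 3. The file name is arbitrary, and the server name here is taken from the error messages in this post, so substitute your own.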
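Problem 2: When restoring on host 2, NSR_CLIENT in the script must still be the name used at backup time. For example, if the backup script used xx_db01, then when restoring on xx_db02 you must not write NSR_CLIENT=xx_db02; write xx_db01. Otherwise it fails with: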
    RMAN> run {
    2> allocate channel t1 type 'sbt_tape' parms 'ENV=(NSR_CLIENT=sn_dc02)';
    restore spfile;
    release channel t1;
    } 3> 4> 5> 6>
    allocated channel: t1
    channel t1: sid=9 devtype=SBT_TAPE
    channel t1: NMO v4.1.0.0

    Starting restore at 30-DEC-09

    channel t1: starting datafile backupset restore
    channel t1: restoring SPFILE
    output filename=/oracle/app/oracle/product/9.2.0s/spfilesnmisc.ora
    released channel: t1
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of restore command at 12/30/2009 15:10:31
    ORA-19507: failed to retrieve sequential file, handle="db_SNMISC_t706845618_s19848_p1", parms=""
    ORA-27029: skgfrtrv: sbtrestore returned error
    ORA-19511: Error received from media manager layer, error text:
    nwora_open_restore: Could not locate the NWORA save file 'db_SNMISC_t706845618_s19848_p1' on server 'sn-bak01'.
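Since the restore has to reuse the client name from backup time, it can help to read that name straight out of the backup script instead of typing it from memory. A minimal sketch, assuming the script sets it in the `NSR_CLIENT=name` form seen in the sessions in this post (`backup_client_name` is a hypothetical helper):

```shell
#!/bin/sh
# Print each distinct NSR_CLIENT value found in a backup script.
# The value is taken to end at the next comma or closing parenthesis.
# Usage: backup_client_name /path/to/backup_script
backup_client_name() {
    sed -n 's/.*NSR_CLIENT=\([^,)]*\).*/\1/p' "$1" | sort -u
}
```

If this prints more than one name, the script uses different clients on different channels and needs a closer look before reuse.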
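Problem 3: If the hosts have been security-hardened so that rlogin, rexec and similar operations are not allowed, and the NSR_CLIENT configured in the script on xx_db01 is not the floating IP, the restore also fails with the error below. In that case, modify the backup script so that NSR_CLIENT is the hostname of the floating IP.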
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of restore command at 12/31/2009 11:39:02
    ORA-19507: failed to retrieve sequential file, handle="db_GXMISC_t707018793_s43636_p1", parms=""
    ORA-27029: skgfrtrv: sbtrestore returned error
    ORA-19511: Error received from media manager layer, error text:
    nwora_open_restore: The NW authentication for client 'gx_db01' was refused by server 'gx_bak01' because 'User oracle on computer gx_db02 is not on gx_db01's remote access list'.

    RMAN> exit
Problem 4: If the hosts file on the client is not set up correctly, the restore also fails.
The RMAN script reports:
    RMAN> run {
    2> allocate channel t1 type 'sbt_tape' parms 'ENV=(NSR_CLIENT=yn_db,NSR_SERVER=yn_bak)';
    restore spfile;
    release channel t1;
    } 3> 4> 5> 6>
    allocated channel: t1
    channel t1: sid=10 devtype=SBT_TAPE
    channel t1: NMO v4.1.0.0

    Starting restore at 05-JAN-10

    channel t1: starting datafile backupset restore
    channel t1: restoring SPFILE
    output filename=/oracle/app/oracle/product/9.2.0/dbs/spfileynmisc.ora
    okreleased channel: t1
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of restore command at 01/05/2010 11:26:54
    ORA-19507: failed to retrieve sequential file, handle="db_YNMISC_t707467406_s16566_p1", parms=""
    ORA-27029: skgfrtrv: sbtrestore returned error
    ORA-19511: Error received from media manager layer, error text:
    nwora_open_restore: Could not get mmd connection.

    RMAN>
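At this point the Legato console showed that the restore was using a device named rd=yn_sdb. Looking it up on the server, we found that yn_sdb is an audit database, not our current production database.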
We also looked at the Legato log on the client:
    oracle@yn_dc02:/nsr/applogs>ll
    total 16
    -rw-r--r-- 1 oracle dba 849 Jan 5 11:26 nmo.messages
    oracle@yn_dc02:/nsr/applogs>more *
    SBT-7878 12/30/09 15:51:57 nwora_restore_mmdconn: could not set up mm binding: Unknown host
    SBT-7878 12/30/09 15:51:57 nwora_open_restore: Could not get mmd connection.
    SBT-3980 12/31/09 10:57:03 nwora_restore_mmdconn: could not set up mm binding: Unknown host
    SBT-3980 12/31/09 10:57:03 nwora_open_restore: Could not get mmd connection.
    SBT-5655 12/31/09 12:10:38 nwora_restore_mmdconn: could not set up mm binding: Unknown host
    SBT-5655 12/31/09 12:10:38 nwora_open_restore: Could not get mmd connection.
    SBT-12983 12/31/09 17:26:24 nwora_restore_mmdconn: could not set up mm binding: Unknown host
    SBT-12983 12/31/09 17:26:24 nwora_open_restore: Could not get mmd connection.
    SBT-18279 01/05/10 11:26:54 nwora_restore_mmdconn: could not set up mm binding: Unknown host
    SBT-18279 01/05/10 11:26:54 nwora_open_restore: Could not get mmd connection.
    oracle@yn_dc02:/nsr/applogs>
The log reports an unknown host. We checked the hosts file on that client and found that it indeed had no entry for yn_sdb.
We added yn_sdb to the hosts file on the client doing the restore and tested the restore again; this time it succeeded.
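Both checks from this problem can be run before a restore test. The sketch below is illustrative only: `count_unknown_host` and `host_in_hosts` are hypothetical helper names, the log path comes from the session above, and the hosts file is assumed to use the usual `address name [alias...]` layout.

```shell
#!/bin/sh
# Count "Unknown host" lines in an NMO log such as
# /nsr/applogs/nmo.messages.
count_unknown_host() {
    grep -c 'Unknown host' "$1"
}

# Succeed if NAME appears as a hostname or alias in a hosts-format file.
# Usage: host_in_hosts NAME [FILE]   (FILE defaults to /etc/hosts)
host_in_hosts() {
    name=$1; file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }   # strip comments so commented-out entries do not count
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}
```

For example, `host_in_hosts yn_sdb || echo "yn_sdb missing from /etc/hosts"` on the client that performs the restore. Note that this only inspects the hosts file; it does not check DNS or any other name service the machine may be configured to use.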
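We also checked the configuration on the server and found that the remote device (rd=) configuration was wrong: the first entry for dc01 was missing a trailing letter b.
After re-running jbconfig, we tested restoring the control file again and found that yn_sdb was no longer being used.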