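In a RAC environment, the clock difference between nodes must stay within a limit. As a rule of thumb, once it exceeds 30 seconds a node is likely to reboot. We therefore need an NTP time server to keep the RAC nodes synchronized. Here we use a Windows machine (192.168.1.189) as the NTP server, and the RAC nodes on two virtual machines (192.168.1.131 […]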
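As a rough illustration of the 30-second rule of thumb, here is a minimal Python sketch that takes one clock reading per node and reports the largest pairwise difference. The node labels and timestamps are hypothetical; how the readings are collected (for example by running `date` on each node) is not covered here.

```python
from itertools import combinations

# Rule-of-thumb limit quoted in the post; not an official Oracle constant.
MAX_SKEW_SECONDS = 30


def max_clock_skew(node_times):
    """Return (largest pairwise difference in seconds, pair of node names).

    node_times maps a node name to its clock reading in epoch seconds.
    Returns (0.0, None) when no two nodes differ.
    """
    worst = (0.0, None)
    for (a, ta), (b, tb) in combinations(node_times.items(), 2):
        skew = abs(ta - tb)
        if skew > worst[0]:
            worst = (skew, (a, b))
    return worst


# Hypothetical readings taken from two RAC nodes at "the same" moment.
sample = {"node1": 1245000000.0, "node2": 1245000042.5}
skew, pair = max_clock_skew(sample)
status = "exceeds limit" if skew > MAX_SKEW_SECONDS else "ok"
print(pair, skew, status)
```

A check like this only detects drift; keeping the clocks aligned is still the job of the NTP configuration.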
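Linking the Legato library on the Itanium platform
A province recently deployed Oracle on the Itanium platform, with Legato as the backup software. We first installed networker.pkg and nmo.pkg on the Itanium machine and configured the client, but the backup failed. /nsr/applogs/msglog.log showed: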
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03009: failure of allocate command on t1 channel at 06/29/2009 11:55:45
    ORA-19554: error allocating device, device type: SBT_TAPE, device name:
    ORA-27211: Failed to load Media Management Library
    Additional information: 25
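As for this error, I have already […]
Summary of RAC installation errors and fixes
I. A RAC installation consists of the following main steps:
1. Install the operating system.
2. Configure time synchronization, e.g. NTP.
3. Create the oracle user and configure RSA and DSA keys.
4. Configure the hosts file.
5. Configure hangcheck-timer.
6. Configure shared storage (OCR and voting disk).
7. Install Clusterware (CRS).
8. Configure as […]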
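Supported cross-platform DATAGUARD combinations
Oracle allows Data Guard to be set up within the same Oracle platform, and the operating system versions should match as closely as possible. The Oracle platform can be checked with the following statement (in 9i, v$database has no PLATFORM_ID or PLATFORM_NAME columns): […]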
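Listener connections fail because of insufficient swap space
I was woken up a little after 4 a.m. and told that the listener of a production database was down. The database could not be reached and returned the following error: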
    $ sqlplus user/pwd@sid

    SQL*Plus: Release 9.2.0.6.0 - Production on Fri Jun 12 04:58:05 2009

    Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

    ERROR:
    ORA-12500: TNS:listener failed to start a dedicated server process
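After logging in, I found the listener process still running, and lsnrctl status also looked normal. The listener log showed a large number of refused connections: […]
RAC communication fails after modifying hosts
Today I received an alert that one node of a database in a certain province was down. After a restart it only reached the started state and the database could not be opened. After logging in, I saw the following in the alert log: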
    Mon Jun 15 15:38:28 2009
    Errors in file /oracle/app/oracle/admin/fjmisc/udump/fjmisc2_ora_26950.trc:
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-27504: IPC error creating OSD context
    ORA-27300: OS system dependent operation:gethostbyname failed with status: 3
    ORA-27301: OS failure message: No such process
    ORA-27302: failure occurred at: sskgxpmyip2
    ORA-27303: additional information: nodename FJ_DB02
    Mon Jun 15 15:41:19 2009
    Errors in file /oracle/app/oracle/admin/fjmisc/udump/fjmisc2_ora_27062.trc:
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-27504: IPC error creating OSD context
    ORA-27300: OS system dependent operation:gethostbyname failed with status: 3
    ORA-27301: OS failure message: No such process
    ORA-27302: failure occurred at: sskgxpmyip2
    ORA-27303: additional information: nodename FJ_DB02
    Mon Jun 15 15:42:37 2009
    Errors in file /oracle/app/oracle/admin/fjmisc/udump/fjmisc2_ora_27147.trc:
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-27504: IPC error creating OSD context
    ORA-27300: OS system dependent operation:gethostbyname failed with status: 3
    ORA-27301: OS failure message: No such process
    ORA-27302: failure occurred at: sskgxpmyip2
    ORA-27303: additional information: nodename FJ_DB02
    Mon Jun 15 15:46:19 2009
    Errors in file /oracle/app/oracle/admin/fjmisc/udump/fjmisc2_ora_27362.trc:
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-27504: IPC error creating OSD context
My first reaction was that "gethostbyname failed" pointed to a hostname resolution problem. But p […]
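A quick way to test that first suspicion is to ask the OS resolver for each node name directly. The sketch below is only an illustration: the function name `unresolved_hosts` is made up, and it uses Python's standard `socket.gethostbyname`, which goes through the same OS-level lookup (hosts file, DNS) but is not the code path Oracle itself uses.

```python
import socket


def unresolved_hosts(names, resolver=socket.gethostbyname):
    """Return the names that the resolver cannot turn into an address.

    The resolver is injectable so the check can be tested without
    touching the real hosts file or DNS.
    """
    failed = []
    for name in names:
        try:
            resolver(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            failed.append(name)
    return failed


# An IP literal needs no lookup, so this prints an empty list.
print(unresolved_hosts(["127.0.0.1"]))
```

On the affected node one would pass the node name reported in ORA-27303 (FJ_DB02 in the log above); a non-empty result would point at the hosts file or DNS rather than at Oracle.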
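Host security hardening breaks DP client distribution
After this year's 315 gala (the March 15 consumer rights broadcast) exposed irregular practices at a provincial China Mobile branch, the China Mobile group ran security checks on all of its local branches. After a series of security hardening measures, we found that some operations that used to run smoothly were affected. The problem arose as follows: the backup software in this province is HP's DP. DP can back up databases as well as file systems (a feature that I think most backup software […]
Using hanganalyze to resolve a row cache lock
Today a colleague in a certain province reported that a script splitting a partition had been unresponsive for a long time. I logged in, ran the split script by hand, and found that it did indeed hang: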
    SQL> l
      1  ALTER TABLE A_PDA_SP_STAT SPLIT PARTITION P_MAX AT (20090609)
      2   INTO (PARTITION P_20090608 TABLESPACE TS_DATA_A
      3*  , PARTITION P_MAX TABLESPACE TS_DATA_A)
    SQL>
Checking the wait event of that session:
    EVENT                                  P1         P2         P3
    ------------------------------ ---------- ---------- ----------
    row cache lock                          8          0          5
[…]
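For readers unfamiliar with this wait event: for `row cache lock`, P1 is the dictionary cache id, P2 the mode held and P3 the mode requested. The following Python sketch, with made-up helper names, parses a row like the one above and builds a lookup query against `v$rowcache` to find out which dictionary cache the id refers to. It is an aid for reading the output, not the diagnostic procedure of the original post.

```python
def parse_wait_row(row):
    """Split one 'EVENT P1 P2 P3' output row into named fields."""
    parts = row.split()
    p1, p2, p3 = (int(x) for x in parts[-3:])
    return {
        "event": " ".join(parts[:-3]),
        "cache_id": p1,        # P1 for 'row cache lock'
        "mode_held": p2,       # P2
        "mode_requested": p3,  # P3
    }


def rowcache_lookup_sql(cache_id):
    """Build a query that names the dictionary cache behind a cache id."""
    return (
        "SELECT DISTINCT cache#, parameter FROM v$rowcache "
        f"WHERE cache# = {int(cache_id)}"
    )


wait = parse_wait_row("row cache lock                          8          0          5")
print(wait)
print(rowcache_lookup_sql(wait["cache_id"]))
```

The generated statement would then be run in SQL*Plus on the affected instance.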
Handling corrupt blocks found during an RMAN backup
Today I received an alert that a backup in a certain province had failed:
    ……
    四月 16 16:30:19 ur_bak01: NetWorker savegroup: (alert) urmdborafull completed, total 2 client(s), 0 Hostname(s) Unresolved, 1 Failed, 1 Succeeded. (ur_mdb01 Failed)
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03009: failure of backup command on t2 channel at 03/20/2009 13:49:09
    ORA-19566: exceeded limit of 0 corrupt blocks for file /dev/vg_mdb02/rdata_2g_050
    ORA-000060: Deadlock detected. More info in file /oracle/app/oracle/admin/uradt/udump/uradt_ora_3035.trc.
    ***
    Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
    Fractured block found during backing up datafile
    Data in bad block -
     type: 0 format: 0 rdba: 0x00000000
     last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
     consistency value in tail: 0x00000000
     check value in block header: 0x0, block checksum disabled
     spare1: 0x0, spare2: 0x0, spare3: 0x0
    ***
    Reread of blocknum=251107, file=/dev/vg_mdb02/rdata_2g_050. found same corrupt data
    Thu Apr 16 16:31:04 2009
    ……
A dbv check found at least 45 corrupt blocks:
    [oracle@ur_mdb01 /oracle$]dbv file=/dev/vg_mdb02/rdata_2g_050 BLOCKSIZE=8192

    DBVERIFY: Release 9.2.0.6.0 - Production on Mon Apr 20 09:55:07 2009

    Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

    DBVERIFY - Verification starting : FILE = /dev/vg_mdb02/rdata_2g_050
    Page 251107 is influx - most likely media corrupt
    ***
    Corrupt block relative dba: 0x1a43d4e3 (file 105, block 251107)
    Fractured block found during dbv:
    Data in bad block -
     type: 0 format: 0 rdba: 0x00000000
     last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
     consistency value in tail: 0x00000000
     check value in block header: 0x0, block checksum disabled
     spare1: 0x0, spare2: 0x0, spare3: 0x0
    ***
    Page 251108 is marked corrupt
    ***
    Corrupt block relative dba: 0x1a43d4e4 (file 105, block 251108)
    Bad header found during dbv:
    Data in bad block -
     type: 181 format: 6 rdba: 0x00000000
     last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
     consistency value in tail: 0x00000000
     check value in block header: 0x0, block checksum disabled
     spare1: 0x7, spare2: 0xc, spare3: 0x0
    ***
    ……
    Corrupt block relative dba: 0x1a43d56f (file 105, block 251247)
    Bad header found during dbv:
    Data in bad block -
     type: 65 format: 5 rdba: 0x527002c2
     last change scn: 0x3131.02063033 seq: 0x30 flg: 0x31
     consistency value in tail: 0x3635032d
     check value in block header: 0x180, block checksum disabled
     spare1: 0x50, spare2: 0x72, spare3: 0x430
    ***

    DBVERIFY - Verification complete

    Total Pages Examined         : 262016
    Total Pages Processed (Data) : 60240
    Total Pages Failing   (Data) : 0
    Total Pages Processed (Index): 0
    Total Pages Failing   (Index): 0
    Total Pages Processed (Other): 568
    Total Pages Processed (Seg)  : 0
    Total Pages Failing   (Seg)  : 0
    Total Pages Empty            : 201163
    Total Pages Marked Corrupt   : 45
    Total Pages Influx           : 11
    Highest block SCN            : 10816042273 (2.2226107681)
On inspection, none of these corrupt blocks held any data objects: […]
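The "relative dba" values in the output above can be decoded by hand: the upper 10 bits of the 32-bit value are the relative file number and the lower 22 bits are the block number. Here is a small Python sketch (the function name `decode_rdba` is illustrative) that reproduces the file and block numbers printed by dbv:

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (file#, block#).

    Upper 10 bits: relative file number; lower 22 bits: block number.
    """
    return rdba >> 22, rdba & 0x3FFFFF


# Values taken from the dbv output above.
for value in (0x1A43D4E3, 0x1A43D4E4, 0x1A43D56F):
    file_no, block_no = decode_rdba(value)
    print(f"0x{value:08x} -> file {file_no}, block {block_no}")
```

The decoded numbers match the "(file 105, block …)" annotations in the log, which is a handy sanity check before looking up which segment, if any, owns a given block.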
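Recovering after opatch is interrupted abnormally
Today, while applying a patch, opatch had already reached the last patch when, in a careless moment, I misread the telnet window and accidentally pressed Ctrl+C, interrupting opatch. How embarrassing, after a clean record until now! Running opatch apply again produced this error: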
    $ opatch apply 6163771
    Invoking OPatch 10.2.0.4.6

    Oracle Interim Patch Installer version 10.2.0.4.6
    Copyright (c) 2009, Oracle Corporation.  All rights reserved.

    Oracle Home       : /oracle/app/oracle/product/10.2.0/db_1
    Central Inventory : /oracle/app/oracle/oraInventory
       from           : /var/opt/oracle/oraInst.loc
    OPatch version    : 10.2.0.4.6
    OUI version       : 10.2.0.4.0
    OUI location      : /oracle/app/oracle/product/10.2.0/db_1/oui
    Log file location : /oracle/app/oracle/product/10.2.0/db_1/cfgtoollogs/opatch/opatch2009-04-08_16-29-59PM.log

    Patch history file: /oracle/app/oracle/product/10.2.0/db_1/cfgtoollogs/opatch/opatch_history.txt

    ApplySession applying interim patch '6163771' to OH '/oracle/app/oracle/product/10.2.0/db_1'

    Running prerequisite checks...

    OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

    Please shutdown Oracle instances running out of this ORACLE_HOME on the local system.
    (Oracle Home = '/oracle/app/oracle/product/10.2.0/db_1')

    Is the local system ready for patching? [y|n]
    Y
    User Responded with: Y
    Backing up files and inventory (not for auto-rollback) for the Oracle Home
    Backing up files affected by the patch '6163771' for restore. This might take a while...
    Backing up files affected by the patch '6163771' for rollback. This might take a while...

    Patching component oracle.rdbms, 10.2.0.4.0...
    Updating archive file "/oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a"  with "lib/libserver10.a/kfc.o"
    Updating archive file "/oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a"  with "lib/libserver10.a/kfcb.o"
    Updating archive file "/oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a"  with "lib/libserver10.a/kfcl.o"
    Updating archive file "/oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a"  with "lib/libserver10.a/kfr.o"
    Updating archive file "/oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a"  with "lib/libserver10.a/kfrb.o"

    The following actions have failed:
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfc.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfcb.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfcl.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfr.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfrb.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''

    Do you want to proceed? [y|n]
    Y
    User Responded with: Y
    Running make for target ioracle
    Make failed to invoke "/usr/ccs/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/oracle/app/oracle/product/10.2.0/db_1"....'ld: I/O error, file "/oracle/app/oracle/product/10.2.0/db_1/lib//libserver10.a":
    Fatal error. Stop.
    '

    The following make actions have failed :
    Re-link fails on target "ioracle".

    Do you want to proceed? [y|n]
    Y
    User Responded with: Y
    ApplySession adding interim patch '6163771' to inventory

    Verifying the update...
    Inventory check OK: Patch ID 6163771 is registered in Oracle Home inventory with proper meta-data.
    ApplySession failed: ApplySession failed in system modification phase... 'Verification of patch failed: Error verification failed: ar: kfc.o not found
    '
    OPatch will attempt to restore the system...
    Restoring the Oracle Home...
    Checking if OPatch needs to invoke 'make' to restore some binaries...
    Make failed to invoke "/usr/ccs/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/oracle/app/oracle/product/10.2.0/db_1"....'ld: I/O error, file "/oracle/app/oracle/product/10.2.0/db_1/lib//libserver10.a":
    Fatal error. Stop.
    '

    --------------------------------------------------------------------------------
    Failed to run make commands. They are stored in file '/oracle/app/oracle/product/10.2.0/db_1/.patch_storage/6163771_Jun_19_2008_17_41_45/make.txt'
    Invoke these commands manually to restore the binaries in the Oracle Home.

    OPatch failed to restore OH '/oracle/app/oracle/product/10.2.0/db_1'. Consult OPatch document to restore the home manually before proceeding.
    --------------------------------------------------------------------------------
    The following warnings have occurred during OPatch execution:
    1) OUI-67124:Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfc.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfcb.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfcl.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfr.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    Archive not applied /arch/ora_patch/patch_ia/7409356/6163771/files/lib/libserver10.a/kfrb.o to /oracle/app/oracle/product/10.2.0/db_1/lib/libserver10.a... ''
    2) OUI-67200:Make failed to invoke "/usr/ccs/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/oracle/app/oracle/product/10.2.0/db_1"....'ld: I/O error, file "/oracle/app/oracle/product/10.2.0/db_1/lib//libserver10.a":
    Fatal error. Stop.
    '
    3) OUI-67124:Re-link fails on target "ioracle".
    4) OUI-67200:Make failed to invoke "/usr/ccs/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/oracle/app/oracle/product/10.2.0/db_1"....'ld: I/O error, file "/oracle/app/oracle/product/10.2.0/db_1/lib//libserver10.a":
    Fatal error. Stop.
    '
    --------------------------------------------------------------------------------
    OPatch failed with error code 115
    $
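Using […]
Memory leak in the listener
Today I received an alert email saying that a database in a certain province could not be logged in to. The alert log contained the following errors: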
    Tue Mar 31 16:38:11 2009
    Errors in file /oracle/app/oracle/admin/zjfs/bdump/zjfs_ora_22423.trc:
    ORA-27102: out of memory
    HPUX-ia64 Error: 12: Not enough space
    Additional information: 103
    Additional information: 524288
Logging in to the database host showed less than 10% of memory free. Checking with top:
    System: zj-db01                                       Wed Apr  1 14:10:47 2009
    Load averages: 0.12, 0.18, 0.28
    387 processes: 362 sleeping, 25 running
    Cpu states:
    CPU   LOAD   USER   NICE    SYS   IDLE  BLOCK  SWAIT   INTR   SSYS
     0    0.21   8.4%   0.0%   7.0%  84.7%   0.0%   0.0%   0.0%   0.0%
     1    0.09   3.4%   0.0%   2.4%  94.2%   0.0%   0.0%   0.0%   0.0%
     2    0.11   3.4%   0.0%   0.0%  96.6%   0.0%   0.0%   0.0%   0.0%
     3    0.10   3.2%   0.0%   1.0%  95.8%   0.0%   0.0%   0.0%   0.0%
    ---   ----  -----  -----  -----  -----  -----  -----  -----  -----
    avg   0.12   4.6%   0.0%   2.6%  92.8%   0.0%   0.0%   0.0%   0.0%

    Memory: 4456068K (2521320K) real, 6115756K (3135960K) virtual, 391416K free  Page# 1/15

    CPU TTY     PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND
     0   ?     2307 oracle   178 20  3015M 24612K sleep 4383:34 15.11 15.08 oraclezjfs
     2   ?     2228 oracle   178 20  2986M  9624K sleep 3580:19  2.94  2.93 oraclezjfs
     1   ?     2309 oracle   178 20  2986M  5536K sleep  541:29  2.92  2.91 oraclezjfs
     1   ?     4400 oracle   154 20  1375M  1340M sleep 11224:04  2.04  2.04 tnslsnr
     1   ?     2528 oracle   178 20  2986M  5612K sleep   15:05  1.32  1.32 oraclezjfs
     1   ?     4380 oracle   178 20  3009M 35136K sleep  268:00  1.16  1.16 ora_lgwr_zjfs
     3   ?     4378 oracle   178 20  3009M 35152K sleep  244:54  0.74  0.74 ora_dbw0_zjfs
     1   ?       54 root     152 20  3312K  2944K run    136:01  0.45  0.45 vxfsd
     3   ?     1439 root     152 20   207M 84724K run    103:21  0.40  0.40 cimprovagt
     3   ?     1442 root     152 20 38312K  2820K run   2725:46  0.36  0.36 cimprovagt
     3   ?    12064 oracle   178 20  2986M  5264K sleep    0:20  0.32  0.32 oraclezjfs
     2   ?     1436 root     152 20 56792K 13976K run    180:41  0.23  0.23 cimserver
     3 pts/ta 21448 oracle   168 20 10836K  1284K sleep    0:00  0.22  0.22 top
     2   ?     2381 oracle   178 20  2987M  5944K sleep  419:20  0.18  0.18 oraclezjfs
     1   ?       38 root     152 20   432K   384K run     60:21  0.16  0.16 schedcpu
     2 pts/tc 21573 oracle   178 20 10964K  1412K run      0:00  0.27  0.15 top
     2   ?    21551 oracle   178 20  2986M  5260K sleep    0:00  0.16  0.14 oraclezjfs
     3   ?     1793 root     152 20   113M 17208K run      6:09  0.14  0.14 vxsvc
     0   ?       20 root     191 20   360K   320K run     33:43  0.13  0.13 ksyncer_daemon
     2   ?     1429 root     152 20 25516K  5536K run      4:42  0.12  0.12 rpcd
     1   ?     4297 root     -27 20 46772K 38548K run     31:58  0.12  0.12 cmcld
     2   ?    21518 oracle   178 20  2986M  5260K sleep    0:00  0.13  0.12 oraclezjfs
     2   ?     1228 root     154 20  7812K   848K sleep  126:00  0.10  0.10 sendmail:
     2   ?       39 root     191 20   288K   256K run    305:44  0.08  0.08 pagezerod
     2   ?     1589 root     152 20 25072K  3992K run      1:20  0.08  0.08 swagentd
    $
    $
The listener turned out to be using a very large amount of memory […]
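When a top listing is long, the outlier is easier to spot by sorting on the RES column. The Python sketch below does that for lines in the format shown above; the helper names are made up, and the parser only handles the `K` and `M` suffixes that appear in this output.

```python
def to_kb(text):
    """Convert a top size field such as '1340M' or '24612K' to kilobytes."""
    units = {"K": 1, "M": 1024}  # only the suffixes seen in the listing
    return int(text[:-1]) * units[text[-1]]


def top_by_res(lines, n=3):
    """Return the n largest processes as (res_kb, pid, command) tuples."""
    procs = []
    for line in lines:
        fields = line.split(None, 12)
        # Process rows have 13 fields with a numeric PID in the third one;
        # headers and CPU-state rows are skipped.
        if len(fields) < 13 or not fields[2].isdigit():
            continue
        procs.append((to_kb(fields[7]), int(fields[2]), fields[12].strip()))
    return sorted(procs, reverse=True)[:n]


# Three rows copied from the top output above, plus the column header.
sample = [
    "CPU TTY     PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU COMMAND",
    " 0   ?     2307 oracle   178 20  3015M 24612K sleep 4383:34 15.11 15.08 oraclezjfs",
    " 1   ?     4400 oracle   154 20  1375M  1340M sleep 11224:04  2.04  2.04 tnslsnr",
    " 3   ?     1439 root     152 20   207M 84724K run    103:21  0.40  0.40 cimprovagt",
]
for res_kb, pid, command in top_by_res(sample):
    print(f"{command:12s} pid={pid:<6d} res={res_kb}K")
```

Sorted this way, tnslsnr (PID 4400) comes first with about 1340M resident, far above any other process in the listing.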
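Write errors while installing the database
Today, while installing a database, I got an error saying a file could not be written: At first I thought that, since the error occurred during the copy phase, the installation media might be at fault. Had something gone wrong during the FTP transfer? The transfer had been run in the background by an FTP script, and although its log showed no errors, I transferred the files once more just to be safe. But at 27% the installation failed again; although the write error was not for the same file, […]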