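One question tends to drag in more questions. Heh.
The original question was: with ASM, how do you move the OCR and the voting disks to new storage?
(I) First, Oracle's official recommendation is the note OCR / Vote disk Maintenance Operations: (ADD/REMOVE/REPLACE/MOVE) (Doc ID 428681.1)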
(1) OCR:
[root@rac2 Desktop]# ocrconfig -add +OCRVOT2
[root@rac2 Desktop]#
[root@rac2 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3772
         Available space (kbytes) :     258348
         ID                       : 1709296165
         Device/File Name         :    +OCRVOT
                                    Device/File integrity check succeeded
         Device/File Name         :   +OCRVOT2
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac2 Desktop]#
[root@rac2 Desktop]# ocrconfig -delete +OCRVOT
[root@rac2 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3772
         Available space (kbytes) :     258348
         ID                       : 1709296165
         Device/File Name         :   +OCRVOT2
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac2 Desktop]#
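Before running ocrconfig -delete, it may be worth confirming from the ocrcheck output that the old location is not the only one left. Below is a minimal sketch of such a check. The function names ocr_locations and safe_to_delete are hypothetical helpers made up for this post, not Oracle tools, and the parsing assumes the "Device/File Name : <location>" line format shown in the transcript above.

```shell
# Hypothetical helper (not an Oracle tool): print the OCR locations found
# in `ocrcheck` output read from stdin, one per line.
ocr_locations() {
    # Lines of interest look like: "Device/File Name : +OCRVOT"
    awk -F':' '/Device\/File Name/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Hypothetical helper: succeed only if location $1 is configured AND at
# least one other location exists, so deleting $1 keeps an OCR copy.
# Reads `ocrcheck` output from stdin.
safe_to_delete() {
    _locs=$(ocr_locations)
    # The target must appear as an exact, whole-line match.
    printf '%s\n' "$_locs" | grep -qxF -- "$1" || return 1
    # Count non-empty lines, i.e. configured locations.
    [ "$(printf '%s\n' "$_locs" | grep -c .)" -ge 2 ]
}
```

A possible use, as root, would be: ocrcheck | safe_to_delete +OCRVOT && ocrconfig -delete +OCRVOT. The guard only inspects text and does not check the integrity lines, so it complements reading the ocrcheck output rather than replacing it.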
Note 1: compatible.asm must be 11.1 or higher, otherwise ocrconfig -add fails with an error:
[root@rac2 Desktop]# ocrconfig -add +OCRVOT2
PROT-30: The Oracle Cluster Registry location to be added is not usable
PROC-8: Cannot perform cluster registry operation because one of the parameters is invalid.
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +OCRVOT2.255.1
ORA-15221: ASM operation requires compatible.asm of 11.1.0.0.0 or higher
ORA-06512: at line 4
[root@rac2 Desktop]#

SQL> l
  1* select name,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup
SQL> /

NAME                           COMPATIBILITY   DATABASE_COMPATIBILITY
------------------------------ --------------- ------------------------------------------------------------
ACFS                           11.2.0.0.0      10.1.0.0.0
DATA                           11.2.0.0.0      11.2.0.0.0
FRA                            11.2.0.0.0      11.2.0.0.0
OCRVOT2                        10.1.0.0.0      10.1.0.0.0
OCRVOT                         11.2.0.0.0      11.2.0.0.0

SQL>
SQL> alter diskgroup OCRVOT2 set attribute 'compatible.asm'='11.2';

Diskgroup altered.

SQL> c/asm/rdbms
  1* alter diskgroup OCRVOT2 set attribute 'compatible.rdbms'='11.2'
SQL> /

Diskgroup altered.

SQL> select name,COMPATIBILITY,DATABASE_COMPATIBILITY from v$asm_diskgroup;

NAME                           COMPATIBILITY   DATABASE_COMPATIBILITY
------------------------------ --------------- ------------------------------------------------------------
ACFS                           11.2.0.0.0      10.1.0.0.0
DATA                           11.2.0.0.0      11.2.0.0.0
FRA                            11.2.0.0.0      11.2.0.0.0
OCRVOT2                        11.2.0.0.0      11.2.0.0.0
OCRVOT                         11.2.0.0.0      11.2.0.0.0

SQL>
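The compatibility requirement can be checked up front instead of waiting for ORA-15221. The sketch below compares dotted version strings such as the COMPATIBILITY values above. Both version_ge and groups_below are hypothetical helper names invented here, and the input is assumed to be plain "name version" pairs that you have spooled from v$asm_diskgroup yourself.

```shell
# Hypothetical helper: succeed when dotted version $1 >= $2,
# comparing numerically field by field.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            # Missing fields count as 0, so "11.2" equals "11.2.0.0.0".
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }' </dev/null
}

# Hypothetical helper: read "diskgroup compatible.asm" pairs on stdin and
# print the diskgroups whose version is below the minimum given as $1.
groups_below() {
    while read -r name compat _rest; do
        [ -n "$name" ] || continue
        version_ge "$compat" "$1" || printf '%s\n' "$name"
    done
}
```

Fed the first listing above with a minimum of 11.1.0.0.0, groups_below would flag only OCRVOT2, which is the diskgroup that ocrconfig -add rejected.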
Note 2: the replace command does not seem to work on 11.2.0.4, so I used add followed by delete instead of the replace method from the note:
[root@rac2 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3756
         Available space (kbytes) :     258364
         ID                       : 1709296165
         Device/File Name         :    +OCRVOT
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac2 Desktop]#
--A direct replace does not work
[root@rac2 Desktop]# ocrconfig -replace +OCRVOT -replacement +OCRVOT2
PROT-28: Cannot delete or replace the only configured Oracle Cluster Registry location
[root@rac2 Desktop]#
[root@rac2 Desktop]#
--Try adding a diskgroup first:
[root@rac2 Desktop]# ocrconfig -add +OCRVOT2
[root@rac2 Desktop]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3772
         Available space (kbytes) :     258348
         ID                       : 1709296165
         Device/File Name         :    +OCRVOT
                                    Device/File integrity check succeeded
         Device/File Name         :   +OCRVOT2
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded
[root@rac2 Desktop]#
[root@rac2 Desktop]#
[root@rac2 Desktop]#
[root@rac2 Desktop]#
--After the add, replace still does not work.
[root@rac2 Desktop]# ocrconfig -replace +OCRVOT -replacement +OCRVOT2
PROT-29: The Oracle Cluster Registry location is already configured
(2) VOTEDISK:
[root@rac2 Desktop]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   873d06b1ab5f4f98bf28f80b24cbb982 (/dev/asm-ocrvot1) [OCRVOT]
 2. ONLINE   115fc78773894fb2bf7ea02fe6442bb8 (/dev/asm-ocrvot2) [OCRVOT]
 3. ONLINE   a2afe4f168b44f3dbfdaccdd084a0dd8 (/dev/asm-ocrvot3) [OCRVOT]
Located 3 voting disk(s).
[root@rac2 Desktop]#
[root@rac2 Desktop]# crsctl replace votedisk +OCRVOT2
Successful addition of voting disk 1aed1b60136c4fe5bf1f5c98d0ac0cfe.
Successful addition of voting disk 3a7e1165d4794f4bbf48e946866990de.
Successful addition of voting disk 61a1d5e1e9a84fddbf558e527221b244.
Successful deletion of voting disk 873d06b1ab5f4f98bf28f80b24cbb982.
Successful deletion of voting disk 115fc78773894fb2bf7ea02fe6442bb8.
Successful deletion of voting disk a2afe4f168b44f3dbfdaccdd084a0dd8.
Successfully replaced voting disk group with +OCRVOT2.
CRS-4266: Voting file(s) successfully replaced
[root@rac2 Desktop]#
[root@rac2 Desktop]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   1aed1b60136c4fe5bf1f5c98d0ac0cfe (/dev/asm-ex_ocrvot_01) [OCRVOT2]
 2. ONLINE   3a7e1165d4794f4bbf48e946866990de (/dev/asm-ex_ocrvot_02) [OCRVOT2]
 3. ONLINE   61a1d5e1e9a84fddbf558e527221b244 (/dev/asm-ex_ocrvot_03) [OCRVOT2]
Located 3 voting disk(s).
[root@rac2 Desktop]#
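After crsctl replace votedisk, the final query can be checked by script rather than by eye. The following is a rough sketch that reads crsctl query css votedisk output and confirms every voting file is ONLINE in the expected diskgroup. The name votedisks_all_in is a hypothetical helper, and the parsing assumes the row layout shown above (index, state, id, path, then the diskgroup in square brackets as the last field).

```shell
# Hypothetical helper: read `crsctl query css votedisk` output on stdin and
# succeed only if there is at least one voting file and every one of them
# is ONLINE in diskgroup $1 (given without the leading "+").
votedisks_all_in() {
    awk -v dg="[$1]" '
        # Voting file rows start with an index such as "1.".
        $1 ~ /^[0-9]+\.$/ {
            total++
            if ($2 == "ONLINE" && $NF == dg) ok++
        }
        END {
            if (total > 0 && ok == total) exit 0
            exit 1
        }
    '
}
```

For example, crsctl query css votedisk | votedisks_all_in OCRVOT2 would be expected to succeed after the replace above and to fail before it.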
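(II) Next: what if I do not create a new OCRVOT2 diskgroup, and instead add and drop disks in the original diskgroup and let rebalance do the migration? Does that work?
Yes, it does. See the notes Exact Steps To Migrate ASM Diskgroups To Another SAN/DiskArray/DAS/etc Without Downtime. (Doc ID 837308.1) or Adding new storage disks and Dropping old storage disks from OCR ,Vote diskgroup (Doc ID 2073993.1)
But this brings up two points: 1. The OCR is an ASM file, so it can be migrated by rebalance; 2. The voting file is not an ASM file, so the rebalance that follows adding and dropping disks does not migrate it. However, Oracle automatically copies the original voting files onto the new disks, and this is triggered when the old disks are dropped.
Let's test it: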
The following 3 disks are the ones I need to add, to swap out the original disks.
[oracle@rac1 ~]$ kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:       6142 Mb /dev/asm-acfs01                          oracle   dba
   2:       6142 Mb /dev/asm-acfs02                          oracle   dba
   3:       1019 Mb /dev/asm-data01                          oracle   dba
   4:       1019 Mb /dev/asm-data02                          oracle   dba
   5:       1019 Mb /dev/asm-data03                          oracle   dba
   6:       1019 Mb /dev/asm-data04                          oracle   dba
   7:       1019 Mb /dev/asm-data05                          oracle   dba
   8:        698 Mb /dev/asm-ex_ocrvot_01                    oracle   dba   <==
   9:        698 Mb /dev/asm-ex_ocrvot_02                    oracle   dba   <==
  10:        698 Mb /dev/asm-ex_ocrvot_03                    oracle   dba   <==
  11:       1019 Mb /dev/asm-fra01                           oracle   dba
  12:       1019 Mb /dev/asm-fra02                           oracle   dba
  13:       1019 Mb /dev/asm-fra03                           oracle   dba
  14:       1019 Mb /dev/asm-fra04                           oracle   dba
  15:        611 Mb /dev/asm-ocrvot1                         oracle   dba
  16:        611 Mb /dev/asm-ocrvot2                         oracle   dba
  17:        611 Mb /dev/asm-ocrvot3                         oracle   dba
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
     +ASM1 /u01/app/11.2.0.3/grid
[oracle@rac1 ~]$
[oracle@rac1 ~]$

SQL> select name,group_number,type,state from v$asm_diskgroup order by 2;

NAME                           GROUP_NUMBER TYPE   STATE
------------------------------ ------------ ------ -----------
ACFS                                      1 EXTERN MOUNTED
DATA                                      2 EXTERN MOUNTED
FRA                                       3 EXTERN MOUNTED
OCRVOT                                    4 NORMAL MOUNTED

SQL>
SQL>
SQL> select name,state,REDUNDANCY,path,GROUP_NUMBER from v$asm_disk order by GROUP_NUMBER,name;

NAME                           STATE    REDUNDA PATH                                     GROUP_NUMBER
------------------------------ -------- ------- ---------------------------------------- ------------
                               NORMAL   UNKNOWN /dev/asm-ex_ocrvot_02                               0
                               NORMAL   UNKNOWN /dev/asm-ex_ocrvot_01                               0
                               NORMAL   UNKNOWN /dev/asm-ex_ocrvot_03                               0
                               NORMAL   UNKNOWN /dev/asm-data05                                     0
ACFS_0000                      NORMAL   UNKNOWN /dev/asm-acfs01                                     1
ACFS_0001                      NORMAL   UNKNOWN /dev/asm-acfs02                                     1
DATA_0000                      NORMAL   UNKNOWN /dev/asm-data01                                     2
DATA_0001                      NORMAL   UNKNOWN /dev/asm-data02                                     2
DATA_0002                      NORMAL   UNKNOWN /dev/asm-data03                                     2
DATA_0003                      NORMAL   UNKNOWN /dev/asm-data04                                     2
FRA_0000                       NORMAL   UNKNOWN /dev/asm-fra01                                      3
FRA_0001                       NORMAL   UNKNOWN /dev/asm-fra02                                      3
FRA_0002                       NORMAL   UNKNOWN /dev/asm-fra03                                      3
FRA_0003                       NORMAL   UNKNOWN /dev/asm-fra04                                      3
OCRVOT_0000                    NORMAL   UNKNOWN /dev/asm-ocrvot1                                    4
OCRVOT_0001                    NORMAL   UNKNOWN /dev/asm-ocrvot2                                    4
OCRVOT_0002                    NORMAL   UNKNOWN /dev/asm-ocrvot3                                    4

17 rows selected.

SQL>
--Start adding disks:
[oracle@rac1 ~]$ sqlplus "/as sysasm"

SQL*Plus: Release 11.2.0.4.0 Production on Thu Jun 16 12:01:40 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

--As you can see, the OCR is an ASM file, and by definition ASM files get rebalanced.
SQL> l
  1* select file_number,GROUP_NUMBER,type from v$asm_file where type not in ('ARCHIVELOG','DATAFILE','ONLINELOG','CONTROLFILE') order by GROUP_NUMBER
SQL> /

FILE_NUMBER GROUP_NUMBER TYPE
----------- ------------ --------------------
        256            1 ASMVOL
        253            2 ASMPARAMETERFILE
        273            2 PARAMETERFILE
        279            2 TEMPFILE
        275            3 PARAMETERFILE
        255            4 OCRFILE

6 rows selected.

SQL> alter diskgroup OCRVOT add disk '/dev/asm-ex_ocrvot_01' size 600m;

Diskgroup altered.

SQL> alter diskgroup OCRVOT add disk '/dev/asm-ex_ocrvot_02' size 600m;

Diskgroup altered.

SQL> c/2/3
  1* alter diskgroup OCRVOT add disk '/dev/asm-ex_ocrvot_03' size 600m
SQL> /

Diskgroup altered.

--You can see that a rebalance is running
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
           4 REBAL RUN           1          1          0        323          0
          0

SQL> /

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
           4 REBAL RUN           1          1         87        323        404
          0

SQL> /

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
           4 REBAL RUN           1          1        236        323        412
          0

SQL>
SQL> /

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE
------------ ----- ---- ---------- ---------- ---------- ---------- ----------
EST_MINUTES ERROR_CODE
----------- --------------------------------------------
           4 REBAL RUN           1          1        323        323          0
          0

SQL>
--Rebalance complete
SQL> /

no rows selected

SQL>
SQL>
--ASM file 255, i.e. the OCR file, is now allocated across 6 disks, so it has been rebalanced.
SQL> select number_kfdat from x$kfdat where group_kfdat=4 and fnum_kfdat=255 group by number_kfdat;

NUMBER_KFDAT
------------
           0
           1
           2
           3
           4
           5

6 rows selected.

SQL>
--The voting files are still on the original disks 0, 1 and 2, i.e. they were not rebalanced.
SQL> select number_kfdat from x$kfdat where group_kfdat=4 and fnum_kfdat=1048572 group by number_kfdat;

NUMBER_KFDAT
------------
           0
           1
           2
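That is, after I added the disks, the OCR file was spread onto the new disks by the rebalance, whereas the voting files are not rebalanced, so they stayed on the 3 old disks.
OK, with all the new disks added, I start dropping the old ones: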
--fnum=1048572 is the voting file; in the initial state it is still on the 3 old disks.
SQL> select number_kfdat from x$kfdat where group_kfdat=5 and fnum_kfdat=1048572 group by number_kfdat
  2  /

NUMBER_KFDAT
------------
           0
           1
           2

--fnum=255 is the OCR file; in the initial state it is already spread across 6 disks.
SQL> select number_kfdat from x$kfdat where group_kfdat=4 and fnum_kfdat=255 group by number_kfdat
SQL> /

NUMBER_KFDAT
------------
           0
           1
           2
           3
           4
           5

6 rows selected.

--Start dropping the old disks, beginning with the first one
SQL> alter diskgroup OCRVOT drop disk OCRVOT_0000;

Diskgroup altered.

--Oracle first adds a voting file by itself, on disk number 3. At this point 4 disks hold voting files:
SQL> select number_kfdat from x$kfdat where group_kfdat=4 and fnum_kfdat=1048572 group by number_kfdat
  2  /

NUMBER_KFDAT
------------
           0
           1
           2
           3

SQL> /

NUMBER_KFDAT
------------
           0
           1
           2
           3

SQL> /

NUMBER_KFDAT
------------
           0
           1
           2
           3

SQL>
--After about 1 minute, Oracle removed the voting file from the first disk; the contents of the first disk (disk 0) had been copied to disk 3. The voting files are now on disks 1, 2 and 3.
SQL> /

NUMBER_KFDAT
------------
           1
           2
           3

--Note: relocating the voting file and rebalancing the OCR are really two independent actions. Once the voting file has finished copying itself and being deleted, a look at the OCR shows its rebalance is still in progress.
SQL>
SQL>
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           4 REBAL WAIT          1

SQL>
SQL>
--Wait a little longer for the OCR rebalance to finish:
SQL> /

no rows selected

--Go on to drop the remaining 2 disks
SQL> alter diskgroup OCRVOT drop disk OCRVOT_0001;

Diskgroup altered.

SQL> alter diskgroup OCRVOT drop disk OCRVOT_0002;

Diskgroup altered.

SQL>
--The voting files are handled one disk at a time: when I drop 2 disks, Oracle does not copy 2 disks at once and then delete 2. It copies one and deletes one, then copies the second and deletes the second.
SQL> select number_kfdat from x$kfdat where group_kfdat=4 and fnum_kfdat=1048572 group by number_kfdat;

NUMBER_KFDAT
------------
           2
           3
           4

SQL> /

NUMBER_KFDAT
------------
           2
           3
           4

SQL>
SQL>
SQL> /

NUMBER_KFDAT
------------
           2
           3
           4
           5

SQL> /

NUMBER_KFDAT
------------
           3
           4
           5

SQL>
--The copy and delete finally complete, leaving disks 3, 4 and 5.
SQL>
SQL>
SQL>
SQL>
--The OCR rebalance is still running:
SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           4 REBAL RUN           1          1        284        356        421           0

SQL>
SQL> /

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           4 REBAL REAP          1          1        356        356          0           0

SQL>
--A moment later, it is done.
SQL> /

no rows selected

SQL>
SQL>
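When watching v$asm_operation as in the transcript above, SOFAR and EST_WORK can be turned into a rough progress figure. This is only a small sketch: rebalance_percent is a hypothetical helper name, the two numbers are assumed to be copied by hand from the SOFAR and EST_WORK columns, and EST_WORK is an estimate that can change while the rebalance runs.

```shell
# Hypothetical helper: integer percentage of a rebalance that is done.
# $1 = SOFAR, $2 = EST_WORK, both taken from v$asm_operation.
rebalance_percent() {
    # Avoid dividing by zero when no estimate is available yet.
    if [ "$2" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( $1 * 100 / $2 ))
}
```

With the row shown above (SOFAR 284, EST_WORK 356), rebalance_percent 284 356 prints 79.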
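One more piece of background: the fnum values 255 and 1048572 that we just queried stand for the OCR and the voting file respectively. There is a basis for this; see ASM Metadata and Internals: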
Description of metadata files:
· File#1: File Directory (files and their extent pointers)
· File#2: Disk Directory
· File#3: Active Change Directory (ACD)
  The ACD is analogous to a redo log, where changes to the metadata are logged.
  Size=42MB * number of instances
· File#4: Continuing Operation Directory (COD).
  The COD is analogous to an undo tablespace. It maintains the state of active ASM operations such as disk or datafile drop/add. The COD log record is either committed or rolled back based on the success of the operation.
· File#5: Template directory
· File#6: Alias directory
· File#8: 11g ?? content N.K.
· 11g, File#9: Attribute Directory
· 11g, File#12: Staleness directory, allocated when needed to track offline disks
· 12c, File #13 ASM password directory
· 11g, File#253: ASM spfile in ASM (11gR2 feature)
· 11g, File#254: Staleness registry, allocated when needed to track offline disks
· 11g, File#255: OCR FILE in ASM (11gR2 feature)
· 11g, File#1048572 (Hex=FFFFC), special file, does not appear in x$kffxp: it contains the mirrored copies of the voting disk in ASM (11gR2 and 12c), 3 copies for normal redundancy
· 11g, File#1048575 (Hex=FFFFF), not a real file#, does not appear in x$kffxp, content N.K., it appears to allocate a relatively small size at the end of each ASM disk.
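The hex values quoted in the list for the two special file numbers can be checked with plain shell arithmetic. A minimal sketch follows; fnum_hex is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print an ASM file number in upper-case hex.
fnum_hex() {
    printf '%X\n' "$1"
}

# 1048575 is the largest 20-bit value, and 1048572 sits 3 below it.
max20=$(( (1 << 20) - 1 ))
printf '%d = 0x%s\n' 1048572 "$(fnum_hex 1048572)"
printf '%d = 0x%s\n' "$max20" "$(fnum_hex "$max20")"
```

This prints 1048572 = 0xFFFFC and 1048575 = 0xFFFFF, matching the fnum_kfdat value used in the x$kfdat queries for the voting file.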