_db_block_max_cr_dba 这个隐含参数的作用是控制每个block(即一个dba下,或者说x$bh.dbablk)的最多cr块的个数。默认值是6(5个CR+1个XCUR)。
当产生一致性读(CR)的时候,session会从前镜像读取块,加载到buffer cache中,加载的这个块,我们叫CR copy。
保留多个版本的CR,可以缓解对buffer中block的并发争用(buffer busy wait),避免多个session同时读取一个buffer block。
但是如果版本过多,挂在一个hash chain下的block太多,又会造成CBC latch的争用。所以oracle选择了6个版本。
我们来测试一下CR copy的特性(数据库版本11.2.0.4):
1 2 3 4 5 |
数据初始化: drop table t1 purge; create table t1(c1 int, c2 char(700)); insert into t1 select rownum as c1,'x' as c2 from dual connect by level<=10; commit; |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
SQL> col file# for 999999999 SQL> col block# for 999999999 SQL> select dbms_rowid.rowid_relative_fno(rowid) as file#, 2 dbms_rowid.rowid_block_number(rowid) as block# 3 from t1; FILE# BLOCK# ---------- ---------- 4 1365 4 1365 4 1365 4 1365 4 1365 4 1365 4 1365 4 1365 4 1365 4 1365 10 rows selected. SQL> |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
检查初始情况 SQL> startup force ORACLE instance started. Total System Global Area 4960579584 bytes Fixed Size 2289720 bytes Variable Size 989859784 bytes Database Buffers 3959422976 bytes Redo Buffers 9007104 bytes Database mounted. Database opened. SQL> SQL> SQL> select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9 dbarfil = 4 and dbablk = 1365; no rows selected SQL> |
刚刚启动的时候,我们发现file#=4,block#=1365没有在buffer中。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
开始构造一致性读: session 1: SQL> SELECT dbms_flashback.get_system_change_number from dual; GET_SYSTEM_CHANGE_NUMBER ------------------------ 2687527 SQL> update t1 set c2='y' where c1=1; 1 row updated. SQL> SELECT dbms_flashback.get_system_change_number from dual; GET_SYSTEM_CHANGE_NUMBER ------------------------ 2687528 SQL> session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ---------------- 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 4 1365 1 1 524288 cr N N N N N N 2687527 0 0 0 0 T1 SQL> |
当开始DML之后,block中数据文件中加载到buffer中,此时一定会有一个xcur的block,和cr block。这是因为DML is always performed on the current copy of the block (Status=xcur).
12345678 Whenever an update is made to the block,If the current block is in the buffer (status = xcur)A clone of the current block is made (status=cr)Update is performed on the current block (status=xcur)Else (the desired block is not in the buffer)The block is read from the disk (status=xcur)A clone of the block is also made (status=cr)Update is performed on the current block (status=xcur)
xcur的block表示当前已经被修改过的block,是最新的block,注意DI列(Dirty列)已经是Y,表示这个block是buffer中的脏块。
cr block是在update之前,在内存中copy原来的xcur的block。我们看cr block的CR_SCN_BASE是update前一瞬间的SCN,即2687527。
12345678 即当update发生的时候,如果块在buffer中(块的状态为xcur),那么copy一份这个在buffer中的xcur的块,copy出来的块是cr块。更新这个块,且这个块的标记还是xcur。如果在buffer中没有这个块那么将这个block从磁盘读到buffer中(此时状态为xcur。这个过程,类似进行了一次没有一致性读的select,select的时候,当前块状态也是为xcur)将buffer中的xcur块,做一份copy,copy出来的块成为cr块更新这个块,且这个块的状态还是xcur
====================
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
由于刚刚进行了update,且没有commit。所以现在的select是需要进行一致性读。 session 3: SQL> select * from t1 where c1=1; session2: SQL> SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- --------------- 4 1365 1 1 524288 cr N N N N N N 2687546 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687545 0 3 317 980 T1 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 <<原来的xcur block 4 1365 1 1 524288 cr N N N N N N 2687527 0 0 0 0 T1 << 原来的cr block SQL> SQL> select tablespace_name from dba_data_files where file_id=3; TABLESPACE_NAME ------------------------------------------------------------ UNDOTBS1 SQL> |
第一次的select,oracle一次性创建了2个CR block,分别是在SCN 2687545和SCN 2687546的时候。
这个时候,由于是需要一致性读,因此这次的select是从前镜像读取,从undo中读取,所以,也可以看到这个前镜像块是从undo的那个块上读取,可以看到有UBA(undo block address)的file id,block id和sequence。file id为3,是在undo tablespace上。
===============
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
session 4: SQL> select * from t1 where c1=1; session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ----------------- 4 1365 1 1 524288 cr N N N N N N 2687575 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687546 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687545 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 <<原来的xcur block 4 1365 1 1 524288 cr N N N N N N 2687527 0 0 0 0 T1 <<原来的cr block SQL> |
第二次的select,oracle只创建了1个CR block,分别是在SCN 2687575的时候。 读取的是在undo上的同一个UBA file id,block id和sequence,所以,在undo文件上的块是同一个块,但是在buffer中,cr块目前已经有3个cr块了。
另外还有一个cr块,但是这个块是在做update的时候,对于该session来说在update之前的xcur的copy。不是从undo文件中读取的block。
===============
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
session 5: SQL> select * from t1 where c1=1; session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ------------------- 4 1365 1 1 524288 cr N N N N N N 2687595 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687575 0 3 317 980 T1 << 第二次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687546 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687545 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 <<原来的xcur block SQL> |
第三次的select,oracle只创建了1个CR block,分别是在SCN 2687595的时候。 读取的是在undo上的同一个UBA file id,block id和sequence,所以,在undo文件上的块是同一个块,但是在buffer中,cr块目前已经有4个cr块了。
另外我们注意到,oracle已经把第一次做update的时候,SCN 2687527的cr块刷出去了。这个被丢弃的CR block,对我做update的session来说,已经没有用处,因为当前session的block的值已经更新,当前session所需要的block是xcur的那个block。而对已其他session来说,由于还没有commit,需要读取前镜像,可以直接做第一次和第二次cr block的copy;或者直接从undo中加载。
===============
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
session 6: SQL> select * from t1 where c1=1; session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ------------------- 4 1365 1 1 524288 cr N N N N N N 2687615 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687595 0 3 317 980 T1 << 第三次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687575 0 3 317 980 T1 << 第二次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687546 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 524288 cr N N N N N N 2687545 0 3 317 980 T1 << 第一次select的cr block 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 <<原来的xcur block 6 rows selected. SQL> |
第四次的select,oracle只创建了1个CR block,分别是在SCN 2687615的时候。 读取的是在undo上的同一个UBA file id,block id和sequence,所以,在undo文件上的块是同一个块,但是在buffer中,cr块目前已经有5个cr块了。
===============
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
session 7: SQL> select * from t1 where c1=1; session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ------------------- 4 1365 1 1 524288 cr N N N N N N 2687643 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687615 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687595 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687575 0 3 317 980 T1 4 1365 1 1 524288 cr N N N N N N 2687546 0 3 317 980 T1 4 1365 1 1 33554433 xcur Y N N N N N 0 0 0 0 0 T1 6 rows selected. SQL> |
第五次的select,oracle只创建了1个CR block,分别是在SCN 2687643的时候。 创建这个CR block的时候,把当前cr block列表中最早的SCN 2687545的block丢弃了。只保留5个CR block。
至此,我们看到,不同的session的select,对cr block的影响:
update但不commit,在update时产生一个cr block。
第一次select从undo获得前镜像产生2个cr block,当前共3个cr block
第二次select再产生一个cr block,当前共4个cr block
第三次select再产生一个cr block,且丢弃update时的cr block,当前共4个cr block
第四次select再生产一个cr block,至此已经有了5个cr block和1个xcur block
第五次select再产生一个cr block,丢弃第一次select产生的第一个block,只保留5个cr block和1一个xcur block
再后续select的话,每产生一个cr block,丢弃最早的cr block
===============
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 |
我们再来看看flush buffer cache的影响: session 2: SQL> alter system flush buffer_cache; System altered. SQL> select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9 dbarfil = 4 and dbablk = 1365; DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ------------------ 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 6 rows selected. SQL> session 8: SQL> select * from t1 where c1=1; session 2: SQL> l 1 select b.dbarfil, b.dbablk, b.class,tch,flag, 2 decode(state,0,'free',1,'xcur',2,'scur',3,'cr', 4,'read',5,'mrec',6,'irec',7,'write',8,'pi',9,'memory',10,'mwrite',11,'donated') as state, 3 decode(bitand(flag,1),0,'N','Y') dirty, decode(bitand(flag,16),0,'N','Y') temp, decode(bitand(flag,1536),0,'N','Y') ping, 4 decode(bitand(flag,16384),0,'N','Y') stale, decode(bitand(flag,65536),0,'N','Y') direct, decode(bitand(flag,1048576),0,'N','Y') new , 5 cr_scn_bas, cr_scn_wrp, cr_uba_fil, cr_uba_blk, cr_uba_seq, 6 (select object_name from dba_objects where object_id = b.obj) as object_name 7 from x$bh b 8 where 9* dbarfil = 4 and dbablk = 1365 SQL> / DBARFIL DBABLK CLASS TCH FLAG STATE DI TE PI ST DI NE CR_SCN_BAS CR_SCN_WRP CR_UBA_FIL CR_UBA_BLK CR_UBA_SEQ OBJECT_NAME ---------- ---------- ---------- ---------- ---------- -------------- -- -- -- -- -- -- ---------- ---------- ---------- ---------- ---------- ----------------- 4 1365 1 1 524288 cr N N N N N N 2687685 0 3 317 980 T1 4 1365 1 0 34078721 xcur Y N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 4 1365 1 0 0 free N N N N N N 0 0 0 0 0 T1 8 rows selected. SQL> |
我们看到,flush buffer cache之后,xcur的block,即标记为dirty的block也被刷出buffer cache,所有的buffer block都是显示free。
但进行第一次select的时候,被修改的xcur block,还是从db file加载到内存,且被记录成dirty的block。另外,select出来的前镜像,也从undo加载到内存,形成第一个cr block。
到了这个,顺便问个问题,现在都流行database in memory,如果我的内存是256G的,能放得下256G的database吗?读了上面的问题,相信你已经有了初步的答案。:)