在阿里云上创建dataguard+FSFO

在阿里云上创建rac或者rac onenode,共享存储比较麻烦,需要配iSCSI,我们可以用dataguard+fsfo架构来类似的实现高可用方案。

本文介绍了在阿里云上oracle 12.1.0.2 dataguard+fsfo的搭建过程。

(一)、基础工作序列:

  阿里云上创建Oracle Dataguard+FSFO 负责人
1 主机分配 云管理员
2 操作系统设置 云管理员、DBA
3 数据库安装 DBA
4 数据库配置 DBA
5 数据库高可用场景测试 DBA
6 数据库备份设置 DBA

 

(二)、操作系统设置:

需求项 操作环境 需求细节内容 说明
操作系统要求 primary instance host/standby instance host/observer host linux版本:SLES 11 阿里云上经过oracle certification的操作系统,只有sles。
阿里云上没有rhel和oel
文件系统要求 primary instance host/standby instance host/observer host 文件系统:EXT3 sles 11不支持ext4,ext格式化后的文件系统只能以read only的方式mount上去,所以选择ext3
RPM包要求 primary instance host/standby instance host/observer host 安装完成后,检查如下的rpm(或更高版本)是否已经安装:
# For SUSE 11
binutils-2.21.1-0.7.25
gcc-4.3-62.198
gcc-c++-4.3-62.198
glibc-2.11.3-17.31.1
glibc-devel-2.11.3-17.31.1
ksh-93u-0.6.1
libaio-0.3.109-0.1.46
libaio-devel-0.3.109-0.1.46
libcap1-1.10-6.10
libstdc++33-3.3.3-11.9
libstdc++33-32bit-3.3.3-11.9
libstdc++43-devel-4.3.4_20091019-0.22.17
libstdc++46-4.6.1_20110701-0.13.9
libgcc46-4.6.1_20110701-0.13.9
make-3.81
sysstat-8.1.5-7.32.1
xorg-x11-libs-32bit-7.4
xorg-x11-libs-7.4
xorg-x11-libX11-32bit-7.4
xorg-x11-libX11-7.4
xorg-x11-libXau-32bit-7.4
xorg-x11-libXau-7.4
xorg-x11-libxcb-32bit-7.4
xorg-x11-libxcb-7.4
xorg-x11-libXext-32bit-7.4
xorg-x11-libXext-7.4
SLES自带的rpm包中已经包含了大部分的rpm,只有如下几个rpm包尚未包含。需要在suse linux 11的安装ios镜像中提取。然后安装。
或者使用右侧已经提取的压缩包。# rpm -ivh libaio-devel-0.3.109-0.1.46.x86_64.rpm
# rpm -ivh libstdc++43-devel-4.3.4_20091019-0.37.30.x86_64.rpm
# rpm -ivh libstdc++-devel-4.3-62.200.2.x86_64.rpm
# rpm -ivh gcc43-c++-4.3.4_20091019-0.37.30.x86_64.rpm
# rpm -ivh gcc-c++-4.3-62.200.2.x86_64.rpm
X图形界面要求 primary instance host/standby instance host/observer host 1. 配置vnc,vnc的配置文件
#!/bin/shxrdb $HOME/.Xresources
xsetroot -solid grey
xterm -geometry 80×24+10+10 -ls -title “$VNCDESKTOP Desktop” &
#twm &
startgnome &
DISPLAY=:1 gnome-session &2. 启动gdm:
gdm restart&

3. 启动vnc:
vncserver :1

要求能启动图形化界面
SWAP大小要求 primary instance host/standby instance host/observer host 内存大小为2GB~16GB,swap大小要求设置成和内存大小一样。
内存大于16GB,swap大小设置成16GB检查方法:grep SwapTotal /proc/meminfo创建方法:
# dd if=/dev/zero of=/root/swapfile bs=1M count=16384 #count为具体的大小
# mkswap /root/swapfile
# swapon /root/swapfile #启用swap文件
在文件/etc/fstab中添加一行:
/root/swapfile swap swap defaults 0 0
SWAP要求不符合,将会导致安装时ORA-4030报错,随后数据库crash。
关闭透明大页 primary instance host/standby instance host/observer host 检查:
cat /sys/kernel/mm/transparent_hugepage/enabled –应该看到[never]如果没关闭,需要在SLES的图形化界面中启动computer,启动YaST2,在BootLoader中的Optional Kernel command line Parameter中,加上transparent_hugepage=never见右图
修改IO调度队列 primary instance host/standby instance host 1. 如果使用的是SSD,需要将IO调度队列改成noop,cat /sys/block/<sda>/queue/scheduler 显示为noop

如果没显示为noop,需要在SLES的图形化界面中启动computer,启动YaST2,在BootLoader中的Optional Kernel command line Parameter中,加上elevator=noop (类似上面的修改)

2. 如果是高效云盘,不是SSD,则无需修改IO调度队列。

建卷要求 primary instance host/standby instance host/ 1.软件卷:/u01,大小最小50GB,用于存放Oracle软件,以及产生的数据库日志和trace文件,属主为oracle:oinstall
2.数据卷:/data01,大小最小为200G,用于存放数据库的数据文件,属主为oracle:oinstall
3.归档卷:/fra,大小最小为200G,用于存放数据库的归档日志和备份片,属主为oracle:oinstall
软件卷可以放在本地硬盘
数据卷和归档卷需要SAN存储
数据卷可根据生产数据增长量,增加/data02,/data03……或者扩展/data01卷
归档卷不能增加新卷,只能扩展归档卷
observer host 1.软件卷:/u01,大小最小50GB,用于存放Oracle软件,以及产生的数据库日志和trace文件,属主为oracle:oinstall
不需要数据卷和归档卷
软件卷可以放在本地硬盘
数据卷和归档卷需要SAN存储
数据卷可根据生产数据增长量,增加/data02,/data03……或者扩展/data01卷
归档卷不能增加新卷,只能扩展归档卷
防火墙要求 primary instance host/standby instance host/observer host 数据库主库和dataguard灾备库,以及仲裁库,需要双向打通1521端口
os group primary instance host/standby instance host/observer host groupadd -g 501 oinstall
groupadd -g 502 dba
groupadd -g 503 oper
Osuser primary instance host/standby instance host/observer host useradd -u 500 -g oinstall -G dba,oper oracle 生产环境oracle操作系统用户统一为oracle,
开发环境oracle操作系统用户统一为oradev
测试环境oracle操作系统用户统一为orauat
设置kernel参数 primary instance host/standby instance host/observer host vi /etc/sysctl.conf
## add for install oracle
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 15294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
注:kernel.shmmax设置成接近物理内存
设置hosts primary instance host/standby instance host/observer host # on dataguardfsfohost003
127.0.0.1       dataguardfsfohost003
192.168.56.1   dataguardfsfohost003
192.168.56.2   dataguardfsfohost001
192.168.56.3   dataguardfsfohost002# on dataguardfsfohost001
127.0.0.1       dataguardfsfohost001
192.168.56.1   dataguardfsfohost003
192.168.56.2   dataguardfsfohost001
192.168.56.3   dataguardfsfohost002# on dataguardfsfohost002
127.0.0.1      dataguardfsfohost002
192.168.56.1   dataguardfsfohost003
192.168.56.2   dataguardfsfohost001
192.168.56.3   dataguardfsfohost002

 

(三)数据库软件安装和实例安装

需求项 操作环境 需求细节内容 说明
数据库实例名要求 primary instance host/standby instance host 应用ORACLE_SID 以所属系统来命名,长度不能超过6位,如果超过6位,需要进行缩写。 命名以英文单词组成,不允许使用汉语拼音。
命名中不允许出现ORACLE数据库中的关键字(Reserved Word)。
如:user、remark等单词不允许使用。Oracle关键字可从v$reserved_words中查询。超过6位的ORACLE_SID使用缩写,缩写规则如下:1.包含四个或四个以下字符的单词,可以不必缩写;
2.所有的缩写必须唯一,不能有多义;
3.一个单词的缩写必须以该单词的首字母开头;
4.缩写中不能包含数字或特殊字符;
5.缩写应当从命名右侧的单词开始,自右向左地进行缩写,直到满足所需的长度为止。
软件目录要求 primary instance host/standby instance host ORACLE_BASE: /u01/<PRD>/oracle
ORACLE_HOME: /u01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>
需要替换部分:
<PRD>:生产环境统一为PRD,开发环境统一为DEV,测试环境统一为UAT
<db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2
<ORACLE_SID>为数据库实例名
数据文件存储要求 primary instance host/standby instance host 数据文件:
/data01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>/dfile日志文件:
/data01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>/lfile控制文件:
/data01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>/cfile
需要替换部分:
<PRD>:生产环境统一为PRD,开发环境统一为DEV,测试环境统一为UAT
<db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2
<ORACLE_SID>为数据库实例名
数据库版本 primary instance host/standby instance host PatchSet:12.1.0.2
PSU:当年最新一个PSU的上一个PSU
Oneoff:根据补丁分析列表,打上对应的oneoff补丁注:Dataguard备库的数据库软件版本,必须严格和主库一致
Oneoff补丁,以Patch_Assessment_on_top_of_xxxxxxx为准
数据库用户环境变量要求 primary instance host/standby instance host 数据库用户的环境变量(bash_profile文件)中至少应该包含如下:
export ORACLE_SID=<ORACLE_SID>
export ORACLE_BASE=/u01/<PRD>/oracle
export ORACLE_HOME=$ORACLE_BASE/<db_version_4_digit>/<ORACLE_SID>
export PATH=$ORACLE_HOME/bin:$PATH:$HOME/bin
PATH=$PATH:$HOME/bin
需要替换部分:
<PRD>:生产环境统
一为PRD,开发环境统一为DEV,测试环境统一为UAT
<db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2.
<ORACLE_SID>为数据库实例名
安装数据库软件 primary instance host/standby instance host 运行runInstaller.sh安装数据库,见下图 只进行软件安装,数据库创建待安装完成后,另外运行dbca进行创建。
安装数据库实例 primary instance host 运行dbca安装实例,见下图 只需在主库安装实例,备库不需要安装实例

 

(四)数据库初始化参数调整。
【关于11g和12c数据库初始化参数的一些最佳实践参考】
注:_ktb_debug_flags参数和db_ultra_safe 有冲突,在同时设置的情况下,在dataguard switchover过去,再switchover回来的时候,会报错ORA-00600: internal error code, arguments: [kcbz_zib_simulation_1], [29064], [64272],[], [], [], [], [], [], [], [], []

 

(五)Dataguard和FSFO创建:

需求项 操作环境 需求细节内容 说明
调整redo大小 primary instance host alter database drop logfile group 1;
alter database drop logfile group 2;
alter database drop logfile group 3;alter database add logfile group 1(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo01_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo01_02.log’) size 300m;
alter database add logfile group 2(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo02_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo02_02.log’) size 300m;
alter database add logfile group 3(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo03_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo03_02.log’) size 300m;
拷贝密码文件到备库 primary instance host scp $ORACLE_HOME/dbs/orapwmyappn oracle@192.168.56.3:/u01/PRD/oracle/12.1.0.2/myappn/dbs/
主备库创建各个目录 primary instance host/standby instance host mkdir -p /fra/PRD/oracle/12.1.0.2/myappn
mkdir -p /data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/
mkdir -p /data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/
mkdir -p /u01/PRD/oracle
mkdir -p /u01/PRD/oracle/admin/myappn/adump
创建备库初始化文件 standby instance host *._high_priority_processes=’LMS*|VKTM|LGWR’
*._kks_obsolete_dump_threshold=0
*._ktb_debug_flags=8
*._kttext_warning=1
*._optimizer_aggr_groupby_elim=FALSE
*._optimizer_dsdir_usage_control=0
*._optimizer_reduce_groupby_key=FALSE
*._optimizer_unnest_scalar_sq=FALSE
*._serial_direct_read=’NEVER’
*._sql_plan_directive_mgmt_control=0
*._undo_autotune=FALSE
*._use_adaptive_log_file_sync=’FALSE’
*.archive_lag_target=1800
*.audit_file_dest=’/u01/PRD/oracle/admin/myappn/adump’
*.audit_trail=’db’
*.cell_offload_processing=FALSE
*.compatible=’12.1.0.2.0′
*.control_files=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/control01.ctl’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/control02.ctl’
*.db_block_size=8192
*.db_domain=”
*.db_files=800
*.db_name=’myappn’
*.db_recovery_file_dest=’/fra/PRD/oracle/12.1.0.2/myappn’
*.db_recovery_file_dest_size=161061273600
*.db_securefile=’ALWAYS’
*.deferred_segment_creation=FALSE
*.diagnostic_dest=’/u01/PRD/oracle’
*.dispatchers='(PROTOCOL=TCP) (SERVICE=myappnXDB)’
*.enable_ddl_logging=TRUE
*.event=’28401 TRACE NAME CONTEXT FOREVER, LEVEL 1′
*.job_queue_processes=32
#*.local_listener=’LISTENER_myappn’
*.log_archive_format=’myappn_%R_%T_%S.arc’
*.open_cursors=300
*.parallel_max_servers=128
*.pga_aggregate_target=2560m
*.processes=300
*.recyclebin=’OFF’
*.remote_login_passwordfile=’EXCLUSIVE’
*.session_cached_cursors=200
*.sga_target=7680m
*.undo_tablespace=’UNDOTBS1′
*.utl_file_dir=’?/dbs’
## add for dataguard
*.log_archive_dest_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn_dg’
*.db_unique_name=’myappn_dg’
*.log_archive_dest_state_1=’ENABLE’
*.log_archive_dest_state_2=’DEFER’
在主库创建standby redo log primary instance host alter database add standby logfile group 6(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo01_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo01_02.log’) size 300m;
alter database add standby logfile group 7(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo02_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo02_02.log’) size 300m;
alter database add standby logfile group 8(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo03_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo03_02.log’) size 300m;
alter database add standby logfile group 9(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo04_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo04_02.log’) size 300m;
修改主库的初始化文件 primary instance host alter database force logging;
alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE;
alter system set LOG_ARCHIVE_DEST_STATE_2=defer;
alter system set LOG_ARCHIVE_CONFIG=’DG_CONFIG=(myappn,myappn_dg)’;
alter system set LOG_ARCHIVE_DEST_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn’;
alter system set LOG_ARCHIVE_DEST_2=’SERVICE=myappn_dg lgwr ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=myappn_dg’;alter system set standby_file_management=auto scope=spfile;
alter system set fal_server=’myappn_dg’;
alter system set fal_client=’myappn’;alter system set db_create_file_dest=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/’ scope=spfile;
设置主库的listener.ora文件 primary instance host cat listener.ora

SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = myappn)
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
(SID_DESC =
(GLOBAL_DBNAME = myappn_dgmgrl)
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
)

SID_LIST_LISTENER_DB =
(SID_LIST =
(SID_DESC =
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
)

LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost001 )(PORT = 1521))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)

ADR_BASE_LISTENER = /u01/PRD/oracle

设置主库的tnsnames.ora文件 primary instance host cat tnsnames.ora

myappn =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost001)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myappn)
)
)

LISTENER_myappn =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost001)(PORT = 1521))

myappn_DG =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost002)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myappn)
)
)

设置备库的listener.ora文件 standby instance host oracle@dataguardfsfohost002:/u01/PRD/oracle/12.1.0.2/myappn/network/admin> cat listener.ora
# listener.ora Network Configuration File: /u01/PRD/oracle/12.1.0.2/myappn/network/admin/listener.ora
# Generated by Oracle configuration tools.SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = myappn)
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
(SID_DESC =
(GLOBAL_DBNAME = myappn_dg_dgmgrl)
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
)SID_LIST_LISTENER_DB =
(SID_LIST =
(SID_DESC =
(ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn)
(SID_NAME = myappn)
)
)

LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost002)(PORT = 1521))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)

ADR_BASE_LISTENER = /u01/PRD/oracle

设置备库的tnsnames.ora文件 standby instance host oracle@dataguardfsfohost002:/u01/PRD/oracle/12.1.0.2/myappn/network/admin> cat tnsnames.ora.kk
myappn =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost001)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myappn)
)
)myappn_dg =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost002)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = myappn)
)
)
启动备库数据库到nomount状态 standby instance host startup nomount 注意备库应该以pfile启动,不能用spfile启动
rman登录主库,准备开始复制数据库 primary instance host rman target sys/dji123@myappn auxiliary sys/dji123@myappn_dg
rman复制数据库 primary instance host run {
allocate channel prmy1 type disk;
allocate channel prmy2 type disk;
allocate auxiliary channel stby type disk;
duplicate target database for standby from active database nofilenamecheck
spfile
set db_unique_name=’myappn_dg’
set log_archive_max_processes=’5′
set fal_client=’myappn_dg’
set fal_server=’myappn’
set standby_file_management=’AUTO’
set log_archive_config=’DG_CONFIG=(myappn,myappn_dg)’
set log_archive_dest_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn_dg’
set log_archive_dest_2=’SERVICE=myappn lgwr ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=myappn’
set db_create_file_dest=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/’
;
}
主库和备库都启用归档日志传输,将defer状态改成enable primary instance host/standby instance host alter system set log_archive_dest_state_2=enable;
重启standby节点,恢复日志apply进度 primary instance host shutdown immediate;
startup;
alter database recover managed standby database using current logfile parallel 8 disconnect;
测试验证是否同步成功,建立测试表 primary instance host create table t1 as select sysdate as mydate from dual;
测试验证是否同步成功,查询测试表 primary instance host select to_char(mydate,’yyyy-mm-dd hh24:mi:ss’)  as mydate from t1;
dataguard状态检查 primary instance host SQL> select SWITCHOVER_STATUS from v$database;

SWITCHOVER_STATUS
——————–
SESSIONS ACTIVE

只有状态是SESSIONS ACTIVE(有活动session连接)或者TO STANDBY(无活动session连接),才可以进行切换,如果状态是RESOLVABLE GAP,则需要手工解决gap问题先。
dataguard状态检查 standby instance host SQL> select SWITCHOVER_STATUS from v$database;

SWITCHOVER_STATUS
——————–
NOT ALLOWED

切换到physical standby primary instance host SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;

Database altered.

SQL>

切换到primary standby instance host SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

SQL> select SWITCHOVER_STATUS from v$database;

SWITCHOVER_STATUS
——————–
TO PRIMARY

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

Database altered.

SQL>

重启备节点为primary节点 standby instance host SQL> shutdown immediate;
ORA-01109: database not openDatabase dismounted.
ORACLE instance shut down.
SQL>
SQL> startup
ORACLE instance started.
重启原primary节点为standby节点 primary instance host SQL> exit

SQL> startup
ORACLE instance started.

Total System Global Area 1185853440 bytes
Fixed Size                  2252664 bytes
Variable Size             385876104 bytes
Database Buffers          788529152 bytes
Redo Buffers                9195520 bytes
Database mounted.
Database opened.
SQL>
SQL> alter database recover managed standby database using current logfile parallel 8 disconnect;

Database altered.

SQL>

还原主备 primary instance host/standby instance host 重复上面18~23步骤
主库设置和FSFO相关的参数,并重启数据库 primary instance host alter system set dg_broker_config_file1 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker1.dat’ SCOPE=BOTH;
alter system set dg_broker_config_file2 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker2.dat’ SCOPE=BOTH;
alter system reset log_archive_dest_1 scope=spfile sid=’*’;
alter system reset log_archive_dest_2 scope=spfile sid=’*’;
alter system set dg_broker_start=true scope=both;
alter database flashback on;
alter system set db_flashback_retention_target = 4320 scope=both;shutdown immediate;
startup
备库设置与FSFO相关参数,并重启数据库 standby instance host alter system set dg_broker_config_file1 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker1.dat’ SCOPE=BOTH;
alter system set dg_broker_config_file2 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker2.dat’ SCOPE=BOTH;
alter system reset log_archive_dest_1 scope=spfile sid=’*’;
alter system reset log_archive_dest_2 scope=spfile sid=’*’;
alter system set dg_broker_start=true scope=both;
alter database recover managed standby database cancel;
alter database flashback on;
alter system set db_flashback_retention_target = 4320 scope=both;
alter database recover managed standby database using current logfile parallel 8 disconnect;shutdown immediate;
startup
alter database recover managed standby database using current log file disconnect;
登录dgmgrl控制台,开始准备配置策略 primary instance host dgmgrl sys/dji123@myappn
配置broker策略 primary instance host DGMGRL>create configuration ‘HASolution’ as  primary database is ‘myappn’  connect identifier is ‘myappn’;
DGMGRL>add database ‘myappn_dg’ as connect   identifier is ‘myappn_dg’ ;
DGMGRL>edit database myappn set property ‘LogXptMode’=’ASYNC’;
DGMGRL>edit database myappn set property NetTimeout=60;
DGMGRL>edit database myappn_dg set property ‘LogXptMode’=’ASYNC’;
DGMGRL>edit database myappn_dg set property NetTimeout=60
DGMGRL>enable configuration;
在broker测试切换是否成功 primary instance host DGMGRL>switchover to ‘myappn_dg’;
切换原主备关系 primary instance host DGMGRL>switchover to ‘myappn’;
启动FSFSO primary instance host DGMGRL>enable fast_start failover;
配置obsever observer host dgmgrl sys/dji123@OBSRV_myappn_DGMGRL

DGMGRL>edit database myappn    set property ObserverConnectIdentifier=’OBSRV_myappn_DGMGRL’;
DGMGRL>edit database myappn_dg set property ObserverConnectIdentifier=’OBSRV_myappn_DG_DGMGRL’;

启动obsever observer host cd /u01/PRD/oracle/12.1.0.2/myappn/network/dg_observ

nohup dgmgrl -silent sys/dji123@OBSRV_myappn_DGMGRL “start observer” &

创建TAF应用连接服务 primary instance host exec dbms_service.create_service(‘prim_db’,’prim_db’);
exec dbms_service.create_service(‘stby_db’,’stby_db’);
exec dbms_service.start_service(‘prim_db’);

CREATE OR REPLACE TRIGGER startDgServices after startup on database
DECLARE
db_role VARCHAR(30);
db_open_mode VARCHAR(30);
BEGIN
SELECT DATABASE_ROLE, OPEN_MODE INTO db_role, db_open_mode FROM V$DATABASE;
IF db_role = ‘PRIMARY’ THEN
DBMS_SERVICE.START_SERVICE(‘prim_db’);
DBMS_SERVICE.STOP_SERVICE(‘stby_db’);
END IF;
IF db_role = ‘PHYSICAL STANDBY’ AND db_open_mode LIKE ‘READ ONLY%’ THEN
DBMS_SERVICE.START_SERVICE(‘stby_db’);
END IF;
END;
/

注:由FSFO自动主备切换的,备库不会完全shutdown,所以stby_db的服务不会停止,导致在原备库出现prim_db和stby_db两个服务的出现。因此在startup trigger中加入了stop stby_db service的命令。

但是却导致了,在正常重启的时候,由于不存在stby_db服务,因此alertlog中会报错:
—– Error Stack Dump —–
ORA-00604: error occurred at recursive SQL level 1
ORA-44311: service stby_db not running
ORA-06512: at “SYS.DBMS_SYS_ERROR”, line 86
ORA-06512: at “SYS.DBMS_SERVICE_ERR”, line 40
ORA-06512: at “SYS.DBMS_SERVICE”, line 421
ORA-06512: at line 8

可以忽略该报错。

 

(六)FSFO测试场景

描述 测试动作 预期结果 实际结果/备注
计划外主库实例挂掉 kill -9 pid_of_smon
或者
SQL > shutdown abort
Primary instance crash,standby instance进行切换。

切换后,原standby instance成primary instance.

在dgmgrl控制台可以看到ORA-16829: fast-start failover configuration is lagging,且原primary instance可以看到ORA-16661: the standby database needs to be reinstated

2分钟左右完成切换
计划外备库实例挂掉 kill -9 pid_of_smon
或者
SQL > shutdown abort
dgmgrl控制台主库报错ORA-16810: multiple errors or warnings detected for the database,备库报错ORA-01034: ORACLE not available 备库挂掉,应用不受影响
计划内关闭主库 SQL > shutdown immediate 主备库不切换,在dgmgrl控制台可以看到主库报错:Error: ORA-01034: ORACLE not available,备库提示信息:(*) Physical standby database 计划内维护
计划内关闭备库 SQL > shutdown immediate 主备库不切换,在dgmgrl控制台可以看到主库报错:ORA-16829: fast-start failover configuration is lagging,备库提示信息:(*) Physical standby database (disabled) 计划内维护
重新启动失败数据库实例 SQL > startup 启动原primary库之后,自动变成physical standby角色。在完成Clearing online log之后,进行mrp的介质恢复。

在dgmgrl控制台,看到原primary状态从ORA-16661: the standby database needs to be reinstated,变成ORA-16657: reinstatement of database in progress,继而变成ORA-16829: fast-start failover configuration is lagging,待lag补齐之后最终变成Physical standby database

重启的原primary库恢复正常大约5~10分钟,重启期间应用完全不受影响。
主库监听挂掉 kill -9 pid_of_listener 应用无法连接。

主库和备库之间如果没有lag,处于standby redolog的recovery,则不受影响。

主库和备库之间如果有lag,则当备库启动时,需要使用archivelog的recovery,就会报错
ORA-01196: file n is inconsistent due to a failed media recovery session。Dgmgrl控制台报错备库 ORA-16664: unable to receive the result from a database

监听挂掉,应用无法连接数据库,需要手工启动监听。

注:已经添加自动重启监听脚本,发现监听宕掉,自动拉起监听。

备库监听挂掉 kill -9 pid_of_listener 应用正常连接。

主库和备库之间如果没有lag,处于standby redolog的recovery,则不受影响。

主库和备库之间如果有lag,dgmgrl控制台主库报错:ORA-16810: multiple errors or warnings detected for the database,备库报错:ORA-12541: TNS:no listener
备库启动时,需要使用archivelog的recovery,就会报错
ORA-01196: file n is inconsistent due to a failed media recovery session。Dgmgrl控制台报错备库 ORA-16664: unable to receive the result from a database

应用连接正常,需要手工启动备库监听

注:已经添加自动重启监听脚本,发现监听宕掉,自动拉起监听。

计划内切换主备库 DGMGRL> switchover to ‘myappn_dg’; 在切换期间,应用连接报错ORA-16456: switchover to standby in progress or completed。

切换后,应用恢复正常。

大约2分钟左右完成切换

 

(七)备份和其他

需求项 操作环境 需求细节内容 说明
数据库备份配置 primary instance host/standby instance host 每天1天进行数据库备库,具体脚本内容见右。 00 01 * * * /home/oracle/dba_script/backup.sh
listener watchlog primary instance host/standby instance host 监控listener进程,每分钟检查一次,如果发现宕掉,自动拉起listener。 * * * * * /home/oracle/dba_script/watchdog_lnsr.sh

 

(八)应用用户初始化:

需求项 操作环境 需求细节内容 说明
创建读写应用用户 primary instance host create user appuser_rw identified by “password1”;
创建只读应用用户 primary instance host create user appuser_ro identified by “password2”;
授权读写用户基本权限 primary instance host grant connect,resource,unlimited tablespace to appuser_rw;
授权只写用户基本权限 primary instance host grant create session to appuser_ro;
创建读写角色 primary instance host create role app_role_rw;
创建只写角色 primary instance host create role app_role_ro;
将除resource之外的权限给读写角色 primary instance host grant
CREATE JOB,
CREATE MATERIALIZED VIEW,
CREATE PUBLIC DATABASE LINK,
CREATE PUBLIC SYNONYM,
CREATE VIEW,
ALTER SESSION
to app_role_rw;
将select any table给只读用户 primary instance host grant SELECT ANY TABLE to app_role_ro;
将读写角色给读写用户 primary instance host grant app_role_rw to appuser_rw;
将只读角色给只读用户 primary instance host grant app_role_ro to appuser_ro;
将用户密码过期,要求用户第一次登录的时候自己修改密码。 primary instance host alter user appuser_rw password expire;
alter user appuser_ro password expire;

======== END ========



如果觉得文章好,欢迎打赏:
pay

相关文章

发表评论

电子邮件地址不会被公开。 必填项已用*标注