记录一次database upgrade 导致physical standby故障

upgrade from 10.2.0.5->11.2.0.3

单节点的database升级很容易,严格按照手册,修改compatible=10.2.0

参考文档 Complete Checklist for Manual Upgrades to 11gR2 [ID 837570.1]

如果带有standby的系统升级要注意以下几点:

1.首先stop 10g listener,使用 11g listener

2.使用11g software打开standby database

3,开启日志apply

由于没有仔细看文档,升级primary的时候 没有打开standby 备库处于shutdown 状态,想primary升级完成后开启sql apply,报如下错误

Errors in file /data/oracle/diag/rdbms/paystd/paystd/trace/paystd_pr00_17447.trc (incident=40247):
ORA-00353: log corruption near block 6144 change 2561976 time 02/08/2012 00:00:15
ORA-00334: archived log: ‘/data/oracle/oradata/paystd/arch/1_78_770564180.dbf’
Incident details in: /data/oracle/diag/rdbms/paystd/paystd/incident/incdir_40247/paystd_pr00_17447_i40247.trc
Completed: alter database recover managed standby database disconnect from session using current logfile
MRP0: Background Media Recovery terminated with error 354
Errors in file /data/oracle/diag/rdbms/paystd/paystd/trace/paystd_pr00_17447.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 6144 change 2561976 time 02/08/2012 00:00:15
ORA-00334: archived log: ‘/data/oracle/oradata/paystd/arch/1_78_770564180.dbf’
Managed Standby Recovery not using Real Time Apply
Wed Feb 08 09:53:02 2012
Sweep [inc][40247]: completed
Wed Feb 08 09:53:02 2012
Sweep [inc2][40247]: completed

[oracle@db57 ~]$ vi /data/oracle/diag/rdbms/paystd/paystd/incident/incdir_40247/paystd_pr00_17447_i40247.trc

*** 2012-02-08 09:53:01.725
*** SESSION ID:(765.15) 2012-02-08 09:53:01.725
*** CLIENT ID:() 2012-02-08 09:53:01.725
*** SERVICE NAME:() 2012-02-08 09:53:01.725
*** MODULE NAME:() 2012-02-08 09:53:01.725
*** ACTION NAME:() 2012-02-08 09:53:01.725

Dump continued from file: /data/oracle/diag/rdbms/paystd/paystd/trace/paystd_pr00_17447.trc
ORA-00353: log corruption near block 6144 change 2561976 time 02/08/2012 00:00:15
ORA-00334: archived log: ‘/data/oracle/oradata/paystd/arch/1_78_770564180.dbf’

========= Dump for incident 40247 (ORA 353 [6144] [2561976]) ========

*** 2012-02-08 09:53:01.725
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
—– SQL Statement (None) —–
Current SQL information unavailable – no cursor.

—– Call Stack Trace —–

可以看到standby 无法读取 block 6144 的内容 而此时的情况是 如果直接使用real-time-apply 则报错

临时解决方案:

重建primary端的redo log:

使用循环 drop inactive group 之后 standby 恢复正常:


alter database recover managed standby database disconnect using current logfile
Attempt to start background Managed Standby Recovery process (paystd)
Wed Feb 08 10:25:46 2012
MRP0 started with pid=26, OS id=18107
MRP0: Background Managed Standby Recovery process started (paystd)
started logmerger process
Wed Feb 08 10:25:51 2012
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 12 slaves
Waiting for all non-current ORLs to be archived…
All non-current ORLs have been archived.
Media Recovery Log /data/oracle/oradata/paystd/arch/1_95_770564180.dbf
Media Recovery Waiting for thread 1 sequence 96 (in transit)
Recovery of Online Redo Log: Thread 1 Group 6 Seq 96 Reading mem 0
Mem# 0: /data/oracle/oradata/paystd/std2.log
Completed: alter database recover managed standby database disconnect using current logfile
Wed Feb 08 19:45:46 2012
RFS[1]: Selected log 5 for thread 1 sequence 97 dbid 1905896596 branch 770564180
Wed Feb 08 19:45:46 2012
Media Recovery Waiting for thread 1 sequence 97 (in transit)
Recovery of Online Redo Log: Thread 1 Group 5 Seq 97 Reading mem 0
Mem# 0: /data/oracle/oradata/paystd/std1.log
Wed Feb 08 19:45:46 2012
Archived Log entry 24 added for thread 1 sequence 96 ID 0x7198f794 dest 1:

没有按照文档做导致的匪夷所思的错误,至于standby 为何无法直接读取primary 端的redo内容,有待后续查证