OGG Study Notes 03: Handling Simple Failures in One-Way Replication
Environment: see:
Purpose: understand the basic approach to handling simple OGG failures.

1. Symptoms
Symptom: after the Extract and Data Pump processes on the OGG source were started, both were found to have abended some time later. Restarting them does not help; each one abends again shortly after it starts, as the session shows:

GGSCI (oradb30) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     LPJY1       00:00:00      47:39:54
EXTRACT     ABENDED     LXJY1       00:00:00      47:40:00

GGSCI (oradb30) 2> start extract lxjy1

Sending START request to MANAGER ...
EXTRACT LXJY1 starting

GGSCI (oradb30) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     LPJY1       00:00:00      47:40:50
EXTRACT     RUNNING     LXJY1       00:00:00      47:40:55

GGSCI (oradb30) 4> start extract lpjy1

Sending START request to MANAGER ...
EXTRACT LPJY1 starting

GGSCI (oradb30) 5> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     LPJY1       00:00:00      47:40:58
EXTRACT     RUNNING     LXJY1       00:00:00      47:41:04

GGSCI (oradb30) 6> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     LPJY1       00:00:00      47:41:15
EXTRACT     RUNNING     LXJY1       00:00:00      47:41:21

GGSCI (oradb30) 7> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     LPJY1       00:00:00      47:41:19
EXTRACT     RUNNING     LXJY1       00:00:00      47:41:25

GGSCI (oradb30) 8> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     ABENDED     LPJY1       00:00:00      47:41:41
EXTRACT     ABENDED     LXJY1       00:00:00      47:41:47
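When `info all` has to be checked repeatedly like this, the abended groups can be picked out with a small filter. The following is only a sketch: `list_abended` is a hypothetical helper name, and it assumes the column layout shown above (program, status, group).

```shell
#!/bin/sh
# list_abended: read `info all` output on stdin and print
# "<program> <group>" for every process whose status is ABENDED.
# Hypothetical helper; assumes status is column 2 and group is column 3.
list_abended() {
    awk '$2 == "ABENDED" { print $1, $3 }'
}

# Possible usage against a live instance (assumes ggsci accepts
# commands on standard input):
#   echo "info all" | "$GG_HOME/ggsci" | list_abended
```

An empty result means nothing is in the ABENDED state at that moment; it does not prove the processes will stay up, as the session above shows.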
2. Checking the Logs
Check the OGG log, ggserr.log, to find out why the processes abended.

[ogg@oradb30 ogg]$ cd $GG_HOME
[ogg@oradb30 ogg]$ tail -200f ggserr.log

The log shows that the Data Pump process lpjy1 abended because it could not connect to the target OGG, and that the Extract process lxjy1 abended because it could not find the archived log for sequence 160, thread 1.

2017-01-19 14:51:46 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, lpjy1.prm: EXTRACT LPJY1 started.
2017-01-19 14:51:49 ERROR OGG-01224 Oracle GoldenGate Capture for Oracle, lpjy1.prm: TCP/IP error 113 (No route to host).
2017-01-19 14:51:49 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, lpjy1.prm: PROCESS ABENDING.
2017-01-19 14:52:28 ERROR OGG-00446 Oracle GoldenGate Capture for Oracle, lxjy1.prm: Could not find archived log for sequence 160 thread 1 under default destinations SQL
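Scrolling through 200 lines of ggserr.log works, but the ERROR entries are what matter here. A minimal sketch of a filter that reduces the log to one "process, message code" pair per error, assuming the line layout shown above (date, time, severity, code, then a parameter file name ending in `.prm:`); `summarize_errors` is a hypothetical name.

```shell
#!/bin/sh
# summarize_errors: read ggserr.log lines on stdin and print
# "<process> <OGG code>" for every ERROR entry.
# Hypothetical helper; assumes field 3 is the severity, field 4 the
# message code, and that the process shows up as "<name>.prm:".
summarize_errors() {
    awk '$3 == "ERROR" {
        proc = "?"
        for (i = 5; i <= NF; i++) {
            if ($i ~ /\.prm:$/) {
                proc = $i
                sub(/\.prm:$/, "", proc)
                break
            }
        }
        print proc, $4
    }'
}

# Possible usage:
#   tail -200 "$GG_HOME/ggserr.log" | summarize_errors
```

The codes can then be looked up one process at a time, which keeps the two unrelated failures (network and missing archived log) from being mixed up.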
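Further investigation showed that the archived logs had been deleted by the RMAN backup policy once they had been backed up. Since a backup exists, the next step is simply to restore sequence 160, named in the log, and every later archived log from the backup sets.
This also shows that it is best to run the source database in ARCHIVELOG mode when configuring OGG. Otherwise, in a situation like this one, where the changes in the source online redo logs were not picked up in time, there would be no way to continue replicating.

3. Resolving the Problem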
For lxjy1 (the Extract), restore the archived logs from sequence 160 onward from the RMAN backup sets:

$ rman target /
RMAN> restore archivelog from logseq 160;

Then start the lxjy1 process again.
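The sequence and thread numbers can be taken straight from the OGG-00446 message instead of being typed by hand. A minimal sketch, assuming the message wording seen in ggserr.log; `restore_cmd_from_log` is a hypothetical helper, and it writes the clause as `from sequence ... thread ...`, which is my spelling of the range rather than the exact command used above.

```shell
#!/bin/sh
# restore_cmd_from_log: read ggserr.log lines on stdin and print an RMAN
# restore command for the most recent OGG-00446 "missing archived log" entry.
# Hypothetical helper; it only prints the command and does not run RMAN.
restore_cmd_from_log() {
    sed -n 's/.*OGG-00446.*archived log for sequence \([0-9][0-9]*\) thread \([0-9][0-9]*\).*/restore archivelog from sequence \1 thread \2;/p' |
        tail -n 1
}

# Possible usage: review the generated command, then paste it at the
# RMAN> prompt of `rman target /`.
#   restore_cmd_from_log < "$GG_HOME/ggserr.log"
```

Printing the command rather than executing it keeps a person in the loop, which seems safer for anything that touches backups.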
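For lpjy1 (the Data Pump), confirm that the host running the target OGG is up and reachable over the network, then start the target database and the target OGG, including its mgr and replicat processes. After that, lpjy1 can be started again on the source.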
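To know which host and port the Data Pump is trying to reach, the RMTHOST line of its parameter file can be read out. This is only a sketch: `rmthost_target` is a hypothetical helper, the parameter file contents are not shown in these notes, and the example line `RMTHOST oradb31, MGRPORT 7809` is an assumed value (the host name comes from the target prompt, the port is a guess).

```shell
#!/bin/sh
# rmthost_target: read a Data Pump parameter file on stdin and print
# "<host> <manager port>" taken from its RMTHOST line.
# Hypothetical helper; assumes the form "RMTHOST <host>, MGRPORT <port>".
rmthost_target() {
    awk 'toupper($1) == "RMTHOST" {
        gsub(/,/, " ")          # drop commas so options split on blanks
        host = $2
        port = ""
        for (i = 3; i < NF; i++) {
            if (toupper($i) == "MGRPORT") port = $(i + 1)
        }
        print host, port
    }'
}

# Possible usage (dirprm/lpjy1.prm is the assumed parameter file location):
#   rmthost_target < "$GG_HOME/dirprm/lpjy1.prm"
```

The resulting host and port can then be tested from the source with whatever reachability tool is available, for example `ping` for the host; "No route to host" usually points at the host or network being down rather than at OGG itself.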
Finally, confirm that all OGG processes on both the source and the target are RUNNING.

Source OGG:

GGSCI (oradb30) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     LPJY1       00:00:00      00:00:03
EXTRACT     RUNNING     LXJY1       00:00:00      00:00:00

Target OGG:

GGSCI (oradb31) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     RJY1        00:00:00      00:00:01
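This final check can also be expressed as a pass/fail test, which is handy if it is repeated on both hosts. A minimal sketch under the same assumption about the `info all` layout; `all_running` is a hypothetical name.

```shell
#!/bin/sh
# all_running: read `info all` output on stdin; exit 0 only if at least
# one MANAGER/EXTRACT/REPLICAT line is present and every one is RUNNING.
# Hypothetical helper; assumes the status is the second column.
all_running() {
    awk '$1 ~ /^(MANAGER|EXTRACT|REPLICAT)$/ {
             n++
             if ($2 != "RUNNING") bad++
         }
         END { exit ((n > 0 && bad == 0) ? 0 : 1) }'
}

# Possible usage (assumes ggsci accepts commands on standard input):
#   echo "info all" | "$GG_HOME/ggsci" | all_running && echo "OGG healthy"
```

Given how lpjy1 went back to ABENDED within seconds in the first section, it is worth running the check more than once before calling the problem solved.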