Oracle Dataguard Support Issues: Brian Hitchcock Ocp 10G Dba Sun Microsystems
Oracle Dataguard Support Issues: Brian Hitchcock Ocp 10G Dba Sun Microsystems
Support Issues
Brian Hitchcock
OCP 10g DBA
Sun Microsystems
brian.hitchcock@sun.com
brhora@aol.com
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 1
Oracle DataGuard
Focus on Logical Standby Support Issues
Maintains a standby database
– Archived redo logs on primary
– Sent to standby and applied
Simple idea
Many configuration options
Can become very complicated
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 2
DataGuard
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 3
DataGuard Errors
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 4
Create Physical Standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 5
Create Physical Standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 6
Convert to Logical Standby
On Primary database
– Build LogMiner dictionary
On Standby database
– Stop redo apply
Errors, no impact
– Convert database to logical standby
Two trace files
– Restart db
– Open resetlogs
– Verify logical standby working
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 7
Logical Standby is Running
Business requirements
– Standby frozen most of the day
– Standby catches up once per day
Alert log messages while catching up
Disk space for archived redo logs
Other issues
– Apply process is slow
How to detect, resolve
– Primary versus Standby backups
Impact, resolution
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 8
Logical Standby is Running
Other Issues
– Constraint violations
Errors, resolution
– No data found
Errors, resolution
– ORA-16211
Errors, Oracle Support
– Primary db XDB schema issues
Fixed on primary, errors on standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 9
Logical Standby is Running
Other issues
– ORA-07445
Refresh cures all
– Refresh process
After refresh
– ORA-16211: unsupported record found in the archived redo log
– Compile invalid objects
– Import into standby database
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 10
Primary Can’t Connect
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 11
SYS Password Issue
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 12
SYS Password Issue
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 13
DataGuard Trace File
Physical Standby
– Start log apply process
– Trace file created
– Stops when log apply process stops
See file contents later…
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 14
Can’t ‘See’ Redo Logs
Physical Standby
Creating or Refreshing standby
– Primary configured, sending redo logs
– Standby not yet created/running
Standby may not register redo logs
– Our scripts maintain primary archived redo logs
Compress to save disk space, delete after 2 days
Manually register
– Alter database register logfile ‘<logfile>’;
– DataGuard applies redo log
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 15
Can’t ‘See’ Redo Logs
BRHBETA> select * from v$archive_gap;
SEQUENCE# APP
---------- ---
1956 YES
1957 YES
1958 YES
1977 NO 1959 thru 1976 on standby
1978 NO
1979 NO
1980 NO
1981 NO
1982 NO
1983 NO
1984 NO
11 rows selected.
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 16
Redo Logs Not Deleted
Physical Standby
– After applied to standby
– Unlike logical standby
SQL apply process does delete them
Use RMAN
Possible disk space issues on standby
– How long will you need to store redo logs?
– Not an issue if converting to logical soon
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 17
Stop Physical Standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 18
Turn Off Apply Process
Physical standby
– Turn off apply process
Regular maintenance
Converting to Logical Standby
– Generates ‘error’
Why is this an error?
Typical of DataGuard
Everything seems to be an ‘error’
Even when it is perfectly routine
Makes support more difficult
When is an error something to worry about?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 19
Standby Alert Log
Tue Oct 9 16:34:34 2007
Physical Standby Database mounted.
Completed: ALTER DATABASE MOUNT
Tue Oct 9 16:34:36 2007
alter database recover managed standby database disconnect from session
Tue Oct 9 16:34:36 2007
Attempt to start background Managed Standby Recovery process (BRHBETA)
MRP0 started with pid=11, OS id=13474
Tue Oct 9 16:34:36 2007
MRP0: Background Managed Standby Recovery process started (BRHBETA)
Managed Standby Recovery not using Real Time Apply
parallel recovery started with 7 processes
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 20
Standby Alert Log
Wed Oct 10 10:15:15 2007
alter database recover managed standby database cancel
Wed Oct 10 10:15:19 2007
MRP0: Background Media Recovery cancelled with status 16037
Wed Oct 10 10:15:19 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
Wed Oct 10 10:15:20 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc:
ORA-16037: user requested cancel of managed recovery operation
Wed Oct 10 10:15:20 2007
MRP0: Background Media Recovery process shutdown (BRHBETA)
Wed Oct 10 10:15:21 2007
Managed Standby Recovery Canceled (BRHBETA)
Wed Oct 10 10:15:21 2007
Completed: alter database recover managed standby database cancel
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 21
Trace File
$ more /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc
/orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /orahome01/product/10.2.0
System name: SunOS
Node name: brh-beta1-zone04
Release: 5.10
Version: Generic_118833-36
Machine: sun4u
Instance name: BRHBETA
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 13474, image: oracle@beta1-zone04 (MRP0)
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 22
Trace File
*** 2007-10-09 16:34:41.302 1018 krsm.c
Managed Recovery: Initialization posted.
*** 2007-10-09 16:34:41.303 60639 kcrr.c
Managed Standby Recovery not using Real Time Apply
Recovery target incarnation = 2, activation ID = 0
Influx buffer limit = 27762 (50% x 55524)
Successfully allocated 7 recovery slaves
Using 158 overflow buffers per recovery slave
Start recovery at thread 1 ckpt scn 8257757517457 logseq 1956 block 5
*** 2007-10-09 16:34:42.124
Media Recovery add redo thread 1
*** 2007-10-09 16:34:42.124 1018 krsm.c Recreating redo logs
Managed Recovery: Active posted.
ORA-00367: checksum error in log file header
ORA-00305: log 1 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 1 thread 1: '/shared/oralogs01/BRHBETA/redo01a.log'
*** 2007-10-09 16:34:42.147 60639 kcrr.c
Clearing online redo logfile 1 /shared/oralogs01/BRHBETA/redo01a.log
*** 2007-10-09 16:36:15.066
*** 2007-10-09 16:36:15.066 60639 kcrr.c
Clearing online redo logfile 1 complete
ORA-00367: checksum error in log file header
ORA-00305: log 2 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 2 thread 1: '/shared/oralogs01/BRHBETA/redo02a.log'
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 23
Trace File
*** 2007-10-09 16:36:15.100 60639 kcrr.c
Clearing online redo logfile 2 /shared/oralogs01/BRHBETA/redo02a.log
*** 2007-10-09 16:37:51.473
*** 2007-10-09 16:37:51.473 60639 kcrr.c
Clearing online redo logfile 2 complete
ORA-00367: checksum error in log file header
ORA-00305: log 3 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 3 thread 1: '/shared/oradata02/BRHBETA/redo03b.log'
*** 2007-10-09 16:37:51.479 60639 kcrr.c
Clearing online redo logfile 3 /shared/oradata02/BRHBETA/redo03b.log
*** 2007-10-09 16:39:26.048
*** 2007-10-09 16:39:26.048 60639 kcrr.c
Clearing online redo logfile 3 complete
ORA-00367: checksum error in log file header
ORA-00305: log 4 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 4 thread 1: '/shared/oradata02/BRHBETA/redo04b.log'
*** 2007-10-09 16:39:26.488 60639 kcrr.c
Clearing online redo logfile 4 /shared/oradata02/BRHBETA/redo04b.log
*** 2007-10-09 16:41:00.447
*** 2007-10-09 16:41:00.447 60639 kcrr.c
Clearing online redo logfile 4 complete
*** 2007-10-09 16:41:00.469 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1956
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 24
Trace File
*** 2007-10-09 16:41:00.469 60639 kcrr.c
Applying redo logs to physical standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 27
Trace File
*** 2007-10-09 17:08:31.924
Media Recovery Log /oraarch01/BRHBETA/LOG_1965_1_629245032.arc
*** 2007-10-09 17:09:12.510
Media Recovery Log /oraarch01/BRHBETA/LOG_1966_1_629245032.arc
*** 2007-10-09 17:09:21.050
Media Recovery Log /oraarch01/BRHBETA/LOG_1967_1_629245032.arc
*** 2007-10-09 17:09:40.234
Media Recovery Log /oraarch01/BRHBETA/LOG_1968_1_629245032.arc
*** 2007-10-09 17:09:45.055
Media Recovery Log /oraarch01/BRHBETA/LOG_1969_1_629245032.arc
*** 2007-10-09 17:09:50.572
Media Recovery Log /oraarch01/BRHBETA/LOG_1970_1_629245032.arc
*** 2007-10-09 17:09:58.968
Media Recovery Log /oraarch01/BRHBETA/LOG_1971_1_629245032.arc
*** 2007-10-09 17:10:03.922
Media Recovery Log /oraarch01/BRHBETA/LOG_1972_1_629245032.arc
*** 2007-10-09 17:10:13.196
Media Recovery Log /oraarch01/BRHBETA/LOG_1973_1_629245032.arc
*** 2007-10-09 17:10:21.927
Media Recovery Log /oraarch01/BRHBETA/LOG_1974_1_629245032.arc
*** 2007-10-09 17:10:34.064
Media Recovery Log /oraarch01/BRHBETA/LOG_1975_1_629245032.arc
*** 2007-10-09 17:10:42.420 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1976
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 28
Trace File
*** 2007-10-09 17:10:42.421 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1976-1976
*** 2007-10-09 17:11:12.538
-----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that is sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
-----------------------------------------------------------
*** 2007-10-09 17:12:42.563
Media Recovery Log /oraarch01/BRHBETA/LOG_1976_1_629245032.arc
*** 2007-10-09 17:12:45.563
Media Recovery Log /oraarch01/BRHBETA/LOG_1977_1_629245032.arc
*** 2007-10-09 17:12:48.534
Media Recovery Log /oraarch01/BRHBETA/LOG_1978_1_629245032.arc
*** 2007-10-09 17:13:00.505
Media Recovery Log /oraarch01/BRHBETA/LOG_1979_1_629245032.arc
*** 2007-10-09 17:13:02.054
Media Recovery Log /oraarch01/BRHBETA/LOG_1980_1_629245032.arc
*** 2007-10-09 17:13:03.231
Media Recovery Log /oraarch01/BRHBETA/LOG_1981_1_629245032.arc
*** 2007-10-09 17:13:03.902
Media Recovery Log /oraarch01/BRHBETA/LOG_1982_1_629245032.arc
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 29
Trace File
*** 2007-10-09 17:13:04.492
Media Recovery Log /oraarch01/BRHBETA/LOG_1983_1_629245032.arc
*** 2007-10-09 17:13:08.171
Media Recovery Log /oraarch01/BRHBETA/LOG_1984_1_629245032.arc
*** 2007-10-09 17:13:26.860
*** 2007-10-09 17:13:26.860 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1985
*** 2007-10-09 17:16:07.172
Media Recovery Log /oraarch01/BRHBETA/LOG_1985_1_629245032.arc
*** 2007-10-09 17:16:08.067
Media Recovery Log /oraarch01/BRHBETA/LOG_1986_1_629245032.arc
*** 2007-10-09 17:16:08.131
Media Recovery Log /oraarch01/BRHBETA/LOG_1987_1_629245032.arc
*** 2007-10-09 17:16:08.195 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1988
*** 2007-10-09 17:16:13.202
Media Recovery Log /oraarch01/BRHBETA/LOG_1988_1_629245032.arc
*** 2007-10-09 17:16:13.268 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1989
*** 2007-10-09 21:14:01.119
Media Recovery Log /oraarch01/BRHBETA/LOG_1989_1_629245032.arc
*** 2007-10-09 21:14:16.922
*** 2007-10-09 21:14:16.922 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1990
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 30
Trace File
*** 2007-10-10 09:32:33.399
*** 2007-10-10 09:32:33.399 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1990-1990
*** 2007-10-10 09:33:05.187
Media Recovery Log /oraarch01/BRHBETA/LOG_1990_1_629245032.arc
*** 2007-10-10 09:33:22.505
Media Recovery Log /oraarch01/BRHBETA/LOG_1991_1_629245032.arc
*** 2007-10-10 09:33:22.570
Media Recovery Log /oraarch01/BRHBETA/LOG_1992_1_629245032.arc
*** 2007-10-10 09:33:22.631
Media Recovery Log /oraarch01/BRHBETA/LOG_1993_1_629245032.arc
*** 2007-10-10 09:33:22.693
Media Recovery Log /oraarch01/BRHBETA/LOG_1994_1_629245032.arc
*** 2007-10-10 09:33:22.761
Media Recovery Log /oraarch01/BRHBETA/LOG_1995_1_629245032.arc
*** 2007-10-10 09:33:22.807
Media Recovery Log /oraarch01/BRHBETA/LOG_1996_1_629245032.arc
*** 2007-10-10 09:33:22.864
Media Recovery Log /oraarch01/BRHBETA/LOG_1997_1_629245032.arc
*** 2007-10-10 09:33:22.918
Media Recovery Log /oraarch01/BRHBETA/LOG_1998_1_629245032.arc
*** 2007-10-10 09:33:23.199
Media Recovery Log /oraarch01/BRHBETA/LOG_1999_1_629245032.arc
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 31
Trace File
*** 2007-10-10 09:33:23.255 60639 kcrr.c
Stop Log Apply Process
Ready to convert to Logical Standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 33
Convert to Logical Standby
SQL Apply Process
– When applying redo logs
– Generates 2 trace files
– What are they?
Trace files
– One shows start of kcrrwkx
– Second shows end of kcrrwkx
– What are these for?
– Neither show up in alert log
– Both continue as long as SQL apply process runs
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 34
First Trace File
/orahome01/admin/BRHBETA/bdump $ more brhbeta_arc0_13168.trc
/orahome01/admin/BRHBETA/bdump/brhbeta_arc0_13168.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /orahome01/product/10.2.0
System name: SunOS
Node name: brh-beta1-zone04
Release: 5.10
Version: Generic_118833-36
Machine: sun4u
Instance name: BRHBETA
Redo thread mounted by this instance: 1
Oracle process number: 24
Unix process pid: 13168, image: oracle@beta1-zone04 (ARC0)
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 37
Normal Operation
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 38
Standby Catching Up
Tue Oct 16 15:13:22 2007
Completed: ALTER DATABASE STOP LOGICAL STANDBY APPLY Stop SQL Apply process
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 41
Standby Catching Up
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2056_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2056_1_629245032.arc
Deleting redo logs
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2057_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2057_1_629245032.arc
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2058_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2058_1_629245032.arc
...
…
…
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 42
Standby Catching Up
Tue Oct 16 18:44:15 2007 Standby is at 2087
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 43
Standby Catching Up
Unsupported DDL – Standby doesn’t execute
Tue Oct 16 19:30:58 2007
LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19144.1192413665' FROM SPFILE =…
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0003.02d.00013e70, hSCN 0x0782.a9c2fdb8, lSCN 0x0782.a9c2fdb8, Thread 1, RBA …
LOGSTDBY stmt: create pfile='/orahome01/oradba/tmp/ora_adm_sqlbt_bkp.tmp1.17449.BRHBETA' from spfile …
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x000b.001.000126cf, hSCN 0x0782.a9c2fe15, lSCN 0x0782.a9c2fe15, Thread 1, RBA…
LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19695.1192413687' FROM SPFILE = …
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0003.00c.00013e62, hSCN 0x0782.a9c2fe4a, lSCN 0x0782.a9c2fe4a, Thread 1, RBA…
LOGSTDBY stmt: ALTER DATABASE BACKUP CONTROLFILE TO '/tmp/datatools/dtodump_…
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0009.007.00011453, hSCN 0x0782.a9c2feb4, lSCN 0x0782.a9c2feb4, Thread 1, RBA…
Tue Oct 16 19:30:58 2007
ALTER TABLESPACE "SYSTEM" BEGIN BACKUP
Completed: ALTER TABLESPACE "SYSTEM" BEGIN BACKUP
Tue Oct 16 19:30:58 2007
ALTER TABLESPACE "SYSTEM" END BACKUP
Completed: ALTER TABLESPACE "SYSTEM" END BACKUP
...
…
…
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 44
Standby Catching Up
Standby catches up at 2158
Tue Oct 16 21:29:19 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc
Tue Oct 16 21:30:03 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc
Tue Oct 16 21:35:52 2007
Incremental checkpoint up to RBA [0xa.f41b7.0], current log tail at RBA [0xa.f41cc.0]
Tue Oct 16 21:55:56 2007
Incremental checkpoint up to RBA [0xa.f43b5.0], current log tail at RBA [0xa.f43b5.0]
Tue Oct 16 22:11:16 2007
RFS[1]: No standby redo logfiles created
RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2158_1_629245032.arc'
Tue Oct 16 22:11:16 2007
RFS LogMiner: Registered logfile [/oraarch01/BRHBETA/LOG_2158_1_629245032.arc] to LogMiner session id [1]
Tue Oct 16 22:11:16 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc
Tue Oct 16 22:11:20 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 45
Archived Redo Logs
Logical Standby
– After applied to standby
– SQL apply process does delete them
Unlike physical standby
Possible disk space issues on standby
– How long will you need to store redo logs?
– If standby frozen all day
Weekends? Holidays?
– If standby fails
How many days to fix failures?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 46
Archived Redo Logs
How long are redo logs available on primary?
– If not on disk when needed for standby
– Recover from backup
– Dataguard may not see these redo logs
Register redo logs
Logical standby
– Also generates its own archived redo logs
– Needed to recover standby db
Unique standby db objects?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 47
SQL Apply Process Slow
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 48
Long Running Transaction
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 49
Long Running Transaction
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 50
Long Running Transaction
alter session set nls_date_format = 'DD-Mon-YYYY hh24:mi:ss';
column first_change# format 99999999999999999999
column next_change# format 99999999999999999999
column resetlogs_change# format 99999999999999999999
select * from dba_logstdby_log;
TYPE
------------------------------
STATUS
--------------------------------------------------------------------------------------------------------------
HIGH_SCN
---------------------
COORDINATOR
ORA-16116: no work available
8257767540953
READER
ORA-16127: stalled waiting for additional transactions to be applied
8257767541085
BUILDER
ORA-16127: stalled waiting for additional transactions to be applied
8257767540965
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 51
Long Running Transaction
PREPARER
ORA-16127: stalled waiting for additional transactions to be applied
8257767540965
ANALYZER
ORA-16117: processing
8257767540953
APPLIER
ORA-16116: no work available
8257767539467
APPLIER
ORA-16116: no work available
8257767512259
APPLIER
ORA-16113: applying change to table or sequence "BRH"."XXSUN_BRH_COMPS_INT"
8257767539247
APPLIER
ORA-16116: no work available
8257767512262
9 rows selected.
BRHBETA>
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 52
Long Running Transaction
BRHBETA> select * from dba_logstdby_log;
THREAD# RESETLOGS_CHANGE# RESETLOGS_ID SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- --------------------- ------------ ---------- --------------------- --------------------- -------------------------- ----------
FILE_NAME
TIMESTAMP DIC DIC APPLIED
1 8257200902826 629245032 2048 8257767447753 8257767534297 12-Oct-2007 22:41:05 12-Oct-2007 23:18:23
/oraarch01/BRHBETA/LOG_2048_1_629245032.arc
12-Oct-2007 22:19:08 NO NO CURRENT
93 rows selected.
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 53
Long Running Transaction
redo log 2049 goes from
– SCN 8257767534297 to SCN 8257767754044
Check again -- Tue Oct 16 15:04:29 MST 2007
APPLIER
ORA-16113: applying change to table or sequence "BRH"."XXSUN_BRH_COMPS_INT"
8257767540857
Compute Estimate
– Tue Oct 16 08:09:55 -- Tue Oct 16 11:17:39
– APPLIER has moved from 39247 to 39991
– 3 hours --> roughly 750 SCNs, 250 per hour
– it still needs to go from 539991 to 754044
over 200,000 SCNs -- at 250 per hour,
– this would take 800 hours --> 33 days
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 54
Long Running Transaction
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 55
Long Running Transaction
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 56
Long Running Transaction
Skip table
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
EXECUTE DBMS_LOGSTDBY.SKIP -
(stmt => 'DML' , -
schema_name => 'BRH' , -
object_name => 'XXSUN_BRH_COMPS_INT', -
proc_name => null);
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 57
Primary/Standby Interactions
Logical standby backup starts
– Tablespaces put into backup mode
Apply process applies redo logs from primary
– Contain transactions for primary backup
– Tries to put tablespaces into backup mode
Apply process fails
– Wait for standby backup to finish
– Restart apply process
Disable standby backups when catching up
– Apply process runs longer than normal
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 58
Unique Constraint Violation
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 59
Unique Constraint Violation
Tue Oct 16 21:23:42 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc
...
...
...
Tue Oct 16 21:24:31 2007
LOGSTDBY stmt: insert into "APPLSYS"."WF_LOCAL_ROLES"
values
“COL1" = 'Value1',
“COL2" = 'Value2',
“COL3" IS NULL,
…
LOGSTDBY status: ORA-00001: unique constraint (APPLSYS.WF_LOCAL_ROLES_U1) violated
LOGSTDBY id: XID 0x0009.016.00011548, hSCN 0x0782.aa32b533, lSCN 0x0782.aa32b533, Thread 1, RBA…
Tue Oct 16 21:25:20 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc
SQL Apply Process continues processing
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 60
No Data Found
What does it mean?
– When DataGuard updates standby
Brings update from primary
Brings pre-update data from primary
– On standby, DataGuard compares
Pre-update data from primary
Current data on standby
– If they don’t agree
DataGuard won’t apply the update on standby
SQL apply process fails
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 61
No Data Found
Wed Sep 19 12:09:23 2007
LOGSTDBY stmt: update "PO"."PO_LINE_LOCATIONS_ALL"
...
...SQL, values
...
LOGSTDBY status: ORA-01403: no data found
LOGSTDBY id: XID 0x0008.01e.0000c437, hSCN 0x0789.eacde6c1, lSCN 0x0789.eacde6c1…
LOGSTDBY Apply process P007 pid=29 OS id=3447 stopped
Wed Sep 19 12:09:23 2007
Errors in file /shared/orahome01/admin/BRHPRSB/bdump/brhprsb_lsp0_12386.trc:
ORA-12801: error signaled in parallel query server P004
ORA-01403: no data found
LOGSTDBY Analyzer process P003 pid=24 OS id=3439 stopped
LOGSTDBY Apply process P006 pid=27 OS id=3445 stopped
LOGSTDBY Apply process P005 pid=26 OS id=3443 stopped
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 62
No Data Found
What happened?
– For some reason
– Table data not the same primary vs standby
How could this happen?
– Logical standby is read-write
SYS can change anything at any time
– SYS left guard status at NONE
Other db users can make changes in standby
How to fix?
– Skip table
– Refresh logical standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 63
No Data Found
Logical Standby
– No way to find out what happened
– No utility to verify primary, standby in synch
– Differences can exist for a long time
Won’t cause error until table updated on primary
Logical Standby for reporting?
– Can you depend on this for your reports?
– How do you know what is in the standby?
– What has been skipped?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 64
Primary Schema Issues
Primary db
– XDB schema reinstalled
Create java class (loads java class from
filesystem)
Standby db
– Transactions came through to standby
– Standby doesn’t have java class files
– Apply process fails
– Identify and skip transaction(s)
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 65
ORA-07445 Errors
SR opened
– Results
Known bug fixed in 11g
Apply patch on standby
– Impact
None, no affect on standby
– Apply patch?
No – refresh would wipe out patch
Don’t want to patch primary db
Primary doesn’t have this error
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 66
ORA-07445 Errors
SQL Apply Process stops
Tue Oct 16 21:27:50 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_p004_6577.trc:
ORA-07445: exception encountered: core dump [krvsmso()+1212] [SIGSEGV] [Address not mapped to object]…
Tue Oct 16 21:29:06 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_5041.trc:
ORA-12805: parallel query server died unexpectedly
Tue Oct 16 21:29:06 2007
TLCR process death detected. Shutting down TLCR Logical Standby is not for the faint of heart!
logminer process death detected, exiting logical standby
LOGSTDBY Analyzer process P003 pid=13 OS id=10051 stopped
LOGSTDBY Apply process P005 pid=15 OS id=10060 stopped
LOGSTDBY Apply process P006 pid=42 OS id=10062 stopped
LOGSTDBY Apply process P007 pid=17 OS id=10064 stopped
Tue Oct 16 21:29:06 2007
LOGSTDBY status: ORA-16222: automatic Logical Standby retry of last action
LOGSTDBY status: ORA-16111: log mining and apply setting up
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 67
ORA-07445 Errors
Tue Oct 16 21:29:07 2007
LOGMINER: Parameters summary for session# = 1
LOGMINER: Number of processes = 3, Transaction Chunk Size = 201
LOGMINER: Memory Size = 30M, Checkpoint interval = 150M
LOGMINER: session# = 1, builder process P001 started with pid=7 OS id=10018
LOGMINER: session# = 1, reader process P000 started with pid=34 OS id=10014
LOGMINER: session# = 1, preparer process P002 started with pid=36 OS id=10020
Tue Oct 16 21:29:10 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc
Tue Oct 16 21:29:10 2007 SQL Apply Process continues processing
LOGMINER: Turning ON Log Auto Delete
LOGSTDBY Analyzer process P003 started with pid=13 OS id=10051
LOGSTDBY Apply process P006 started with pid=42 OS id=10062
LOGSTDBY Apply process P004 started with pid=30 OS id=10219
LOGSTDBY Apply process P005 started with pid=15 OS id=10060
LOGSTDBY Apply process P007 started with pid=17 OS id=10064
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 68
Refresh Process
Export unique standby db objects
– Scripts to recreate
Backup primary db
– Create standby control file
Recover primary db backup on standby
– Use standby control file
Create physical standby
Convert to logical standby
Import unique standby db objects
– Recreate with scripts
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 69
Unsupported Record
ORA-16211
– SQL apply process fails
Must skip table or refresh standby
Oracle SR tells me to
– Add all column supplemental log group to table
– Rebuild standby
Or reinstantiate the table
– Needed for each table
Not an easy process
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 70
Unsupported Record
Thu Oct 11 10:11:58 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2005_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2005_1_629245032.arc
Thu Oct 11 10:15:55 2007
** LOGMINER WARNING - Invalidated 4 LCRs **
Thu Oct 11 10:20:29 2007
LOGSTDBY stmt: "BRH"."XXSUN_INV_ITEMS_INT": unsupported
LOGSTDBY status: ORA-16211: unsupported record found in the archived redo log
ORA-06512: at "SYS.DBMS_INTERNAL_LOGSTDBY", line 4717
ORA-06512: at line 1
LOGSTDBY id: XID 0x0009.02e.0001127d, hSCN 0x0782.a9016545, lSCN 0x0782.a9016545, Thread 1…
LOGSTDBY Apply process P007 pid=23 OS id=16578 stopped
Thu Oct 11 10:20:29 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_13625.trc:
ORA-12801: error signaled in parallel query server P007
ORA-16211: unsupported record found in the archived redo log
LOGSTDBY Analyzer process P003 pid=19 OS id=16570 stopped
LOGSTDBY Apply process P005 pid=21 OS id=16574 stopped
LOGSTDBY Apply process P006 pid=36 OS id=16576 stopped
LOGSTDBY Apply process P004 pid=34 OS id=16572 stopped
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 71
Unsupported Record
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 72
Compile Invalid Objects
In Logical Standby
– Execute utlrp.sql
– 2 hours go by
– Not much changed
Disable Guard for session
– Alter session disable guard
alter database guard standby;
– Recompile runs in 2 minutes
– Alter session enable guard
When normal things don’t work
– Perhaps guard enabled is the problem
Guard level is the problem (all vs standby)
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 73
Import Into Logical Standby
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 74
Conclusion
Logical standby
– Lots of errors
Many require refreshing standby
Lots of DBA support needed
– For all of this support
What do you have?
Do you know what is in the standby
Reporting?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 75
Conclusion
Physical standby
– Is solid, dependable
– No issues
Logical standby
– Is it really a standby?
– Is it ready for failover?
– Is it providing complete data for reports?
– Lots of issues
– Is it worth the effort/risk?
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 76