DSDB Data Guard Reference
1. Why Data Guard
o Complete Data Protection, High Availability, and Disaster Recovery
o Balance data availability against performance using 3 data protection
mode:
Maximum availability
Maximum protection
Maximum performance
o Centralize and simplify management
o Safeguard against physical corruptions
2. Data Guard Architecture
o Primary database
o Standby database
o Log transport service
o Log apply service
o Gap detection and resolution
o Role management services
3. Log Transport Services
o LGWR. Log writer can write to the online redo log, archive log and
and redo log for standby database, which can be local or remote.
o ARCn. Archiver can write to archive log and log for standby database.
o FAL (fetch archive log) client
o FAL server
4. Log Apply Services
o RFS (remote file server): receives redo logs from the primary database
either in the form of archived redo logs or standby redo logs.
o Archiver (ARCn): If standby redo logs are being used, the ARCn
process archives the standby redo logs that are to be applied by the
managed recovery process (MRP).
o MRP (managed recovery process): applied information from the
archived redo logs or standby redo logs to the standby database.
5. Gap Detection and Resolution
o Automatic Gap Resolution
As the redo logs being sent to the standby log compared with
the last one received by the RFS process
In 9.2, arch process at primary site will poll the logs sequence
on the standby site.
o Fal Gap Resolution
As the RFS process on the standby receives an archived log, it
updates the standby controlfile with info the log file. One the
MRP sees the update to the controlfile, it will recover the file.
If the MRP process finds the archived log is missing or corrupt,
FAL is called to resolved the gap or obtain a new copy.
FAL_SERVER and FAL_CLIENT
o Manually Resolving Gap
Find the missing logs.
Manually copy over the log files.
Register the log file to the standby database. (alter database
register log file ‘file_name’;
Restart the MRP.
6. Data Guard Configuration Setup
o The data guard setup is almost the same as the old standby database:
restore the database from primary to standby, generate the standby
controlfile from primary db and copy it into standby controlfile. There
are some data guard related init parameters need to be set up.
o Sample primary database init parameters
log_archive_dest_1=
"location=/fcapracwtc1/oracle/admin/fcapp1/arch/fcapp1"
log_archive_dest_2=
'service="(description=(address=(protocol=tcp)
(host=fcapdbmmk1)(port=1521))
(connect_data=(service_name=fcapdr1)))" optional lgwr
async affirm'
log_archive_dest_state_1= enable
log_archive_dest_state_2= enable
remote_archive_enable = true
local_listener= '(ADDRESS=(PROTOCOL=tcp)
(HOST=fcapracwtc1)(PORT=1521))'
standby_file_management= auto
o Sample standby database init parameters
log_archive_dest_1 =
('location=/fcapdbmmk1/oracle/admin/fcapdr1/arch/fcapdr1
')
log_archive_dest_state_1 = enable
log_archive_format = %r_%t_%s.log
log_archive_start = true
standby_archive_dest =
"/fcapdbmmk1/oracle/admin/fcapdr1/arch/fcapdr1"
standby_file_management = auto
remote_archive_enable = true
log_archive_trace = 127
fal_server = fcapp1,fcapp2
fal_client = fcapdr1
o After the init parameter file is ready, mount the database in standby
mode and add standby redo log files if necessary. The standby redo log
files should have the same number and size as primary redo log files.
o Put the standby database in managed recovery mode.
Alter database recover managed standby database disconnect
o Put the primary database in maximum performance mode and set force
logging.
alter database set standby database to maximize availability;
alter database force logging
7. Data guard switchover procedure
The switchover is a graceful way to change the role of the primary database
and data guard database. It will not lose any committed transaction during the
switch and can be used in controlled DR scenario.
o ON THE PRIMARY SITE:
Quiesce the database and get ready for the switchover
alter system set job_queue_processes=0;
alter system set aq_tm_processes=0;
(optional) If the primary database is RAC enabled, leave only
the primary instance up and shut down all the other instances.
Switch the primary database into data guard mode. The
following command will try to disconnect all the active sessions, stop
the processing on the primary database, make sure all the online redo
log files get sent over to the remote data guard site before committing
to data guard mode. Depending on the activity of the database right
before we do the switch, it will take anywhere between 20 minutes to 2
hours to finish.
alter database commit to switchover to physical standby with session
shutdown;
• replace the primary init parameter file with the data guard init
parameter file and mount database in data guard mode.
shutdown immediate
startup nomount
alter database mount standby database
alter database recover managed standby database disconnect;
o ON THE DATA GUARD SITE.
(optional) check on the v$database view and make sure you got
“SWITCHOVER PENDING” status in switchover_status column..
Lots of time we will get “SESSIONS ACTIVE” status back. It is still
OK to move forward with switching the data guard database to
primary.
Select switchover_status from v$database;
(optional) If this is RAC enabled, make sure only the primary
instance is up and running. Otherwise the following switch command
will fail.
Get the old data guard database to finish applying the
archivelogs and switch over to primary database. If this is the first time
to do the switchover, the online redo logs needs to be generated and
you might get log files not exist. We can ignore those errors. Depends
on the number of archivelogs need to be applied and the total size of
online redo log files since all the online redo log files will be reset.
Alter database commit to switchover to primary with session
shutdown;
Replace the data guard init parameter file with the primary init
parameter and recycle the database.
8. Data Guard Database Failover Procedure
This is when the primary database is completely lost during some unforeseeable
event and the only thing we have is the data guard database.
First finish applying the log files that has not been applied to
the data guard databass.
Alter database recover managed standby database finish;
If a standby redo log is not used, the above command may fail.
Then we can use following command:
alter database recover managed standby database finish skip standby
logfile;
Then we can switch the database to primary:
Alter database commit to switchover to primary;
If the switchover failed, then try to activate the database:
alter database activate standby database;
o After the switch or activation finish, an instance recycle is
needed before turning the database over to the application:
(optional) If you use temporary tempfile and the
dba_temp_files view is empty, add tempfile accordingly.
Alter tablespace estm01 add tempfile ‘/dev/vx/rdsk/Otm01’ size 1m;
9. Disable the Data Guard database from Primary database
In case of network issue, the primary database may get stopped
or hung. Issue the following command to free it up:
Alter system set log_archive_dest_state_2 = defer;
o After the network issue is resolved, we need to re-establish the
Data Guard setting: Sometimes we see this command doesn’t work in
RAC database and the instance recycle is needed.
Alter system set log_archive_dest_state_2 = enable;
(optional) If the standby database is RAC enabled, don’t bring
up the second instance. It won’t be needed for the Data Guard process and
will prevent the first instance from finishing the switching to primary
database if needs to.
10. OPEN DATA GUARD DATABASE AS READONLY
DATABASE
o If we need to open a read only standby database at a particular time:
Alter database recover managed standby database cancel;
Alter database open read only;
11. Common Issues and Recommendations
• Try to use maximum performance mode and lgwr async or even
arch option with the remote destination when the data concurrency is not a
high priority during the DR time. The maximum availability mode and
lgwr sync option can incur anywhere from 100% to 400% overhead to the
primary database, depending on the network and the nature of the
application. If we do have to set up the maximum availability mode and
lgwr sync option, make sure the application is very happy with
performance and there won’t be significant increase with the workload in
the foreseeable future.
• The following underscore parameter was introduced as of 9.2.0.5
to allow archiver write to local destination first before it try to write to the
remote destination. This will free up the primary database from waiting on
the slow transfer of the archive log over the remote network.
_LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE'
• We can use multiple archiver if the network is so slow that the
remote archiving can never keep up with the load of the primary database.
LOG_ARCHIVE_MAX_PROCESSES = n
• Following init parameter can be implemented to reduce contention
with log writer process.
_lgwr_async_io = FALSE
• Change the SDU size to the maximum 32k from default 2k can help
improve the remote archiving performance. The following setting can be put
in the sqlnet.ora file starting 9.2.0.4.
DEFAULT_SDU_SIZE = 32767
• ARCHIVE_LAG_TARGET limits the amount of data that can be
lost and effectively increases the availability of the standby database by
forcing a log switch after a user-specified time period elapses.
archive_lag_target= 3600