Data Guard today
Today had noticed that redo logs were not being applied to standby. Checking live database alert log showed errors like:
PING[ARC1]: Heartbeat failed to connect to standby '(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=drserver)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=dbase_DR)(INSTANCE_NAME=dbase)(SERVER=dedicated)))'. Error is 12514.
Wed Mar 07 10:51:10 2007
Error 12514 received logging on to the standby
Wed Mar 07 10:51:10 2007
Errors in file d:\oracle\admin\dbase\bdump\dbase_arc1_2540.trc:
ORA-12514: Message 12514 not found; No message file for product=RDBMS, facility=ORA
But they are somewhat misleading – ora-12514 implies listener issues, and the listener was actually working ok on the standby server.
Drcdbname.log had the right information:
DG 2007-03-07-07:12:42 0 2 0 RSM detected log transport problem: log transport for database 'dbase_DR' has the following error.
DG 2007-03-07-07:12:42 0 2 0 ORA-12514: Message 12514 not found; No message file for product=RDBMS, facility=ORA
DG 2007-03-07-07:12:42 0 2 0 RSM0: HEALTH CHECK ERROR: ORA-16737: the redo transport service for standby database "dbase_DR" has an error
DG 2007-03-07-07:12:42 0 2 0 NSV1: Failed to connect to remote database dbase_DR. Error is ORA-01034
DG 2007-03-07-07:12:42 0 2 0 NSV1: Message is dropped.
DG 2007-03-07-07:12:42 0 2 606440720 DMON: Database dbase_DR returned ORA-01034
DG 2007-03-07-07:12:42 0 2 606440720 for opcode = CTL_GET_STATUS, phase = BEGIN, req_id = 1.1.606440720
DG 2007-03-07-07:12:42 0 2 606440720 Operation CTL_GET_STATUS cancelled during phase 2, error = ORA-16778
DG 2007-03-07-07:12:42 0 2 606440720 Operation CTL_GET_STATUS cancelled during phase 2, error = ORA-16778
Checking Data Guard Manager on live server showed the same ORA-1034 error, with description added in:
C:\>dgmgrl
DGMGRL for 32-bit Windows: Version 10.2.0.1.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type “help” for information.
DGMGRL> connect sys/password
Connected.
DGMGRL> show configuration;
Configuration
Name: db_DOCUMENTUM
Enabled: YES
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLED
Databases:
dbase – Primary database
dbase_DR – Physical standby database
Current status for “db_DOCUMENTUM”:
Warning: ORA-16607: one or more databases have failed
DGMGRL> show database ‘dbase’;
Database
Name: dbase
Role: PRIMARY
Enabled: YES
Intended State: ONLINE
Instance(s):
dbase
Current status for “dbase”:
Error: ORA-16778: redo transport error for one or more databases
DGMGRL> show database ‘dbase_DR’;
Database
Name: dbase_DR
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: ONLINE
Instance(s):
dbase
Current status for “dbase_DR”:
Error: ORA-01034: ORACLE not available
So the standby database had stopped running (although its windows service was still up and running ok). To fix, first startup nomount the standby database:
sqlplus sys/password as sysdba
SQL>
startup nomount
At this point dgmgrl on live server reported:
DGMGRL> show database 'dbase_DR';
Database
Name: dbase_DR
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: ONLINE
Instance(s):
dbase
Current status for “dbase_DR”:
Error: ORA-16525: the Data Guard broker is not yet available
And secondly, again on the standby database:
SQL>
alter database mount standby database ;
Dgmgrl on live server now reported everything ok:
DGMGRL> show database 'dbase_DR';
Database
Name: dbase_DR
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: ONLINE
Instance(s):
dbase
Current status for “dbase_DR”:
SUCCESS
DGMGRL> show configuration;
Configuration
Name: db_DOCUMENTUM
Enabled: YES
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLED
Databases:
dbase – Primary database
dbase_DR – Physical standby database
Current status for “db_DOCUMENTUM”:
SUCCESS
DGMGRL>
Which alert logs for live and standby confirmed.
The standby alert log shows this command also:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT NODELAY
But that was issued behind the scenes by the Data Guard Broker, not issued manually by the DBA.
As an aside, 10g logical standby databases (that replay SQL rather than redo change vectors) are availalable with the same protection modes and fast start failover options as physical standby databases (the traditional type that replay redo). That is according to documentation, I haven’t had a chance to use those myself yet, but hope to be soon. The big advantage of logical standby databases is that they are available for reporting use without halting the replication. In the absence of other drawbacks (not unknown in new oracle features), that would make logical standby the preferred option.
EDIT – for older databases or otherwise where data guard is not running, commands to use would instead be like:
recover managed standby database cancel;
recover standby database;
[auto]
recover managed standby database disconnect;
Good article.
What not clear for me:
The dg broker does
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE THROUGH ALL SWITCHOVER DISCONNECT NODELAY
but you just do
recover managed standby database disconnect;