One of the essential things in the configuration of Oracle Data Guard and Oracle Data Guard Broker is where the file tnsnames.ora is placed, which by default, from some other early releases of 19c it has been defined at $ORACLE_HOME/network/admin. However, in one of the environments that I was providing some job as a consultant, I could see the following outputs from the Oracle Data Guard broker utility :
Version 19.3.0.0.0 Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/xxxxxxxxxxxxxxx@TECHDBP Connected to "TECHDBP" Connected as SYSDBA. DGMGRL> show configuration; Configuration - BROKERTECH Protection Mode: MaxPerformance Members: TECHDBP - Primary database Error: ORA-16778: redo transport error for one or more members TECHSTDB - Physical standby database Warning: ORA-16853: apply lag has exceeded specified threshold Fast-Start Failover: Disabled Configuration Status: ERROR (status updated 31 seconds ago) DGMGRL> show database TECHDBP; Database - TECHDBP Role: PRIMARY Intended State: TRANSPORT-ON Instance(s): tech1 tech2 Error: ORA-16737: the redo transport service for member "TECHSTDB" has an error Database Status: ERROR DGMGRL> show database TECHDBP "statusreport"; STATUS REPORT INSTANCE_NAME SEVERITY ERROR_TEXT tech2 ERROR ORA-16737: the redo transport service for member "TECHSTDB" has an error
Just a brief explanation on the output above to make it easier to understand from now on:
Standby database has db_unique_name as : TECHSTDB
Primary database has db_unique_name as : TECHDBP
Configuration was named as: BROKERTECH
“tech1” and “tech2” are the name of the instances for the primary database TECHDBP .
And, with the output above was found the error was reporting on the instance “tech2” that it was having a problem to connect to the standby database showing “tech2 ERROR ORA-16737: the redo transport service for member “TECHSTDB” has an error.”
So then I started digging to find out what the reason was. Therefore, I connected to the server where the instance “tech2” was located and started to check the alertSID.log file as below:
]$ tail -f alert_bo1.log 2021-09-08T21:00:26.188702+02:00 rfs (PID:7142): Selected LNO:22 for T-1.S-1101 dbid 1530008807 branch 1081336999 2021-09-08T21:00:26.193743+02:00 ARC3 (PID:29273): Archived Log entry 1310 added for T-1.S-1100 ID 0x5b31c1e7 LAD:1 2021-09-08T21:04:45.459693+02:00 rfs (PID:11969): Possible network disconnect with primary database 2021-09-08T21:04:45.460291+02:00 rfs (PID:11959): Possible network disconnect with primary database 2021-09-08T21:04:45.460422+02:00 rfs (PID:30571): Possible network disconnect with primary database rfs (PID:7142): Selected LNO:21 for T-1.S-1102 dbid 1530008807 branch 1081336999 2021-09-08T21:43:36.737978+02:00
Thus, I could have a hint at the cause by the message “RFS (PID:11959): Possible network disconnect with primary database”. And, then more information from inside the database:
SQL> SELECT thread#, dest_id, gvad.status, error, fail_sequence FROM gv$archive_dest gvad, gv$instance gvi WHERE gvad.inst_id = gvi.inst_id AND destination is NOT NULL ORDER BY thread#, dest_id; THREAD# DEST_ID STATUS ---------- ---------- --------- ERROR FAIL_SEQUENCE ----------------------------------------------------------------- ------------- 1 1 VALID 0 1 2 VALID 0 2 1 VALID 0 THREAD# DEST_ID STATUS ---------- ---------- --------- ERROR FAIL_SEQUENCE ----------------------------------------------------------------- ------------- 2 2 ERROR ORA-12154: TNS:could not resolve the connect identifier specified 0
Consequently, I checked the tnsnames.ora and found out that the alias to the service that it was reporting error was added in the file and there was no issue on the listener.ora. So what could be? It was when I decided to setup the TNS_ADMIN variable as I found out that it was not configured as below:
trace]$ srvctl getenv database -d TECHDBP -t "TNS_ADMIN" TECHDBP: PRKF-1128 : Environment variable TNS_ADMIN is not defined.
You do not have to set up this variable in a server where you have only the database running. Still, after a while, I learned that this environment has many applications that occasionally were using another TNS_ADMIN to their purpose. Knowing that, I decided to setup the variable TNS_ADMIN to oracle user on the CRS of this Oracle RAC configuration as below:
~]$ srvctl setenv database -d TECHDBP -T "TNS_ADMIN=/u01/app/oracle/19.3.0/db/network/admin"
Following by a restart of the database:
~]$ srvctl stop database -d TECHDBP ~]$ srvctl start database -d TECHDBP
And then … Voilà! After the configuration, the Data Guard Broker output was changed successfully, and it started to operate normally:
admin]$ dgmgrl DGMGRL for Linux: Release 19.0.0.0.0 - Production on Wed Sep 8 21:52:48 2021 Version 19.3.0.0.0 Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved. Welcome to DGMGRL, type "help" for information. DGMGRL> connect sys/xxxx@TECHDBP Connected to "TECHDBP" Connected as SYSDBA. DGMGRL> show configuration; Configuration - BROKERTECH Protection Mode: MaxPerformance Members: TECHDBP - Primary database TECHSTDB - Physical standby database Fast-Start Failover: Disabled Configuration Status: SUCCESS (status updated 33 seconds ago) DGMGRL>
I hope that this post helps you!
Hi! I am Bruno, a Brazilian born and bred, and I am also a naturalized Swedish citizen. I am a former Oracle ACE and, to keep up with academic research, I am a Computer Scientist with an MSc in Data Science and another MSc in Software Engineering. I have over ten years of experience working with companies such as IBM, Epico Tech, and Playtech across three different countries (Brazil, Hungary, and Sweden), and I have joined projects remotely in many others. I am super excited to share my interests in Databases, Cybersecurity, Cloud, Data Science, Data Engineering, Big Data, AI, Programming, Software Engineering, and data in general.
(Continue reading)