Tuesday, April 27, 2010

Overview of Transparent Application Failover (TAF) in Oracle

Transparent Application Failover (TAF) is a client-side feature that allows clients to reconnect to a surviving node when an instance fails. The reconnect happens automatically from within the OCI (Oracle Call Interface) library. Any uncommitted transactions are rolled back, and server-side program variables and session properties are lost. In some cases, SELECT statements are automatically re-executed on the new connection, with the cursor positioned on the row it was on before the failover.

For high availability and scalability, Oracle provides the Transparent Application Failover feature as part of Oracle Real Application Clusters (RAC).

Failover is configured in the tnsnames.ora file. The TAF settings are placed in the CONNECT_DATA section of tnsnames.ora using the FAILOVER_MODE parameter.

FAILOVER_MODE contains the following subparameters
----------------------------------------------------------
BACKUP: Specifies a different net service name for backup instance connections. A backup should be specified when using PRECONNECT to pre-establish connections.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = PRECONNECT)(BACKUP=TESTDB2))
)
)
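
The sample above names TESTDB2 as the backup alias but does not show its definition. A minimal sketch of what that entry might look like, assuming the backup connects through the second node's VIP from the same sample. The single-address layout is an assumption, not taken from the original configuration.

```
# Hypothetical backup alias referenced by BACKUP=TESTDB2 above.
# Host, port, and service name are reused from the TESTDB1 sample.
TESTDB2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
)
)
```

With METHOD=PRECONNECT the client opens its shadow connection through this alias, so it needs to resolve from the same tnsnames.ora as TESTDB1.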

TYPE: TAF supports three types of failover

1.SESSION failover - If a user's connection is lost, a new session is automatically created for the user on the backup node. This type of failover does not attempt to recover selects. This failover is ideal for OLTP (online transaction processing) systems, where transactions are small.

2.SELECT failover - If the connection is lost, Oracle Net establishes a connection to another node and re-executes the SELECT statements, with the cursor positioned on the row it was on before the failover. This mode involves overhead on the client side because Oracle Net keeps track of the SELECT statements. This approach is best for data warehouse systems, where the transactions are big and complex.

3.NONE: This setting is the default, and no failover functionality is provided. Use this setting to prevent failover.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)
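
For comparison, a SESSION failover variant could be sketched as below. The alias name TESTDB1_SESSION is hypothetical. The hosts and service name are reused from the sample above, and only the TYPE value changes.

```
# Hypothetical alias illustrating TYPE = SESSION.
# In-flight SELECTs are not recovered with this type; only the session is re-created.
TESTDB1_SESSION =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SESSION)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)
```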

METHOD: This parameter determines how failover occurs from the primary node to the backup node.
BASIC: Use this mode to establish connections at failover time. No work is required on the backup server until failover time.
PRECONNECT: Use this mode to pre-establish connections. PRECONNECT provides faster failover but requires that the backup instance be capable of supporting all connections from every supported instance.

RETRIES: Use this parameter to specify the number of times to attempt to connect after a failover. If DELAY is specified but RETRIES is not, RETRIES defaults to five retry attempts.

DELAY: Use this parameter to specify the amount of time in seconds to wait between connect attempts. If RETRIES is specified but DELAY is not, DELAY defaults to one second.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)

Please note that you can pre-establish a connection to reduce the failover time by using the METHOD=PRECONNECT option.

To verify that TAF is correctly configured, query the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns in the V$SESSION view.
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Thursday, April 15, 2010

In function `lcdprm':: warning: the `gets' function is dangerous and should not be used. :failed

Many users have experienced patch failures on node 2 or a remote node in a RAC environment, and in some cases are not able to start the RAC database instances on node 2 or the remote nodes (if there are more than two nodes).

You may receive the errors/warnings below while you apply a patch

$ opatch apply
. . . . . . .
The following warnings you may see during “OPatch” execution :
OUI-67212
WARNING for re-link on remote node 'testrac02':
.........
/oracle/v10202/bin/oracle/oracle/v10202/lib//libcore10.a(lcd.o)(.text+0xb71): In function `lcdprm':: warning: the `gets' function is dangerous and should not be used. :failed
OPatch Session completed with warnings.
OPatch completed with warnings.

Solution:-
If you are able to start up the database, then no action is required; please ignore the message. It is an internal code message, a reference for developers to fix the code in future versions. This issue is fixed in 11g.

If you are not able to start up the database, here are two common reasons:

1.Bug 5128575 - RAC install of 10.2.0.2 does not update libknlopt.a on all nodes
See the “Unable to start RAC instance after applying patch” link to fix Bug 5128575.
2.Re-link failed on remote nodes.
Re-link the Oracle libraries again on node 2 or the remote node.
