Monday, August 30, 2010

How to find Master Node in Oracle RAC

I have seen many users asking how to find “Master node” in Oracle RAC, let me clear some of their doubts.

There are two types of Masters in Oracle RAC, one is Mater node at Oracle Clusterware level and other is Master node for specific resource or block or object.

The node which gets the active state during startup is authorized to be a master node by Cluster Synchronization Service.

Run the below command to find which node is master at Clusterware level
$cat $ORA_CRS_HOME/log/`hostname`/cssd/ocssd* |grep master
or
$ for x in `ls -tr $ORA_CRS_HOME/log/`hostname`/cssd/ocssd* `; do grep -i "master node" $x ; done | tail -1

The OCR Automatic backups are taken only by master node. If the Master fails, the OCR backups will be created on the new Master. The Master node which has OCR backups goes down due to failure then we cannot be recover the OCR that’s why Oracle recommends taking backups using “ocrconfig” and also integrating OCR backups with backup strategy.

Run the below command to find which node is OCR Master and taking automatic backups.
$ocrconfig –showbackup
testrac02 2010/08/30 16:29:52 /oracle/crs/cdata/crs
testrac02 2010/08/30 16:29:52 /oracle/crs/cdata/crs
testrac02 2010/08/30 12:29:49 /oracle/crs/cdata/crs
testrac02 2010/08/30 08:29:46 /oracle/crs/cdata/crs
testrac02 2010/08/29 00:29:23 /oracle/crs/cdata/crs

The block level masters are used by Cache fusion while transferring the block. Any node can become the master node of a particular block and you can also see which node acting as master in V$GES_RESOURCE table (MASTER_NODE column)

You can manually remaster an object with oradebug command:
SQL> oradebug lkdebug -m pkey "object_id"

Regards
Satishbabu Gunukula
http://www.oracleracexpert.com

Thursday, July 29, 2010

Oracle Heterogeneous Services

Oracle offers two solutions to access any number of non-Oracle systems from an Oracle environment in a heterogeneously distributed environment. They are

1. Oracle Generic Connectivity
2. Oracle Transparent Gateways.

Heterogeneous Connectivity Process - The Oracle Database server accesses the non-Oracle Database using Oracle Heterogeneous Services in conjunction with an agent. Heterogeneous Services (HS) is an integrated component within the Oracle Database server and the enabling technology for the current suite of Oracle Transparent Gateway products.

Oracle Generic Connectivity is a generic solution for any ODBC or OLEDB compliant non-Oracle system. It enables connectivity using industry standards such as ODBC and OLEDB driver. Using Generic connectivity it possible to access low-end data stores such as Access, dBase, FoxPro and non-relational targets like Excel. Generic connectivity is provided free of charge with the database server, but it is and limited in functionality and slower compared to Oracle's Gateways.

If Oracle database running on windows, setup the HS ODBC/HS OLEDB agent on the same server to access non-oracle systems. If Oracle database running on UNIX, you must install Oracle server on the windows where non-oracle database is running and setup the HS ODBC/HS OLEDB agent on the same server. If you don’t want to install Oracle server on non-oracle system then you can Install Oracle server on middle tier windows system or system that supports the ODBC or OLE DB protocols and setup the HS agent.

See the below links to setup Oracle Generic connectivity.
Oracle Generic connectivity setup in Oracle 9i
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96544/gencon.htm#1656
Oracle Generic connectivity setup in Oracle 10g
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14232/gencon.htm#g1012648

Oracle Transparent Gateways is another solution to access non-oracle database and the HS agent is a system-specific application. For example, if you include a MSSQL Sybase database in an Oracle Database distributed system then you need to obtain a Sybase-specific transparent gateway so that the Oracle Database in the system can communicate with it.

Oracle transparent Gateways are licensed separately and agents can be installed on any machine. They can be on the same machine as the Oracle database system or on the same machine as the non-Oracle system or on a third machine as a stand alone.

Following are dedicated Gateways for non-Oracle data sources:-
- Microsoft SQL*Server - Database Gateway for SQL*Server (DG4Msql)
- Sybase - Database Gateway for Sybase (DG4Sybase)
- Informix - Database Gateway for Informix (DG4Ifmx)
- IBM DB2 - Database Gateway for DRDA (DG4DRDA)
- Teradata - Database Gateway for Teradata (DG4Teradata)
- Websphere MQ - Database Gateway for Websphere MQ (DG4MQ)
- Remote online transaction processors (OLTPs) - Database Gateway for APPC (DG4APPC)
- IMS - Database Gateway for IMS (DG4IMS)
- VSAM - Database Gateway for VSAM (Dg4VSAM)
- Adabas - Database Gateway for Adabas (DG4Adabas)

See the below links to setup Transparent Gateways.
Oracle Transparent Gateways setup in Oracle 9i
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96544/admin.htm#170863
Oracle Transparent Gateways setup in Oracle 10g
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14232/admin.htm

Non-Oracle data sources for which a dedicated transparent gateway is not available can be accessed by using Database Gateway for ODBC (DG4ODBC), which uses third party ODBC drivers to make the connection to the non-Oracle data source

Database Gateway for ODBC (DG4ODBC) is the 11g version of the generic connectivity and certified for older versions (9i, 10g). However, this does require that the Oracle 10g database accessing non-Oracle database through it must be patched to at least 10.2.0.4.

See the below link for Heterogeneous connectivity setup for Oracle Database Gateways
http://www.oracle.com/pls/db111/portal.portal_db?selected=15

There are differences in licensing and functionality of the Database Gateway for ODBC and the other gateways. For more information see the Metalink notes.
Note.232482.1 Gateway and Generic Connectivity Licensing Considerations
Note.252364.1 Functional Differences between Generic Connectivity and Database Gateways

The earlier gateway versions are de-supported. For more information see the Metalink notes.
Note.549796.1 Desupport of Oracle Transparent Gateways

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Wednesday, June 30, 2010

Modifying the VIP Address or VIP Hostname in Oracle RAC Node

Oracle 10g or 11g uses Virtual IP address (VIP) in clustered environment for clients to connect to the database. During the installation of Oracle Clusterware users are prompted to enter VIP and VIP hostname for each node in the cluster. The VIP is a static IP with a hostname defined and also resolved through DNS.

The VIP information is stored in OCR (Oracle Cluster Registry) and also in different HA framework. Changing the VIP Address or VIP Hostname involves modification of the nodeapps, which includes the VIP,GSD, Listener, and ONS(Oracle Notification Services).The VIP can be changed while the nodeapps are running, but the changes will take effect only when nodeapps are restated.

Note that stopping nodeapps may cause other resources to be stopped for ex: - ASM, instance or database, so the change should be made during scheduled outage.

Follow the steps to change the VIP address or VIP hostname.

Step 1:- Check the original configuration before change
$ srvctl config nodeapps -n -a
Using '-a' will give you the current VIP hostname, VIP address and interface

Example:
# srvctl config nodeapps -n testrac01 -a
VIP exists.: /testrac01-vip/10.11.12.01/255.255.255.0/eth0

The VIP Hostname is 'testrac01-vip’
The VIP IP address is '10.11.12.01'
The VIP subnet mask is '255.255.255.0'
The Interface Name used by the VIP is called 'eth0'

Step 2:- Stop Instance,ASM,Nodeapps resources
$srvctl stop instance -d testdb -i tetdb1
$srvctl stop asm -n testrac01
$srvctl stop nodeapps -n testrac01

Step 3:- Verify the VIP Address is no longer running by using below command
$ifconfig -a

You can also check the resources status using crs_stat command.

Step 4:- Update /etc/hosts file with new VIP Address or VIP hostname on node1 and also update DNS to associate the new IP address with VIP hostname as per /etc/hosts file.

Step 5:- Modify VIP Address or VIP hostname on nodeapps by using srvctl command (Run as root)
#srvctl modify nodeapps -n [-o ] [-A ]
Where
-n < node_name> - Node name.
-o - Oracle Home for the cluster software (CRS-Home).
-A The node level VIP address (/netmask[/if1[|if2|...]]).

Example:- Modify the VIP Address to 10.11.12.11
#srvctl modify nodeapps -n testrac01 -A 10.11.12.11/255.255.255.0/eth0

Use below command to change to change VIP address using VIP hostname.The srvctl command will resolve the IP to hostname or the hostname to IP address. You can use the same command to change the VIP hostname from ”testrac01-vip” to “testrac01-v”

#srvctl modify nodeapps -n testrac01 -A testrac01-v/255.255.255.0/eth0

Step 6:- Verify the change by running below command
$srvctl config nodeapps -n testrac01 –a

Step7:- Start all resources
$srvct start nodeapps -n testrac01
$srvctl start asm -n testrac01
$srvctl start instance -d testdb –i testdb1

Step8:- Repeat the same steps on all remaining nodes in the cluster.

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Friday, June 4, 2010

Remove Grid control Agents or Targets from Repository

Many users facing issues while removing the Targets or Agents from Grid Control GUI. After removing the target, the Grid control GUI still shows the removed agent. The reason is the Repository still has reference of removed Targets or Agents.

Follow the manual steps to remove the Agent or Target completely from repository:
1. Make sure that agent/target stopped
For ex:- $ emctl stop agent
2. Connect to the Grid Control Repository as SYSMAN
3. Run below command to list all registered targets from OEM
SQL> select target_name from mgmt_targets where target_type=’oracle_emd’;
4. Remove the Agent/Target from the list using below command.
SQL> exec mgmt_admin.cleanup_agent(‘Target1:3673');

You may receive below error while running “mgmt_admin.cleanup_agent”

1. ERROR at line 1:
ORA-20206: Target does not exist: Target1:3673'
ORA-06512: at "SYSMAN.MGMT_ADMIN", line 846
ORA-06512: at line 1

Check for errors in emoms.trace file, which are referring to removed agent.
[AJPRequestHandler-ApplicationServerThread-19] ERROR eml.OMSHandshake java.? - OMSHandshake failed.(AGENT URL = https://agenthostname.domainname:port/emd/main/)(ERROR = KEY_MISMATCH)

Run the following EMDIag kit fix to resolve the issue
$ repvfy verify loaders -test 700 –fix

2. “ORA-20206: Target does not exist” Error message on mgmt_admin.cleanup_agent and there is no target in sysman.mgmt_targets, but there is a record in sysman.mgmt_targets_delete without delete_complite_time set.

If anyone comes across such situation then run below command for each target without delete_complete_time set.
SQL > Exec mgmt_admin.delete_target_internal(Target_name,Target_type)

3. Ora 20221: Insufficient privileges: /WITH_ADMIN
ORA-6512: at "SYSMAN.MGMT_ADMIN",
line 400 ORA-6512: at line 1

This issue can occur in 10.1.0.3 due to a BUG 3574400. Download and apply patch 3574400 to resolve the issue.

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Thursday, May 27, 2010

Controlfile and Server parameter file (spfile) autobackup

RMAN can be configured to automatically backup the control file and server parameter file (spfile) whenever the database structure in the control file changes and whenever a backup record is added. The autobackup feature enables RMAN to recover the database even if the current control file, catalog, and server parameter file are lost.

The RMAN can search the autobackup path and restore the server parameter file from backup. Once the instance is started with restored spfile, use RMAN to restore the controlfile from autobackup. After you mount the DB using restored control file, use RMAN repository information from control file to restore and recover the data files.

You can turn ON or OFF the autobackup feature by using the following commands:
RMAN>CONFIGURE CONTROLFILE AUTOBACKUP ON;
RMAN>CONFIGURE CONTROLFILE AUTOBACKUP OFF;

If control file autobackups is ON and the backup includes datafile 1, RMAN writes the control file and SPFILE to a separate backup set. If autobackup is OFF and the backup includes datafile 1, then RMAN includes the current control file and SPFILE in the same backup set as the datafiles. After backup completion the database writes a message containing the complete path of the backup piece and the device type to the alert log.

The control file autobackup filename has a default format of %F for all device types, so that the RMAN can guess the file location and restore the controlfile without a repository. All autobackup formats must include the %F variable.

The format %F translates into c-IIIIIIIIII-YYYYMMDD-QQ, where
IIIIIIIIII - stands for DBID.
YYYYMMDD - Time stamp of the day the backup is generated
QQ is the hex sequence starts with 00 and has a maximum of FF

Use the following command to configure the Control file Autobackup format
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/oradata/cf _spfile_%F';

Use following command to write to an Automatic Storage Management disk group
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FOR DEVICE TYPE DISK TO '+DGROUP1/%F';

Use the following commands to clear control file autobackup formats for a device:
RMAN>CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK CLEAR;
RMAN>CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE sbt CLEAR;

You can use SET CONTROLFILE AUTOBACKUP FORMAT command to override the configured autobackup format at session level
RMAN> SET CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/backup/cf_spfile_%F';
RMAN> BACKUP DATABASE;

If you have configured control file autobackup, you do not need a recovery catalog or target database control file to restore the control file in case if you lost all your control files.

Use the below command to restore the control file autobackup.
RMAN> RESTORE CONTROLFILE FROM AUTOBACKUP;

Regards
Satishbabu Gunukula
http://www.oracleracexpert.com

Tuesday, April 27, 2010

Overview of Transparent application Failover (TAF) in Oracle

Transparent Application Failover (TAF) is a client-side feature that allows for clients to reconnect to surviving nodes in the event of a failure of an instance. The reconnect happens automatically from within the OCI (Oracle Call Interface) library. Any uncommitted transactions are rolled back and server side program variables and session properties will be lost. In some case the select statements automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.

For high availability and scalability, Oracle provides the Transparent Application Failover feature part of Oracle Real Application Clusters (RAC).

The failover is configured in tnsnames.ora file, the TAF settings are placed in CONNECT_DATA section of the tnsnames.ora using FAILOVER_MODES parameters.

FAILOVER_MODE contains the subparameters
----------------------------------------------------------
BACKUP: Specify a different net service name for backup instance connections. A backup should be specified when using PRECONNECT to pre-establish connections.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = PRECONNECT)(BACKUP=TESTDB2))
)
)

TYPE: TAF supports three types of failover types

1.SESSION failover - If a user's connection is lost, SESSION failover establishes a new session automatically created for the user on the backup node. This type of failover does not attempt to recover selects. This failover is ideal for OLTP (online transaction processing) systems, where transactions are small.

2.SELECT failover – If the connection is lost, Oracle Net establishes a connection to another node and re-executes the SELECT statements with cursor positioned on the row on which it was positioned prior to the failover. This mode involves overhead on the client side and Oracle NET keeps track of SELECT statements. This approach is best for data warehouse systems, where the transactions are big and complex

3.NONE: This setting is the default and failover functionality is provided. Use this setting to prevent failover.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)

METHOD: This parameters determines how failover occurs from the primary node to the backup node
BASIC: Use this mode to establish connections at failover time, no work on the backup server until failover time.
PRECONNECT: Use this mode to pre-established connections. This PRECONNECT mode provides faster failover but requires that the backup instance be capable of supporting all connections from every supported instance.

RETRIES: Use this parameter to specify number of times to attempt to connect after a failover. If DELAY is specified but RETRIES is not specified, RETRIES default to five retry attempts.

DELAY: Use this parameter to Specify the amount of time in seconds to wait between connect attempts. If RETRIES is specified but DELAY is not specified, DELAY default to one second.

A sample configuration would look like

TESTDB1 =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)

Please note that you can pre-establish a connection to reduce the failover time using METHOD=PRECONNECT option.

To verify that TAF is correctly configured, you query FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns in the V$SESSION view.
SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*) FROM V$SESSION

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Thursday, April 15, 2010

In function `lcdprm':: warning: the `gets' function is dangerous and should not be used. :failed

Many users experienced Patch failure on Node2 or Remote node in RAC environment and in some cases not able to start the RAC database instances on NODE2 or remote NODE’S (if more than 2 nodes).

You may receive below errors/warning why you apply a patch

$ opatch apply
. . . . . . .
The following warnings you may see during “OPatch” execution :
OUI-67212
WARNING for re-link on remote node 'testrac02':
.........
/oracle/v10202/bin/oracle/oracle/v10202/lib//libcore10.a(lcd.o)(.text+0xb71): In function `lcdprm':: warning: the `gets' function is dangerous and should not be used. :failed
OPatch Session completed with warnings.
OPatch completed with warnings.

Solution:-
If you are able to startup the database then No action is required, please ignore the message. It is a internal code bug message reference for developer to fix the code in future versions. This issue is fixed in 11g.

If you are not able to startup the database, here are two common reasons

1.Bug 5128575 - RAC install of 10.2.0.2 does not update libknlopt.a on all nodes
Check “Unable to start RAC instance after applying patch” link to fix “Bug 5128575”
2.Re-link failed on remote nodes.
Once again Re-link the Oracle libraries on node2 or remote node.

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Wednesday, March 10, 2010

Oracle Flashback Drop and Recycle bin

The Oracle 10g provides the ability to reinstating an accidentally dropped table, which is called Flashback Drop.

When a table is dropped, the database does not immediately remove the space associated with that table. Instead, the table is renamed to a system-defined name and placed in the Recycle bin. The flashback drop operations recovers from recycle bin.

SQL> DROP TABLE employee_tbl;

You can see the system-defined name of the table.
SQL> select * from tab;
TNAME TABTYPE CLUSTERID
------------------------------ ------- ----------
BIN$gXxxELu7aQ/gQAoKd5l2Hg==$0 TABLE

You can see the dropped table in the recycle bin using
SQL> show Recyclebin
ORIGINAL NAME RECYCLEBIN NAME OBJECT TYPE DROP TIME
-------------- ------------------------------ ------------ ----------------
EMPLOYEE_TBL BIN$gXxxELu7aQ/gQAoKd5l2Hg==$0 TABLE 2010-03-01:09:10:00

Recover the dropped table from Recycle bin using
SQL> FLASHBACK TABLE employee_tbl TO BEFORE DROP;

Recover the dropped table from Recycle bin with new name using
SQL> FLASHBACK TABLE employee_tbl TO BEFORE DROP RENAME TO employee_tmp;

You can also drop a table completely with out needing a flashback using
SQL> DROP TABLE employee_tbl PURGE;

The tablespace will not free up space until the table will remain until recycle bin is purged. You can purge table explicitly from recycle bin using
SQL> PURGE TABLE employee_tbl;

Purge all tables in recycle bin using
SQL> PURGE RECYCLEBIN;

As a DBA can purge all the objects in any tablespace using
SQL> PURGE DBA_RECYCLEBIN;

Purge all the objects in recycle bin in a tablespace using
PURGE TABLESPACE employee_tbs;

Purge all the objects in recycle bin in a tablespace specific to a user only using
PURGE TABLESPACE employee_tbs USER emp_user;

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Flashback Table Feature in Oracle 10g

In Oracle 9i Database, we have concept of Flashback Query option to retrieve data from a point in time in the past. The Oracle 10g provides the ability to recover a table or set of tables to a specified point in time in the past, this concept is called “Flashback table”.

Oracle Flashback Table operation is very quick and you do not have to restore any data from backups, and the rest of your database remains available while the Flashback Table operation is being performed.

The Flashback table depends on Undo information retained in the undo tablespace. If you set UNDO_RETENTION=1 hr, Oracle will not overwritten the data in undo tablespace until 1 hr. User can recover from their mistakes until specified time only.

Flashback table feature has some prerequisites:
•Row movement must be enabled on the table.
SQL> ALTER TABLE table_name ENABLE ROW MOVEMENT;
•You must have SELECT, INSERT, DELETE, and ALTER privileges on the table.
•You must have FLASHBACK ANY TABLE privilege or the FLASHBACK object privilege on the table.

Use below commands to restore the table to its state when the database was at the time specified by SCN or point in time.
SQL> FLASHBACK TABLE employee_tbl TO SCN 786;
or
SQL> FLASHBACK TABLE employee_tbl TO TIMESTAMP TO_TIMESTAMP ('2010-03-01 09:00:00', 'YYYY-MM-DD HH:MI:SS')

Regards
Satishbabu Gunukula
http://www.oracleracexpert.com

Tuesday, March 9, 2010

Unable to start RAC instance after applying a patch

Many users experienced “ORA-07445” error after applying any patch on 10.2.0.2 version and unable to start the RAC database instances on NODE2 or remote NODE’S (if more than 2 nodes). This also leads to dumps or internal errors on the remote nodes.

ORA-07445: exception encountered: core dump [kkxsyn()+584] [SIGSEGV]


The issues is a known bug 5128575 and the effected version are >=10.2.0.2

Check for possible Error messages in ASM/Database alert.log and trace files on NODE2 or Remote nodes. If you find the below SYMPTOMS then you are hitting a BUG 5128575.

ASM alert.log errors:
Errors in file /oracle/v10202/admin/+ASM/udump/+asm2_ora_30841.trc:
ORA-07445: exception encountered: core dump [kkxsyn()+584] [SIGSEGV] [Address not mapped to object] [0x000000168] [] []

Trace file errors:
Exception signal: 11 (SIGSEGV), code: 1 (Address not mapped to object), addr: 0x168, PC: [0x3f06748, kkxsyn()+584]
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kkxsyn()+584] [SIGSEGV] [Address not mapped to object] [0x000000168] [] []
No current SQL statement being executed.

Database errors:
Startup database fails with below errors
SQL> startup database
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/testdb/spfiletestdb.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/testdb/spfiletestdb.ora
ORA-03113: end-of-file on communication channel

Cause:
Installing the 10.2.0.2 Patch Set in RAC on any Unix platform does not correctly update the libknlopt.a file on all remote nodes. The local node where the installer is run does update libknlopt.a file and remote nodes do not get the updated file.

Workaround:
Step 1: Shutdown ASM and all Database instances
Step 2: Manually copy $ORACLE_HOME/rdbms/lib/libknlopt.a from the local node to NODE2 or all remote nodes (if more than 2 nodes)
$ scp libknlopt.a oracle@testrac2:$ORACLE_HOME/rdbms/lib/libknlopt.a
Step 3: Re-link Oracle on all remote nodes.
3. cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ioracle

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Friday, March 5, 2010

Flash Recovery Area (FRA) in Oracle 10g

Flash recovery area is a disk location in which the database can store and manage files related to Backup and Recovery. To setup a flash recovery area, you must choose a directory location or Automatic Storage Management disk group to hold the files. FRA cannot be stored on a raw file system.

Flash recovery area simplifies the administration of your database by automatically naming recovery-related files, retaining the files as long as they are needed for restore and recovery activities, and deleting the files when they are no longer needed to restore your database and space is needed for some other backup and recovery-related purpose.

To Setup Flash Recovery Area (FRA), you just need to specify below two parameters.
1. DB_RECOVERY_FILE_DEST_SIZE (Specifies max space to use for FRA)
2. DB_RECOVERY_FILE_DEST (Location of FRA)

The DB_RECOVERY_FILE_DEST_SIZE must be set before DB_RECOVERY_FILE_DEST.

You can initialize these parameters by adding them in “init.ora” file or setting them using ALTER SYSTEM SET command or DBCA (Database Configuration Assistant).
For ex:-
SQL> ALTER SYSTEM SET db_recovery_file_dest_size=10G scope=both
SQL> ALTER SYSTEM SET db_recovery_file_dest='/oradata/FRA;
For example, If FRA is a Automatic Storage Management (ASM) disk group
SQL> ALTER SYSTEM SET db_recovery_file_dest='+FLASH’sid='*’ ( sid=’*’, if using RAC)

In a RAC database, all database instances MUST have the same values for these parameters.

Disable the Flash Recovery Area:
To disable FRA set the DB_RECOVERY_FILE_DEST initialization parameter to a null string.
SQL> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST = '' scope=both;

Monitor the Flash recovery area using Dynamic performance views
To find out the current flash recovery area location, disk quota and current usage in flash recovery area query V$RECOVERY_FILE_DEST.

SQL> SELECT * FROM V$RECOVERY_FILE_DEST;
NAME SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
------ ----------- ----------- ----------------- -------------
FLASH 10737418240 5368709120 256000 28

The $FLASH_RECOVERY_AREA_USAGE views shows the percentage of the total disk quota used by different types files, and how much space for each type of file can be reclaimed by deleting files that are obsolete, redundant, or already backed up to tape.

SQL> SELECT * FROM V$FLASH_RECOVERY_AREA_USAGE;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE 2 0 4
ONLINELOG 6 0 8
ARCHIVELOG 8.05 1.11 20
BACKUPPIECE 33.94 5.46 32
IMAGECOPY 5.64 1.23 10
FLASHBACKLOG .06 0 1

Files Stored in the Flash Recovery Area
PERMANENT Files - Multiplexed copies of Control files and online redo log files
TRANSIENT Files - Archived redo logs not yet backed up, Datafile copies, Incremental backups, Contolfile auto backups (which include spfile)

Only TRANSIENT files will be deleted from flash recovery area based up on following rules:
•Files those are obsolete under the configured RMAN retention policy.
•Transient files that are copied to tape.
•In a Data Guard environment, archived redo log files can be deleted based up on archived redo log deletion policy.

Errors:
ORA-19809: limit exceeded for recovery files
Cause: The limit for recovery files specified by the DB_RECOVERY_FILE_DEST_SIZE was exceeded.
Action: The error is accompanied by 19804. See message 19804 for further details

ORA-19804: cannot reclaim nnnnn bytes disk space from mmmmm limit
Cause: Oracle cannot reclaim disk space of specified bytes from the DB_RECOVERY_FILE_DEST_SIZE limit.
Action: There are five possible solutions:
1) Take frequent backup of recovery area using RMAN.
2) Consider changing RMAN retention policy.
3) Consider changing RMAN archivelog deletion policy.
4) Add disk space and increase DB_RECOVERY_FILE_DEST_SIZE.
5) Delete files from recovery area using RMAN.

ORA-38786: Flash recovery area is not enabled.
Cause: An attempt was made to perform a command that requires Flash recovery area to be enabled.
Action: Set DB_RECOVERY_FILE_DEST to an appropriate location and retry

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Tuesday, February 16, 2010

ORA-600 [kfcNullConvert20] error on ASM instance

I have experienced ORA-600 error on ASM instance (10.2.0.2), which caused ASM instance restart on that particular node. Due to which all databases instances on that node are restarted, because they lost connectivity with ASM.

This issue is a known Bug 4682861 in Oracle and the effected version are 10.1.0.4 ,10.1.0.5 ,10.2.0.1 ,10.2.0.2. This BUG is fixed in 10.2.0.3 Server patch set and 11.1.0.6 Base release.

Check for Possible Error messages in ASM/Database alert.log and trace files.
If you find below SYMPTOMS then you are hitting a BUG 4682861.

ASM alert. log errors:
ORA-00600: internal error code, arguments: [kfcNullConvert20], [], [], [], [], [], [], []
DBW0: terminating instance due to error 471

Trace file (ASM DBWR trace file contains Stack Trace: kfcNullConvert and errors):-
error 600 detected in background process
ORA-00600: internal error code, arguments: [kfcNullConvert20], [], [], [], [], [], [], []
ksuitm: waiting for [5] seconds before killing DIAG

Database alert log errors:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
ASMB: terminating instance due to error 15064

Cause:
Due to Bug 4682861, ASM instance restarted.

Solution:
Option 1:- Upgrade to 10.2.0.3 or higher release
Option 2:-Download the patch from MetaLink:
i) Click on Patches & Updates Link.
ii) Enter patch number: 4682861 and Select your O/S
iii) Click Go.
iv) Download the patch for the Oracle release that you experienced this issue.

Please note that ANY ORA-600 error indicates Oracle has detected an internal inconsistency or a problem which it doesn’t know how best to address. These are *NOT* necessarily bugs and can occur for reasons such as running out of some resource, Operating System IO problems.. etc.

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Tuesday, January 26, 2010

Oracle RAC load balancing and failover

LOAD BALANCING: The Oracle RAC system can distribute the load over many nodes this feature called as load balancing.

There are two methods of load balancing
1.Client load balancing
2.Server load balancing

Client Load Balancing distributes new connections among Oracle RAC nodes so that no one server is overloaded with connection requests and it is configured at net service name level by providing multiple descriptions in a description list or multiple addresses in an address list. For example, if connection fails over to another node in case of failure, the client load balancing ensures that the redirected connections are distributed among the other nodes in the RAC.

Configure Client-side connect-time load balancing by setting LOAD_BALANCE=ON in the corresponding client side TNS entry.

TESTRAC =
(DESCRIPTION =
(ADDRESS_LIST=
(LOAD_BALANCE = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com))
)

Server Load Balancing distributes processing workload among Oracle RAC nodes. It divides the connection load evenly between all available listeners and distributes new user session connection requests to the least loaded listener(s) based on the total number of sessions which are already connected. Each listener communicates with the other listener(s) via each database instance’s PMON process.

Configure Server-side connect-time load balancing feature by setting REMOTE_LISTENERS initialization parameter of each instance to a TNS name that describes list of all available listeners.

TESTRAC_LISTENERS =
(DESCRIPTION =
(ADDRESS_LIST =(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1)(PORT = 1521)))
(ADDRESS_LIST =(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2)(PORT = 1521))))
)

Set *.remote_listener= TESTRAC_LISTENERS’ initialization parameter in the database’s shared SPFILE and add TESTRAC_LISTENERS’ entry to the TNSNAMES.ORA file in the Oracle Home of each node in the cluster.

Once you configure Server-side connect-time load balancing, each database’s PMON process will automatically register the database with the database’s local listener as well as cross-register the database with the listeners on all other nodes in the cluster. Now the nodes themselves decide which node is least busy, and then will connect the client to that node.

FAILOVER:

The Oracle RAC system can protect against failures caused by O/S or server crashes or hardware failures. When a node failure occurs in RAC system, the connection attempts can fail over to other surviving nodes in the cluster this feature called as Failover.

There are two methods of failover
1. Connection Failover
2. Transparent Application Failover (TAF)

Connection Failover - If a connection failure occurs at connect time, the application failover the connection to another active node in the cluster. This feature enables client to connect to another listener if the initial connection to the first listener fails.

Enable client-side connect-time Failover by setting FAILOVER=ON in the corresponding client side TNS entry.

TESTRAC =
(DESCRIPTION =
(ADDRESS_LIST=
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC2-VIP)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = testdb.oracleracexpert.com))
)

If LOAD_BALANCE is set to on then clients randomly attempt connections to any nodes. If client made connection attempt to a down node, the client needs to wait until it receives the information that the node is not accessible before trying alternate address in ADDRESS_LIST.

Transparent application Failover (TAF) – If connection failure occurs after a connection is established, the connection fails over to other surviving nodes. Any uncommitted transactions are rolled back and server side program variables and session properties will be lost. In some case the select statements automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.

TESTRAC =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = TESTRAC1-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = testdb.oracleracexpert.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
)
)

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com

Delete duplicate rows from Oracle Table:

There are several techniques to delete duplicate rows from a table, but the most effective way is to join the table against itself. Always make sure to select the duplicate data before you delete using below queries.


1. Delete duplicate rows based on one column value using ROWID

SQL> delete from dup_table t1 where rowid > ( select min(rowid) from dup_table t2 where t1.ename = t2.ename);

-- or --
SQL> delete from dup_table t1 where rowid < ( select max(rowid) from dup_table t2 where t1.ename = t2.ename);


2. Use the below query to delete the rows suing Oracle analytic functions

SQL> Delete from dup_table

where rowid in ( select rowid from

( select rowid , row_number() over (partition by col1 order by upper col1 ) row_num from dup_table ) where rno > 1 );


3. You must specify all columns that make the row duplicate in the query, use the below query to delete duplicate records based on two columns or composite unique key

SQL> delete from dup_table t1

where rowid > (select min(rowid) from dup_table t2

where upper(t2.col1) = upper(t1.col1)

and upper(t2.col2) = upper(t1.col2)

);

-- or --

SQL> delete from dup_table t1

where rowid < (select max(rowid) FROM dup_table t2

where t1.col1=t2.col1 AND t1.col2=t2.col2 );

-- or --

SQL> delete from dup_table t1

where rowid <> ( select max(rowid) from dup_table t2

where t2.col1 = t1.col1

and t2.col2 = t1.col2 )


4. If the fields match on the NULL value then duplicate fails to remove the duplicate rows. In this situation add a null check

SQL> delete from dup_table t1
where t1.rowid > ANY (select t2.rowid FROM dup_table t2
where (t1.col1 = t2.col1 OR (t1.col1 is null AND t2.col1 is null))
and
(t1.col2 = t2.col2 OR (t1.col2 is null AND t2.col2 is null))
);

If the table contains duplicate data in upper case and lower case, use below query to delete to delete the data

SQL> delete from dup_table

where rowid in ( select rid from ( select rowid rid, row_number() over (partition by upper(col1) order by upper(col2)) rno from dup_table )

where rno > 1

);


Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com