Thursday, September 3, 2009

ORA-29740: evicted by member 0, group incarnation

I have frequently seen the RAC node eviction error on RAC databases running Oracle 10g R2 (10.2.0.1). Here is a detailed explanation and the solution.

A node eviction in RAC causes the instance to hang or restart.
Check for the following error messages in the database and ASM alert.log and trace files.

Alert.log:
IPC Send timeout detected.Sender: ospid 25102
Receiver: inst 2 binc 860622349 ospid 12543
IPC Send timeout to 1.2 inc 10 for msg type 36 from opid 44
Communications reconfiguration: instance_number 2
Trace dumping is performing id=[cdmp_20090819220537]
Waiting for clusterware split-brain resolution
Errors in file /oracle/10201/admin/testdb/bdump/testdb_lmon_3433.trc:
ORA-29740: evicted by member 0, group incarnation 12
LMON: terminating instance due to error 29740

Trace file generated for ospid 25102 (1st line in alert.log):
(13190 <- 30904)SKGXPDOAINVALCON: connection 0x2a9754b730 scoono 0x15694aec acconn 0x399413e2 getting closed. inactive: threshold: 0x4bff6
(13190 <- 30904)SKGXPDOAINVALCON: WARN: potential problem in keep alive connection protocol

LMON Trace file:
GES IPC: Receivers 3 Senders 3
GES IPC: Buffers Receive 1000 Send (i:1050 b:1050) Reserve 301
kjxgmrcfg: Reconfiguration started, reason 1
kjxgmcs: Setting state to 0 0.
kjxgrrcfgchk: Initiating reconfig, reason 3
kjxgmrcfg: Reconfiguration started, reason 3
kjxgrrecp2: Waiting for split-brain resolution, upd 0, seq 12

If you find these symptoms, you are most likely hitting bug 4631662.
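The symptom check above can be sketched as a small script that scans alert.log text for the signature lines shown in the excerpt. This is only an illustrative sketch: the function names and the pattern set are my own choices, not an Oracle tool, and a match only suggests the eviction pattern described here. It does not confirm the bug by itself.

```python
import re

# Signature lines taken from the alert.log excerpt above.
# The dictionary keys are hypothetical labels chosen for this sketch.
SYMPTOM_PATTERNS = {
    "ipc_send_timeout": re.compile(r"IPC Send timeout"),
    "split_brain_wait": re.compile(r"Waiting for clusterware split-brain resolution"),
    "ora_29740": re.compile(r"ORA-29740: evicted by member \d+, group incarnation \d+"),
    "lmon_terminate": re.compile(r"LMON: terminating instance due to error 29740"),
}


def find_symptoms(log_text):
    """Return {symptom_label: [1-based line numbers]} for every pattern found."""
    hits = {}
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for label, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(label, []).append(lineno)
    return hits


def looks_like_eviction(log_text):
    """True when both an IPC send timeout and ORA-29740 appear in the log text."""
    hits = find_symptoms(log_text)
    return "ipc_send_timeout" in hits and "ora_29740" in hits
```

To use it, read the alert.log into a string (for example with open(path, errors="replace").read()) and pass it to looks_like_eviction. The line numbers returned by find_symptoms help locate the matching entries in the file.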

Cause: Due to bug 4631662, instances are evicted because of network timeouts. The bug is a failure in "ach reaping", a feature that reduces the packet size being sent to a receiver.
Solution:
1. Upgrade the database to 10.2.0.2, where this feature is disabled.
2. Set the parameter below to disable the feature.

DATABASE:
Run the following command if you are using an spfile.
SQL> alter system set "_skgxp_udp_ach_reaping_time"=0 sid='*';

If you are using a pfile, add the following line to the init.ora parameter file:
*._skgxp_udp_ach_reaping_time = 0

ASM: Add the following lines to the init.ora parameter file.
*._disable_instance_params_check = TRUE
*._skgxp_udp_ach_reaping_time = 0
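After editing a pfile, it can be handy to confirm that the workaround line is really there. The sketch below parses init.ora text in a deliberately simplified way: it assumes one "name = value" setting per line, treats "#" as the start of a comment, and does not handle quoted values or continuation lines. The helper names parse_pfile and has_workaround are hypothetical.

```python
def parse_pfile(text):
    """Parse simplified init.ora text into {(sid, name): value}.

    A setting without a SID prefix is stored under sid "*".
    """
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        sid, dot, name = key.partition(".")
        if not dot:  # no SID prefix, e.g. "db_name = testdb"
            sid, name = "*", key
        params[(sid, name.lower())] = value
    return params


def has_workaround(text):
    """True when the pfile text sets _skgxp_udp_ach_reaping_time to 0 for all instances."""
    return parse_pfile(text).get(("*", "_skgxp_udp_ach_reaping_time")) == "0"
```

Calling has_workaround on the contents of the ASM or database pfile returns True only if the "*." form of the setting is present with value 0. A per-instance entry would need its own lookup by SID.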

Click here to see the instructions to create pfile or spfile and apply the changes.

Regards,
Satishbabu Gunukula
http://www.oracleracexpert.com/
