Wednesday, July 27, 2016

Tuning Node Evictions 11gR2

Tuning: disktimeout, reboottime, misscount

[root@tnc1 ~]# crsctl get css disktimeout
CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services
The “disktimeout” parameter defines the maximum amount of time allowed for a CSS I/O to the voting file to complete; if this time is exceeded, the voting file is marked offline. The default value is 200 seconds.

[root@tnc1 ~]# crsctl get css reboottime
CRS-4678: Successful get reboottime 3 for Cluster Synchronization Services
The default value is 3 seconds. “reboottime” is the amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted,
that is, how long the machine takes to shut down completely when a reboot is initiated.

[root@tnc1 ~]# crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services
misscount:
Misscount can be considered the maximum tolerated network latency, in seconds, between nodes.
Each node communicates with the other nodes over the private interconnect.
If a node is unable to communicate for longer than the misscount time, it is evicted from the cluster,
which we consider a network heartbeat failure.
The value of css misscount should be less than disktimeout.
In 11gR2, the default misscount is 30 seconds.
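
The rule that misscount should stay below disktimeout can be checked with a small script. The following is only a sketch: it assumes the CRS-4678 message format shown in the output above, and the helper names css_value and misscount_ok are made up for illustration, not part of Oracle Clusterware.

```shell
#!/bin/sh
# Hypothetical helper: pull the numeric value out of one line of
# "crsctl get css <param>" output read from stdin.
# Assumes the "CRS-4678: Successful get <param> <value> for ..." format.
css_value() {
    # $1 = parameter name (disktimeout, reboottime or misscount)
    sed -n "s/^CRS-4678: Successful get $1 \([0-9][0-9]*\) for .*/\1/p"
}

# Hypothetical helper: succeed only when misscount ($1) is less than
# disktimeout ($2), both in seconds.
misscount_ok() {
    [ "$1" -lt "$2" ]
}

# Example use on a cluster node, as root:
#   mc=$(crsctl get css misscount   | css_value misscount)
#   dt=$(crsctl get css disktimeout | css_value disktimeout)
#   misscount_ok "$mc" "$dt" || echo "misscount ($mc) is not below disktimeout ($dt)"
```

The functions only read and compare values; they do not change any cluster setting.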

Before 11gR2, we had to shut down the cluster to tune disktimeout, misscount and reboottime. From 11gR2, we can change these values without shutting down the RAC cluster (see Doc ID 284752.1).
1) Execute crsctl as root to modify misscount, reboottime and disktimeout:
$CRS_HOME/bin/crsctl set css misscount <n>    #### where <n> is the maximum private network latency in seconds
$CRS_HOME/bin/crsctl set css reboottime <r> [-force]  #### where <r> is in seconds
$CRS_HOME/bin/crsctl set css disktimeout <d> [-force] #### where <d> is in seconds
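
Before running the set commands, it can help to sanity-check the proposed values. The sketch below is a dry run only: the function name plan_css_change is hypothetical, it prints the crsctl commands instead of executing them, and it leaves out the optional -force flag. It assumes all three values are whole seconds and enforces only the misscount-below-disktimeout rule mentioned earlier.

```shell
#!/bin/sh
# Hypothetical dry-run helper: validate proposed CSS values and print
# the crsctl commands that would apply them. Nothing is executed.
# Usage: plan_css_change <misscount> <reboottime> <disktimeout>
plan_css_change() {
    new_misscount=$1
    new_reboottime=$2
    new_disktimeout=$3

    # Every value must be a non-empty string of digits (whole seconds).
    for v in "$new_misscount" "$new_reboottime" "$new_disktimeout"; do
        case $v in
            ''|*[!0-9]*)
                echo "not a whole number of seconds: '$v'" >&2
                return 2
                ;;
        esac
    done

    # misscount should be less than disktimeout.
    if [ "$new_misscount" -ge "$new_disktimeout" ]; then
        echo "misscount ($new_misscount) must be less than disktimeout ($new_disktimeout)" >&2
        return 1
    fi

    # Print the commands with a literal $CRS_HOME, to be run as root.
    printf '%s %s\n' '$CRS_HOME/bin/crsctl set css misscount' "$new_misscount"
    printf '%s %s\n' '$CRS_HOME/bin/crsctl set css reboottime' "$new_reboottime"
    printf '%s %s\n' '$CRS_HOME/bin/crsctl set css disktimeout' "$new_disktimeout"
}
```

For example, plan_css_change 30 3 200 prints the three set commands with the default values, while plan_css_change 200 3 30 prints an error and returns a non-zero status.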

--Nikhil Tatineni--
--11gR2, 12c Cluster -- 

Queries to monitor RAC

The following queries will help to find the culprits. Query to check long-running transactions from the last 8 hours: Col Sid Fo...