Sunday, May 1, 2016

CRS-4402, CRS-2800, CRS-4000 : root.sh failed on second Node of RAC cluster

### Scenario:
/u01/app/11.2.0/grid/root.sh failed on the second node (tnc2) of the RAC cluster ##

[root@tnc2 ~]# /u01/app/11.2.0/grid/root.sh 
Performing root user operation for Oracle 11g 
The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]: 
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
 CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node tnc1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Start of resource "ora.crsd" failed
CRS-2800: Cannot start resource 'ora.asm' as it is already in the INTERMEDIATE state on server 'tnc2'
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Grid Infrastructure stack
Failed to start Cluster Ready Services at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1353.
/u01/app/11.2.0/grid/perl/bin/perl -I/u01/app/11.2.0/grid/perl/lib -I/u01/app/11.2.0/grid/crs/install /u01/app/11.2.0/grid/crs/install/rootcrs.pl execution failed

The Clusterware alert log on the node shows the following:
[/u01/app/11.2.0/grid/bin/oraagent.bin(28566)]CRS-5019:All OCR locations are on ASM disk groups [GRID], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/tnc2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2016-04-30 16:28:08.453: 
[/u01/app/11.2.0/grid/bin/oraagent.bin(28566)]CRS-5019:All OCR locations are on ASM disk groups [GRID], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/tnc2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2016-04-30 16:28:38.628: 
[/u01/app/11.2.0/grid/bin/oraagent.bin(28566)]CRS-5019:All OCR locations are on ASM disk groups [GRID], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/tnc2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2016-04-30 16:29:08.727: 
[/u01/app/11.2.0/grid/bin/oraagent.bin(28566)]CRS-5019:All OCR locations are on ASM disk groups [GRID], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/u01/app/11.2.0/grid/log/tnc2/agent/ohasd/oraagent_oracle/oraagent_oracle.log".
2016-04-30 16:29:38.867: 


The ohasd oraagent log, "/u01/app/11.2.0/grid/log/tnc2/agent/ohasd/oraagent_oracle/oraagent_oracle.log", shows the following errors:
2016-04-30 16:53:26.893: [ora.asm][1075857728]{0:0:2} [check] checkCrsStat 2 CLSCRS_STAT ret: 184
2016-04-30 16:53:26.893: [ora.asm][1075857728]{0:0:2} [check] clsnUtils::error Exception type=2 string=
2016-04-30 16:53:26.893: [ora.asm][1075857728]{0:0:2} [check] AsmAgent::checkCbk: Exception UserErrorException
2016-04-30 16:53:26.893: [ora.asm][1075857728]{0:0:2} [check]  
2016-04-30 16:53:26.893: [ora.asm][1075857728]{0:0:2} [check] InstAgent::check 1 prev clsagfw_res_status 4 current clsagfw_res_status 4
2016-04-30 16:53:27.896: [ora.asm][1077958976]{0:0:2} [check] AsmAgent::check ocrCheck 1 m_OcrOnline 0 m_OcrTimer 161
2016-04-30 16:53:27.896: [ora.asm][1077958976]{0:0:2} [check] CrsCmd::ClscrsCmdData::stat entity 5 statflag 32 useFilter 1
2016-04-30 16:53:27.897: [ COMMCRS][1133730112]clsc_connect: (0x1f159cb0) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))
2016-04-30 16:53:27.897: [ora.asm][1077958976]{0:0:2} [check] checkCrsStat 2 CLSCRS_STAT ret: 184
2016-04-30 16:53:27.897: [ora.asm][1077958976]{0:0:2} [check] clsnUtils::error Exception type=2 string=
2016-04-30 16:53:27.897: [ora.asm][1077958976]{0:0:2} [check] AsmAgent::checkCbk: Exception UserErrorException
2016-04-30 16:53:27.897: [ora.asm][1077958976]{0:0:2} [check]  
2016-04-30 16:53:27.897: [ora.asm][1077958976]{0:0:2} [check] InstAgent::check 1 prev clsagfw_res_status 4 current clsagfw_res_status 4
2016-04-30 16:53:28.900: [ora.asm][1077958976]{0:0:2} [check] AsmAgent::check ocrCheck 1 m_OcrOnline 0 m_OcrTimer 162
2016-04-30 16:53:28.900: [ora.asm][1077958976]{0:0:2} [check] CrsCmd::ClscrsCmdData::stat entity 5 statflag 32 useFilter 1
2016-04-30 16:53:28.901: [ COMMCRS][1133730112]clsc_connect: (0x1f159cb0) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))
2016-04-30 16:53:28.901: [ora.asm][1077958976]{0:0:2} [check] checkCrsStat 2 CLSCRS_STAT ret: 184
2016-04-30 16:53:28.901: [ora.asm][1077958976]{0:0:2} [check] clsnUtils::error Exception type=2 string=2016-04-30 16:53:2

I logged in to the ASM instance on the second node and queried the disk group status. The GRID disk group showed as DISMOUNTED, which confirms that the disk group holding the OCR was not mounted on the second node of the cluster.
######
SQL> select GROUP_NUMBER,NAME,OFFLINE_DISKS, state from v$asm_diskgroup;
GROUP_NUMBER NAME                           OFFLINE_DISKS STATE
------------ ------------------------------ ------------- -------------
           0 GRID                                       0 DISMOUNTED

[root@tnc2 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   357fa62799a74fcdbf518b0f0d6d9a29 (/dev/raw/raw1) [GRID]
 2. OFFLINE  216d1cb038d14fc1bf8ccc05b0af6793 () []
 3. ONLINE   061664871d014ff7bfac218182629a7d (/dev/raw/raw3) [GRID]
Located 3 voting disk(s).
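
A quick way to spot this condition is to count the voting disks that are not ONLINE. The following is only a minimal sketch: `count_not_online` and `sample_output` are hypothetical helper names (not part of crsctl), and the parsing assumes the column layout shown in the output above.

```shell
#!/bin/sh
# Count voting disks whose STATE is not ONLINE in
# `crsctl query css votedisk` output read from stdin.
# count_not_online is a hypothetical helper, not a crsctl feature.
count_not_online() {
    # Data rows start with an index such as "1." followed by the state.
    awk '$1 ~ /^[0-9]+\.$/ && $2 != "ONLINE" { n++ } END { print n + 0 }'
}

# Sample input copied from the output shown above.
sample_output() {
cat <<'EOF'
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   357fa62799a74fcdbf518b0f0d6d9a29 (/dev/raw/raw1) [GRID]
 2. OFFLINE  216d1cb038d14fc1bf8ccc05b0af6793 () []
 3. ONLINE   061664871d014ff7bfac218182629a7d (/dev/raw/raw3) [GRID]
Located 3 voting disk(s).
EOF
}

sample_output | count_not_online
```

On a live node the same function could be fed directly, for example `./crsctl query css votedisk | count_not_online`.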

I continued and completed the Grid Infrastructure installation, but CVU failed at the end of it. I then found that the disks on the second node of the cluster did not have the required permissions.
I worked with the system administrator (SA) to set udev rules on the disks attached to the server.
I then force-stopped CRS on the node and afterwards restarted CRS on both nodes of the cluster ##
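
Because the root cause was disk permissions, it may help to compare the ownership and mode of the shared devices on each node before restarting CRS. The sketch below is an illustration only: `check_disk_perms` is a hypothetical helper, it relies on GNU `stat`, and the expected owner, group and mode passed to it (for example oracle:oinstall and 660) are assumptions that must match your own installation.

```shell
#!/bin/sh
# Report devices whose owner:group or mode differ from the expected values.
# check_disk_perms is a hypothetical helper; the expected owner:group and
# mode are assumptions supplied by the caller.
check_disk_perms() {
    expected_owner=$1   # e.g. oracle:oinstall (assumed)
    expected_mode=$2    # e.g. 660 (assumed)
    shift 2
    bad=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "MISSING $dev"
            bad=1
            continue
        fi
        # GNU stat: %U:%G = owner and group names, %a = octal mode
        actual=$(stat -c '%U:%G %a' "$dev")
        if [ "$actual" != "$expected_owner $expected_mode" ]; then
            echo "MISMATCH $dev ($actual)"
            bad=1
        fi
    done
    # Non-zero return means at least one device needs attention.
    return $bad
}
```

It could be run on each node as, for example, `check_disk_perms oracle:oinstall 660 /dev/raw/raw1 /dev/raw/raw3`, and the results compared between tnc1 and tnc2.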

[root@tnc2 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'tnc2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'tnc2'
CRS-2673: Attempting to stop 'ora.crf' on 'tnc2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'tnc2'
CRS-2673: Attempting to stop 'ora.evmd' on 'tnc2'
CRS-2673: Attempting to stop 'ora.asm' on 'tnc2'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'tnc2'
CRS-2677: Stop of 'ora.evmd' on 'tnc2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'tnc2' succeeded
CRS-2677: Stop of 'ora.crf' on 'tnc2' succeeded
CRS-2677: Stop of 'ora.asm' on 'tnc2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'tnc2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'tnc2' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'tnc2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'tnc2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'tnc2'
CRS-2677: Stop of 'ora.cssd' on 'tnc2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'tnc2'
CRS-2677: Stop of 'ora.gipcd' on 'tnc2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'tnc2'
CRS-2677: Stop of 'ora.gpnpd' on 'tnc2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'tnc2' has completed
CRS-4133: Oracle High Availability Services has been stopped.

## After the stop completed, CRS was started again ##
# crsctl start crs

## Issue is resolved ##

--Nikhil Tatineni--
--RAC -- 
