Friday, September 2, 2016

rootupgrade.sh failed on first node (11.2.0.4 to 12.1.0.2)


Scenario # While upgrading Grid Infrastructure from 11.2.0.4 to 12.1.0.2, rootupgrade.sh failed on the first node
 
[root@tnc1 ~]# /u01/app/12.1.0.2/grid/rootupgrade.sh
The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/12.1.0.2/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.1.0.2/grid/crs/install/crsconfig_params
2016/09/02 16:52:10 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2016/09/02 16:52:12 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.

2016/09/02 16:52:17 CLSRSC-464: Starting retrieval of the cluster configuration data
2016/09/02 16:52:33 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.
2016/09/02 16:52:33 CLSRSC-363: User ignored prerequisites during installation
2016/09/02 16:52:50 CLSRSC-515: Starting OCR manual backup.
2016/09/02 16:52:54 CLSRSC-516: OCR manual backup successful.
2016/09/02 16:53:03 CLSRSC-468: Setting Oracle Clusterware and ASM to rolling migration mode

2016/09/02 16:53:03 CLSRSC-482: Running command: '/u01/app/12.1.0.2/grid/bin/asmca -silent -upgradeNodeASM -nonRolling false -oldCRSHome /u01/app/11.2.0.4/grid -oldCRSVersion 11.2.0.4.0 -nodeNumber 1 -firstNode true -startRolling true'
ASM configuration upgraded in local node successfully.


2016/09/02 16:53:15 CLSRSC-469: Successfully set Oracle Clusterware and ASM to rolling migration mode
2016/09/02 16:53:15 CLSRSC-466: Starting shutdown of the current Oracle Grid Infrastructure stack
2016/09/02 16:53:49 CLSRSC-467: Shutdown of the current Oracle Grid Infrastructure stack has successfully completed.
OLR initialization - successful
2016/09/02 16:57:07 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'

CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
2016/09/02 17:01:43 CLSRSC-115: Start of resource 'ora.asm' failed

2016/09/02 17:01:43 CLSRSC-117: Failed to start Oracle Clusterware stack
2016/09/02 17:01:43 CLSRSC-247: Failed to start ASM

Died at /u01/app/12.1.0.2/grid/crs/install/crsupgrade.pm line 967.
The command '/u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib -I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl  -upgrade' execution failed


The environment on the failed node looked as follows #
[root@tnc1 trace]# ps -ef | grep ohasd
root      8333     1  0 16:57 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root      9554     1  0 16:58 ?        00:00:05 /u01/app/12.1.0.2/grid/bin/ohasd.bin exclusive
root     20043 18187  0 17:24 pts/2    00:00:00 grep ohasd

[root@tnc1 trace]# ps -elf | egrep "PID|d.bin|ohas|oraagent|orarootagent|cssdagent|cssdmonitor" | grep -v grep


F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root      8333     1  0  78   0 -  2704 pipe_w 16:57 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
4 S root      9554     1  0  75   0 - 81343 futex_ 16:58 ?        00:00:05 /u01/app/12.1.0.2/grid/bin/ohasd.bin exclusive
4 S oracle   11044     1  0  75   0 - 68065 futex_ 16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/oraagent.bin
0 S oracle   11058     1  0  75   0 - 36257 429496 16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/evmd.bin
0 S oracle   11086     1  0  75   0 - 33771 -      16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/mdnsd.bin
0 S oracle   11096 11058  0  75   0 - 50643 923257 16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/evmlogger.bin -o /u01/app/12.1.0.2/grid/log/[HOSTNAME]/evmd/evmlogger.info -l /u01/app/12.1.0.2/grid/log/[HOSTNAME]/evmd/evmlogger.log
0 S oracle   11113     1  0  75   0 - 39287 -      16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/gpnpd.bin
4 S root     11174     1  0 -40   - - 41241 futex_ 16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/cssdmonitor
0 S oracle   11176     1  0  75   0 - 42690 -      16:59 ?        00:00:01 /u01/app/12.1.0.2/grid/bin/gipcd.bin
4 S root     11194     1  0 -40   - - 41385 futex_ 16:59 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/cssdagent
4 S oracle   11223     1  0 -40   - - 56853 futex_ 16:59 ?        00:00:03 /u01/app/12.1.0.2/grid/bin/ocssd.bin
4 S root     11430     1  0  75   0 - 50513 futex_ 17:00 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/orarootagent.bin
4 S root     11443     1  0  78   0 - 40670 futex_ 17:00 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/octssd.bin reboot
4 S root     11469     1  0 -40   - - 57823 -      17:00 ?        00:00:05 /u01/app/12.1.0.2/grid/bin/osysmond.bin
4 S root     11500     1  0 -40   - - 72193 923257 17:00 ?        00:00:00 /u01/app/12.1.0.2/grid/bin/ologgerd -M -d /u01/app/12.1.0.2/grid/crf/db/tnc1


[root@tnc1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4534: Cannot communicate with Event Manager
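
The check above can be scripted so that a half-started stack is easy to spot. This is a minimal sketch: the function names are hypothetical, and the helpers only read `crsctl check crs` output on stdin, matching the "Cannot communicate" text shown above.

```shell
#!/bin/sh
# Hypothetical helpers: they read text on stdin and never call crsctl themselves.

# Count the daemons that "crsctl check crs" could not reach.
crs_offline_count() {
    # grep -c prints 0 and exits non-zero when nothing matches; keep the 0.
    grep -c 'Cannot communicate' || true
}

# Succeed only when no "Cannot communicate" line is present.
crs_all_online() {
    ! grep -q 'Cannot communicate'
}

# Example (run from the Grid home bin directory, as in the session above):
#   ./crsctl check crs | crs_offline_count
```

For the output shown above, the count would be 2 (Cluster Ready Services and Event Manager).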

[root@tnc1 bin]# cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/12.1.0.2/grid/cdata/tnc1.olr
crs_home=/u01/app/12.1.0.2/grid


Root cause #
/dev/shm (tmpfs) is used to store temporary files on the server.
When /dev/shm runs out of space, ora.asm cannot start and rootupgrade.sh fails on the server.
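
A quick way to see how much room is left on /dev/shm before running the root script is sketched below. The function names and the threshold argument are hypothetical; only standard `df` and `awk` are used.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in KB, on a mount point.
# df -P forces one line per filesystem; with -k, column 4 is "Available".
shm_avail_kb() {
    df -Pk "${1:-/dev/shm}" | awk 'NR==2 {print $4}'
}

# Hypothetical helper: succeed if mount point $1 has at least $2 KB free.
shm_has_room() {
    avail=$(shm_avail_kb "$1")
    [ "$avail" -ge "$2" ]
}

# Example: warn when /dev/shm has less than 1 GB free (threshold is an assumption).
#   shm_has_room /dev/shm 1048576 || echo "/dev/shm is low on space"
```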

Resolution #
Increase the size of /dev/shm to resolve this issue. Here a tmpfs entry with size=3G was added as the last line of /etc/fstab #

[root@tnc1 bin]# cat /etc/fstab
/dev/VolGroup00/LogVol00 /                ext3    defaults        1 1
LABEL=/boot             /boot                  ext3    defaults        1 2
tmpfs                   /dev/shm                    tmpfs   defaults        0 0
devpts                  /dev/pts                     devpts  gid=5,mode=620  0 0
sysfs                   /sys                             sysfs   defaults        0 0
proc                    /proc                           proc    defaults        0 0
/dev/VolGroup00/LogVol01 swap        swap    defaults        0 0
tmpfs  /dev/shm  tmpfs  defaults,size=3G  0 0
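
The fstab above now holds two /dev/shm lines, so it can help to list their options side by side. The helper below is a hypothetical sketch that only reads an fstab-style file. The commented `mount` line is the usual way to apply a new tmpfs size without a reboot, and it is assumed here that it is run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the options column of every /dev/shm entry
# in an fstab-style file, skipping comment lines.
shm_fstab_opts() {
    awk '$1 !~ /^#/ && $2 == "/dev/shm" {print $4}' "${1:-/etc/fstab}"
}

# Example:
#   shm_fstab_opts /etc/fstab
# To apply the new size immediately (as root), without rebooting:
#   mount -o remount,size=3G /dev/shm
#   df -h /dev/shm
```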

  
Downgrade the node from 12.1.0.2 back to 11.2.0.4 #

[root@tnc1 install]# ./rootcrs.sh -downgrade -force
Using configuration parameter file: /u01/app/12.1.0.2/grid/crs/install/crsconfig_params
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'tnc1'
CRS-2673: Attempting to stop 'ora.crf' on 'tnc1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'tnc1'
CRS-2673: Attempting to stop 'ora.evmd' on 'tnc1'
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'tnc1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'tnc1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'tnc1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'tnc1' succeeded
CRS-2677: Stop of 'ora.crf' on 'tnc1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'tnc1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'tnc1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'tnc1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'tnc1'
CRS-2677: Stop of 'ora.mdnsd' on 'tnc1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'tnc1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'tnc1'
CRS-2677: Stop of 'ora.gipcd' on 'tnc1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'tnc1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
2016/09/02 17:32:44 CLSRSC-4001: Installing Oracle Trace File Analyzer (TFA) Collector.
2016/09/02 17:32:44 CLSRSC-4002: Successfully installed Oracle Trace File Analyzer (TFA) Collector.
Successfully downgraded Oracle Clusterware stack on this node

[root@tnc1 install]# cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/11.2.0.4/grid/cdata/tnc1.olr
crs_home=/u01/app/11.2.0.4/grid
[root@tnc1 install]#  
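
Before rerunning the upgrade, it is worth confirming that olr.loc points back at the old home. A minimal sketch, assuming the key=value layout shown in the olr.loc output above; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the crs_home value from an olr.loc-style file.
crs_home_from_olr() {
    sed -n 's/^crs_home=//p' "${1:-/etc/oracle/olr.loc}"
}

# Example: confirm the node is back on the 11.2.0.4 home after the downgrade.
#   [ "$(crs_home_from_olr)" = "/u01/app/11.2.0.4/grid" ] && echo "downgrade confirmed"
```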

 ------
Next step # Rerun rootupgrade.sh from the 12.1.0.2 home.
Output as follows #

[root@tnc1 bin]# /u01/app/12.1.0.2/grid/rootupgrade.sh
Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/12.1.0.2/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/12.1.0.2/grid/crs/install/crsconfig_params
2016/09/02 17:43:25 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2016/09/02 17:43:25 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.
2016/09/02 17:43:31 CLSRSC-464: Starting retrieval of the cluster configuration data
2016/09/02 17:43:42 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.

2016/09/02 17:43:42 CLSRSC-363: User ignored prerequisites during installation
2016/09/02 17:44:04 CLSRSC-515: Starting OCR manual backup.
2016/09/02 17:44:08 CLSRSC-516: OCR manual backup successful.
2016/09/02 17:44:15 CLSRSC-468: Setting Oracle Clusterware and ASM to rolling migration mode


2016/09/02 17:44:15 CLSRSC-482: Running command: '/u01/app/12.1.0.2/grid/bin/asmca -silent -upgradeNodeASM -nonRolling false -oldCRSHome /u01/app/11.2.0.4/grid -oldCRSVersion 11.2.0.4.0 -nodeNumber 1 -firstNode true -startRolling true'

ASM configuration upgraded in local node successfully.
2016/09/02 17:44:21 CLSRSC-469: Successfully set Oracle Clusterware and ASM to rolling migration mode
2016/09/02 17:44:21 CLSRSC-466: Starting shutdown of the current Oracle Grid Infrastructure stack
2016/09/02 17:44:56 CLSRSC-467: Shutdown of the current Oracle Grid Infrastructure stack has successfully completed.
OLR initialization - successful
2016/09/02 17:48:16 CLSRSC-329: Replacing Clusterware entries in file '/etc/inittab'

CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
2016/09/02 17:52:03 CLSRSC-472: Attempting to export the OCR

2016/09/02 17:52:03 CLSRSC-482: Running command: 'ocrconfig -upgrade oracle oinstall'
2016/09/02 17:52:31 CLSRSC-473: Successfully exported the OCR

2016/09/02 17:52:38 CLSRSC-486:
 At this stage of upgrade, the OCR has changed.
 Any attempt to downgrade the cluster after this point will require a complete cluster outage to restore the OCR.

2016/09/02 17:52:38 CLSRSC-541:
 To downgrade the cluster:
 1. All nodes that have been upgraded must be downgraded.

2016/09/02 17:52:38 CLSRSC-542:
 2. Before downgrading the last node, the Grid Infrastructure stack on all other cluster nodes must be down.

2016/09/02 17:52:38 CLSRSC-543:
 3. The downgrade command must be run on the node tnc2 with the '-lastnode' option to restore global configuration data.

2016/09/02 17:53:06 CLSRSC-343: Successfully started Oracle Clusterware stack
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.

Successfully taken the backup of node specific configuration in OCR.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
2016/09/02 17:53:49 CLSRSC-474: Initiating upgrade of resource types

2016/09/02 17:54:17 CLSRSC-482: Running command: 'upgrade model  -s 11.2.0.4.0 -d 12.1.0.2.0 -p first'

2016/09/02 17:54:17 CLSRSC-475: Upgrade of resource types successfully initiated.
2016/09/02 17:54:25 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded 


--Nikhil Tatineni--
--Oracle 12c -- 

Reference #
http://www.golinuxhub.com/2012/09/how-to-fix-ora-00845-memorytarget-not.html
