From 11gR2, Voting files are stored on ASM disk. The place where Clusterware stores the Voting files is called the voting disk. Oracle Clusterware creates multiple Voting files during installation to avoid voting disk failures. If we install Oracle Clusterware with External redundancy, we get 1 OCR and 1 Voting file on the diskgroup. If we choose Normal redundancy during installation, Oracle Clusterware requires 3 disks for the diskgroup; it creates an original copy and a mirror copy of the OCR inside the diskgroup and 3 Voting files, i.e. 1 voting file on each disk in the diskgroup. If we choose High redundancy during installation, we get 3 copies of the OCR and 5 Voting files on 5 disks inside the diskgroup.

Clusterware maintains the voting disk to know the heartbeat of the nodes in a cluster. Whenever a node joins the cluster, the voting file is updated, and Clusterware starts communicating with that node, i.e. the CRS daemon on one node communicates with the CRS daemon on the other nodes. Then, on each node, Clusterware starts its services and resources. Technically, the CSSD process updates the Voting files on each disk inside the diskgroup; every second CSSD writes node availability information to the voting files. Each node has to access at least 2/3 of the voting files. If a node is unable to access 2/3 of the voting files, it is evicted from the cluster; OPROCD kills the node to protect data integrity.

From 11gR2, the OCR backs up the voting files, so if the voting files are corrupted we can recover them from the backup taken by the OCR. Since the Voting files are stored on ASM, we can find them out using the following query:
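As a minimal sketch (the diskgroup and disk names on your cluster will differ), the voting file location can be listed either from the Clusterware stack or from ASM:

[oracle@tnc1 ~]$ crsctl query css votedisk

SQL> select name, path, voting_file from v$asm_disk where voting_file = 'Y';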
Currently we have only 1 Voting file on the cluster. If this Voting file is corrupted, all nodes will be evicted from the cluster.
To stabilize the RAC environment, multiplex the Voting files in Oracle RAC.
To multiplex the Voting files, create a new diskgroup using ASMCA or SQL*Plus (as SYSASM) with NORMAL or HIGH redundancy (a sketch follows the diskgroup names below) and replace the Voting files from the old diskgroup to the new diskgroup:
OLDDISKGROUP: NEVOT
NEWDISKGROUP: GRID
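A minimal sketch of creating the new diskgroup from SQL*Plus as SYSASM; the disk paths and failgroup layout below are hypothetical placeholders:

SQL> create diskgroup GRID normal redundancy
       failgroup fg1 disk '/dev/oracleasm/disks/GRID1'
       failgroup fg2 disk '/dev/oracleasm/disks/GRID2'
       failgroup fg3 disk '/dev/oracleasm/disks/GRID3'
       attribute 'compatible.asm'='12.1';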
Make sure the new diskgroup is mounted on all instances:
SQL> select name,group_number,state from v$asm_diskgroup;
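If the diskgroup shows as DISMOUNTED on any instance, mount it there; a minimal sketch, run on the ASM instance where it is not yet mounted:

SQL> alter diskgroup GRID mount;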
--- Replace the Voting files from the old ASM diskgroup (External redundancy) to the new ASM diskgroup (High redundancy). Perform the operation as the root user from any node of the cluster.
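A minimal sketch of the replace operation, assuming the new diskgroup is named +GRID as above; verify afterwards with crsctl:

[root@tnc1 ~]# crsctl replace votedisk +GRID
[root@tnc1 ~]# crsctl query css votedisk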
The Voting files are now successfully replaced / multiplexed on the Oracle RAC cluster.
From 11gR2, OCR files (Oracle Cluster Registry) and voting files are stored on ASM. When we install Oracle Clusterware software and select Normal redundancy, we allocate at least 3 disks to store the OCR and Voting files on ASM. Oracle Clusterware creates a diskgroup using the 3 disks and places the clusterware files (OCR and Voting files) on that ASM diskgroup. With normal redundancy, Clusterware places the original OCR and the mirror of the OCR in the same diskgroup and places a voting file on each disk inside the diskgroup. The OCR stores all clusterware information, so if the diskgroup carrying the OCR crashes, we lose both the original and the mirror copy of the OCR. (We do take backups of the OCR using ocrconfig, but in that scenario we need to restore both OCR files and need downtime to recover the OCR from backup.) To stabilize the RAC environment, we add a mirror copy of the OCR to a newly created diskgroup.
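For reference, the automatic OCR backups can be listed and a manual backup taken as the root user; a minimal sketch:

[root@tnc1 ~]# ocrconfig -showbackup
[root@tnc1 ~]# ocrconfig -manualbackup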
Pre-reqs to add a mirror copy on another diskgroup:
a) Make sure the new diskgroup is mounted on all instances
b) Diskgroup ASM and database compatibility is 11.2 or greater

When we ignore the pre-reqs, we encounter the following error:

[root@tnc1 ~]# ocrconfig -add +ocnew
PROT-30: The Oracle Cluster Registry location to be added is not usable
PROC-50: The Oracle Cluster Registry location to be added is inaccessible on nodes tnc1.
failed
Before —

col COMPATIBILITY format a15
col DATABASE_COMPATIBILITY format a20
col NAME format a20
select group_number, name, compatibility, database_compatibility from v$asm_diskgroup;
GROUP_NUMBER NAME                 COMPATIBILITY   DATABASE_COMPATIBILITY
------------ -------------------- --------------- ----------------------
           4 OCNEW                12.1.0.0.0      10.1.0.0.0
           1 FRA                  12.1.0.0.0      10.1.0.0.0
           2 GRID                 12.1.0.0.0      10.1.0.0.0
           3 SDXH                 12.1.0.0.0      10.1.0.0.0

Changing the compatibility attributes of the new diskgroup "OCNEW":
SQL> alter diskgroup ocnew set attribute 'compatible.asm'='12.1';

Diskgroup altered.

SQL> alter diskgroup ocnew set attribute 'compatible.rdbms'='12.1';

Diskgroup altered.
After —
col COMPATIBILITY format a15
col DATABASE_COMPATIBILITY format a20
col NAME format a20
select group_number, name, compatibility, database_compatibility from v$asm_diskgroup;
Before — the original copy and the mirror copy are on the same diskgroup "+GRID":

[root@tnc1 ~]# grid_env
[root@tnc1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1716
         Available space (kbytes) :     407852
         ID                       :  143829273
         Device/File Name         :      +GRID
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

After meeting all pre-reqs, place a copy of the OCR on another diskgroup by adding a mirror copy of the OCR into the new diskgroup "OCNEW" as follows.
— Perform the operation as root:

[root@tnc1 ~]# ocrconfig -add +ocnew

CRS log as follows:

2016-02-25 00:57:02.743 [OCTSSD(5273)]CRS-2408: The clock on host tnc1 has been updated by the Cluster Time Synchronization Service to be synchronous with the mean cluster time.
2016-02-25 01:24:23.365 [CRSD(5837)]CRS-1007: The OCR/OCR mirror location was replaced by +OCNEW/tnc-cluster/OCRFILE/registry.255.904699441.

After —

[root@tnc1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1716
         Available space (kbytes) :     407852
         ID                       :  143829273
         Device/File Name         :      +GRID
                                    Device/File integrity check succeeded
         Device/File Name         :     +OCNEW
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
         Logical corruption check succeeded

Finally, the mirror copy is placed on a different diskgroup. Location of the OCR:
[oracle@tnc2 ~]$ cat /etc/oracle/ocr.loc
#Device/file getting replaced by device +OCNEW/tnc-cluster/OCRFILE/registry.255.904699441
This is one of the important steps to stabilize your RAC environment. We have to maintain at least 3 online redo log groups for each instance, with each group consisting of 2 members and each member on a different diskgroup.
[oracle@tnc1 ~]$ db
[oracle@tnc1 ~]$ sql
SQL*Plus: Release 12.1.0.2.0 Production on Wed Feb 24 00:09:06 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
script # Use the following query to get the size of the redo log groups #
SQL>
set echo off
set feedback off
set linesize 120
set pagesize 35
set trim on
set trims on
set lines 120
col group# format 999
col thread# format 999
col member format a50 wrap
col status format a10
col archived format a10
col fsize format 999 heading "Size (MB)"
select l.group#, l.thread#,
f.member,
l.archived,
l.status,
(bytes/1024/1024) fsize
from
v$log l, v$logfile f
where f.group# = l.group#
order by 1,2
/
output #
Currently on the RAC database we have 2 instances / threads, and each thread has only 2 redo log groups (2 members each, on different diskgroups). We are increasing the size of the redo log files and adding additional redo log groups for each instance / thread as follows:
SQL> alter database add logfile thread 1 group 5 ('+FRA/P1TNCD/ONLINELOG/redo05.log','+SDXH/P1TNCD/ONLINELOG/redo05.log') size 100m;
SQL> alter database add logfile thread 1 group 6 ('+FRA/P1TNCD/ONLINELOG/redo06.log','+SDXH/P1TNCD/ONLINELOG/redo06.log') size 100m;
SQL> alter database add logfile thread 1 group 7 ('+FRA/P1TNCD/ONLINELOG/redo07.log','+SDXH/P1TNCD/ONLINELOG/redo07.log') size 100m;
SQL> alter database add logfile thread 2 group 8 ('+FRA/P1TNCD/ONLINELOG/redo08.log','+SDXH/P1TNCD/ONLINELOG/redo08.log') size 100m;
SQL> alter database add logfile thread 2 group 9 ('+FRA/P1TNCD/ONLINELOG/redo09.log','+SDXH/P1TNCD/ONLINELOG/redo09.log') size 100m;
SQL> alter database add logfile thread 2 group 10 ('+FRA/P1TNCD/ONLINELOG/redo10.log','+SDXH/P1TNCD/ONLINELOG/redo10.log') size 100m;
Issue a global checkpoint so that the status of the old online redo log groups becomes "INACTIVE", then drop them from the database immediately.
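A minimal sketch of that step, assuming groups 1-4 are the old 50 MB groups shown in the output further below; switch logfiles until none of the old groups is CURRENT or ACTIVE before dropping them:

SQL> alter system switch logfile;
SQL> alter system checkpoint global;
SQL> select group#, thread#, status from v$log;
SQL> alter database drop logfile group 1;
SQL> alter database drop logfile group 2;
SQL> alter database drop logfile group 3;
SQL> alter database drop logfile group 4;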
Applying Rolling Patch "21948354" on RAC ORACLE_HOMEs
a) On Node 1: Stop CRS on Node 1, which fails over all services running on Node 1 to Node 2. Make sure the management database has failed over to Node 2.
   # crsctl stop crs
b) Update the OPatch version if it is less than 12.1.0.1.7 using patch 6880880 (select the 12.1.0.1.0 release).
c) Set the appropriate environment variables:
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/12.1.0.2/db_1
export PATH=$ORACLE_HOME/OPatch:$PATH:.
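A minimal sketch of step b); the zip file name is a placeholder for the 12.1.0.1.0 OPatch download from patch 6880880:

$ cd $ORACLE_HOME
$ unzip -o /tmp/p6880880_121010_Linux-x86-64.zip
$ $ORACLE_HOME/OPatch/opatch version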
d) Take a backup of the ORACLE_HOME
   $ cp -r db_1 db_1_backup
e) Check whether previous patches conflict with the new rolling PSU patch "21948354"
   $ cd 21948354
   $ opatch prereq CheckConflictAgainstOHWithDetail -ph ./
f) When the above check completes without any conflicts, go ahead and apply the rolling patch:
$ opatch apply -local
g) When patching completes on the ORACLE_HOME, go ahead and bring up CRS on Node 1 and make sure all services are running on Node 1.
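A minimal sketch of step g), run as root from the Grid Infrastructure home:

# crsctl start crs
# crsctl check crs
# crsctl stat res -t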
Follow the same steps and Apply patch on Node 2
After applying the patch on both ORACLE_HOMEs / nodes of the cluster, make sure all resources are up and running.
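To confirm the patch is registered in the inventory on each node, a quick check against each ORACLE_HOME could look like this:

$ opatch lsinventory | grep 21948354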
When we set "fast_start_mttr_target" parameter MTTR Advisor and Redolog file size advisor is also enabled on database Use following Query to check size of the Redolog group & location on database set linesize 300 column REDOLOG_FILE_NAME format a50 SELECT a.GROUP#, a.THREAD#, a.SEQUENCE#, a.ARCHIVED, a.STATUS, b.MEMBER AS REDOLOG_FILE_NAME, (a.BYTES/1024/1024) AS SIZE_MB FROM v$log a JOIN v$logfile b ON a.Group#=b.Group# ORDER BY a.GROUP# ASC; GROUP# THREAD# SEQUENCE# ARC STATUS REDOLOG_FILE_NAME SIZE_MB ---------- ---------- ---------- --- ---------------- --------------------------------- 1 1 29 NO INACTIVE +FRA/P1TNCD/ONLINELOG/group_1.257.903695363 50 1 1 29 NO INACTIVE +SDXH/P1TNCD/ONLINELOG/group_1.262.903695363 50 2 1 30 NO CURRENT +FRA/P1TNCD/ONLINELOG/group_2.258.903695363 50 2 1 30 NO CURRENT +SDXH/P1TNCD/ONLINELOG/group_2.263.903695363 50 3 2 23 NO INACTIVE +SDXH/P1TNCD/ONLINELOG/group_3.269.903695731 50 3 2 23 NO INACTIVE +FRA/P1TNCD/ONLINELOG/group_3.259.903695731 50 4 2 24 NO CURRENT +SDXH/P1TNCD/ONLINELOG/group_4.270.903695731 50 4 2 24 NO CURRENT +FRA/P1TNCD/ONLINELOG/group_4.260.903695731 50
When you look at the column "v$instance_recovery.OPTIMAL_LOGFILE_SIZE", it gives the optimal size of the redo logs for the current workload on the database.

desc v$instance_recovery;

SQL> select OPTIMAL_LOGFILE_SIZE, TARGET_MTTR from v$instance_recovery;

OPTIMAL_LOGFILE_SIZE TARGET_MTTR
-------------------- -----------
                 236          13

For the current database it is suggested that the optimal size of the redo logs is 236 MB. The current size of the redo logs on the database is 50 MB; drop and recreate them at 236 MB, as recommended by the redo log size advisor.

#### The following query shows the hourly log switch history and the amount of redo generated per hour:

select Start_Date, Start_Time, Num_Logs,
       Round(Num_Logs * (Vl.Bytes / (1024 * 1024)), 2) AS Mbytes,
       Vdb.NAME AS Dbname
FROM (SELECT To_Char(Vlh.First_Time, 'YYYY-MM-DD') AS Start_Date,
             To_Char(Vlh.First_Time, 'HH24') || ':00' AS Start_Time,
             COUNT(Vlh.Thread#) Num_Logs
      FROM V$log_History Vlh
      GROUP BY To_Char(Vlh.First_Time, 'YYYY-MM-DD'),
               To_Char(Vlh.First_Time, 'HH24') || ':00') Log_Hist,
     V$log Vl, V$database Vdb
WHERE Vl.Group# = 1
ORDER BY Log_Hist.Start_Date, Log_Hist.Start_Time;
We set "fast_start_mttr_target" to limit the instance recovery time.
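A minimal sketch of setting it; the 60-second target below is only an assumed value, pick one that matches your recovery SLA:

SQL> alter system set fast_start_mttr_target=60 scope=both sid='*';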
"log_checkpoint_timeout" should be greater than "fast_start_mttr_target" "log_checkpoint_timeout" is the maximum amount of time that dirty buffer can stay in memory Don't forget that longer and longer checkpoint time takes longer time to perform instance recover and ViceVersa Faster checkpoints gives overhead to database performance. While tuning, make set Optimal time for " log_checkpoint_timeout" on database But Oracle Recommend us to set "log_checkpoint_interval" than parameter " log_checkpoint_timeout"