Thursday, February 27, 2014

Removing a Node from the Cluster


A series of steps is used to remove a node.

  • You cannot simply remove the node from the cluster.
  • Oracle Central Inventory on each node has information about all nodes.
  • The Oracle Cluster Registry (OCR) contains information about all nodes.


On each node in the cluster, the Oracle Central Inventory contains information about all of the nodes in the cluster. The binary OCR and the voting disk also contain information about every node. Therefore, you cannot simply disconnect a node from the cluster; a specific sequence of steps must be followed to remove it properly.


Deleting a node from the cluster is a multiple-step process. Some commands are run on the node to be deleted and other commands are run on an existing node of the cluster. Some commands are run by the root user and other commands are run by the Oracle Clusterware software owner’s account.

When passing arguments to a command, sometimes the existing node is passed, sometimes the node to be removed is passed, and at other times a complete list of remaining nodes is passed as an argument. This requires special attention to detail to avoid making mistakes during the process.


Deleting a Node from the Cluster

In this example, we have a three-node Oracle RAC cluster and we want to delete node host03.



1. Verify the location of the Oracle Clusterware home.
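
For example, on Linux the Grid home location can usually be read from the local registry location file (a minimal check; the path shown assumes the /u01/app/11.2.0/grid home used later in this post):

[grid@host01]$ cat /etc/oracle/olr.loc
olrconfig_loc=/u01/app/11.2.0/grid/cdata/host01.olr
crs_home=/u01/app/11.2.0/grid
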
2. From a node that will remain, run the following command as root to expire the Cluster Synchronization Service (CSS) lease on the node that you are deleting:

[root@host01]# crsctl unpin css -n host03

The crsctl unpin command will fail if Cluster Synchronization Services (CSS) is not running on the node being deleted. Run the olsnodes -s -t command to show whether the node is active or pinned. If the node is not pinned, go to step 3.
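
A sample of the expected output (exact column text may vary slightly by version):

[root@host01]# olsnodes -s -t
host01  Active  Unpinned
host02  Active  Unpinned
host03  Active  Unpinned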

3. Run the rootcrs.pl script as root from the Grid_home/crs/install directory on each node to be deleted:

[root@host03]# ./rootcrs.pl -delete -force

Note: This procedure assumes that the node to be removed can be accessed. If you cannot execute commands on the node to be removed, you must manually stop and remove the VIP resource using the following commands as root from any node that you are not deleting:

# srvctl stop vip -i vip_name -f

# srvctl remove vip -i vip_name -f


where vip_name is the Virtual IP (VIP) for the node to be deleted.
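
If you are not sure of the VIP name, it can be looked up from a surviving node before stopping it (a quick check; host03 is the node being removed in this example):

# srvctl config vip -n host03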

4. From a node that will remain, delete the node from the cluster with the following command run as root:

[root@host01]# crsctl delete node -n host03

5. On the node to be deleted, as the user that installed Oracle Clusterware, run the following command from the Grid_home/oui/bin directory:

[grid@host03]$ ./runInstaller -updateNodeList ORACLE_HOME=Grid_home "CLUSTER_NODES={host03}" CRS=TRUE -local

6. On the node that you are deleting, as the user that installed Oracle Clusterware, detach or deinstall the Oracle Clusterware home, depending on whether the home is shared:

A. If you have a shared home:

[grid@host03]$ ./runInstaller -detachHome ORACLE_HOME=/u01/app/11.2.0/grid
B. For a nonshared home, deinstall the Oracle Clusterware home:

[grid@host03]$ ./deinstall -local

7. On any remaining node, as the Grid software owner, update the node list:

[grid@host01]$ cd Grid_home/oui/bin

[grid@host01]$ ./runInstaller -updateNodeList \
ORACLE_HOME=/u01/app/11.2.0/grid \
"CLUSTER_NODES={host01,host02}" CRS=TRUE


8. On any remaining node, verify that the specified nodes have been deleted from the cluster.

[grid@host01]$ cluvfy stage -post nodedel -n host03 [-verbose]
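
Typical output when the removal succeeded (abbreviated; message wording may differ slightly between versions):

[grid@host01]$ cluvfy stage -post nodedel -n host03
Performing post-checks for node removal
...
Post-check for node removal was successful.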


Deleting a Node from a Cluster (GNS in Use)

If your cluster uses GNS, only the following steps from the previous procedure are required:

3. Run the rootcrs.pl script as root from the Grid_home/crs/install directory on each node to be deleted.
4. From a node that will remain, delete the node from the cluster.
7. On any remaining node, as the Grid software owner, update the node list.
8. On any remaining node, verify that the node has been deleted from the cluster.



Deleting a Node from the Cluster
------------------------------------------------------------------
(Node 1) # /u01/app/11.2.0/grid/bin/crsctl unpin css -n rac2
(Node 1) # /u01/app/11.2.0/grid/bin/olsnodes -s -t
(Node 2) # cd /u01/app/11.2.0/grid/crs/install
(Node 2) # ./rootcrs.pl -delete -force
(Node 1) # /u01/app/11.2.0/grid/bin/crsctl delete node -n rac2

(Node 2) $ cd /u01/app/11.2.0/grid/oui/bin
(Node 2) $ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid "CLUSTER_NODES={rac2}" CRS=TRUE -local
(Node 2) $ ./runInstaller -detachHome ORACLE_HOME=/u01/app/11.2.0/grid


(Node 1) $ cd /u01/app/11.2.0/grid/oui/bin
(Node 1) $ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid "CLUSTER_NODES={rac1}" CRS=TRUE
(Node 1) $ cluvfy stage -post nodedel -n rac2 [-verbose]

Adding a Node to the Cluster

The following method adds only the clusterware to the new node; it does not add an Oracle database instance.
The addNode.sh shell script is used to add nodes to an existing Oracle Clusterware environment. It:
  • Runs without a graphical interface
  • Does not perform the prerequisite operating system tasks 
You can use a variety of methods to add and delete nodes in an Oracle Clusterware environment:
  • Silent cloning procedures: Copy images of an Oracle Clusterware installation to other nodes to create new clusters that have identical hardware by using the clone.pl script.
  • Enterprise Manager (EM) Grid Control: Provides a GUI interface and automated wizards to the cloning procedures
  • addNode.sh: Invokes a subset of OUI functionality



Special attention must be given to the procedures because some steps are performed on the existing nodes, whereas other steps are performed on the nodes that are being added or removed.

Prerequisite Steps for Running addNode.sh

  • Make physical connections: networking, storage, and other required hardware.
  • Install the operating system.
  • Perform the Oracle Clusterware installation prerequisite tasks:
    • Check system requirements.
    • Check network requirements.
    • Install the required operating system packages.
    • Set kernel parameters.
    • Create groups and users.
    • Create the required directories.
    • Configure the installation owner’s shell limits.
    • Configure Secure Shell (SSH) and enable user equivalency.
  • Verify the installation with Cluster Verify utility (cluvfy) from existing nodes.
    • Perform a post-hardware and operating system check.
      • [grid@host01]$ cluvfy stage -post hwos -n host03
    • Perform a detailed properties comparison of one existing reference node to the new node.
      • [grid@host01]$ cluvfy comp peer -refnode host01 -n host03 -orainv oinstall -osdba asmdba -verbose
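
A quick manual sanity check of SSH user equivalency from an existing node can also save time before running addNode.sh (the commands should return without prompting for a password; host names are the ones used in this example):

[grid@host01]$ ssh host03 date
[grid@host01]$ ssh host03 hostname
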
Adding a Node with addNode.sh
  • Ensure that Oracle Clusterware is successfully installed on at least one node.
  • Verify the integrity of the cluster and the node to be added (host03) with:
    • [grid@host01]$ cluvfy stage -pre nodeadd -n host03
  • Run addNode.sh to add host03 to an existing cluster.
    • Without GNS:
      • [grid@host01]$ cd Grid_home/oui/bin
[grid@host01]$ ./addNode.sh -silent "CLUSTER_NEW_NODES={host03}" \
"CLUSTER_NEW_VIRTUAL_HOSTNAMES={host03-vip}"

    • With GNS:
      • $ ./addNode.sh -silent "CLUSTER_NEW_NODES={host03}"
  • Perform integrity checks on the cluster.
    • [grid@host01]$ cluvfy stage -post nodeadd -n host03 -verbose
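
Note that when addNode.sh completes, it prompts you to run the root scripts on the new node as root before you perform the post-nodeadd check above. A typical sequence (the oraInventory path is an assumption; use the paths printed by the installer):

[root@host03]# /u01/app/oraInventory/orainstRoot.sh
[root@host03]# /u01/app/11.2.0/grid/root.sh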


Practice
-------------------------------


Adding a node to the cluster
-----------------------------

$ cluvfy stage -post hwos -n rac2

$ cluvfy comp peer -refnode rac1 -n rac2 -orainv oinstall -osdba asmdba -verbose

Adding a Node with addNode.sh
---------------------------------
$ cluvfy stage -pre nodeadd -n rac2
$ cd /u01/app/11.2.0/grid/oui/bin
$ ./addNode.sh -silent "CLUSTER_NEW_NODES={rac2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={rac2-vip}"
$ cluvfy stage -post nodeadd -n rac2 -verbose





Wednesday, February 26, 2014

Administering Oracle Clusterware


Backing Up and Recovering the Voting Disk

  • In Oracle Clusterware 11g Release 2, voting disk data is backed up automatically in the OCR as part of any configuration change.
  • Voting disk data is automatically restored to any added voting disks.
  • To add or remove voting disks on non–Automatic Storage Management (ASM) storage, use the following commands:
    # crsctl delete css votedisk path_to_voting_disk
    # crsctl add css votedisk path_to_voting_disk

Note: You can migrate voting disks from non-ASM storage options to ASM without taking down the cluster. To use an ASM disk group to manage the voting disks, you must set the COMPATIBLE.ASM attribute to 11.2.0.0.
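
For example, to move the voting disks into an ASM disk group and confirm the result (the +DATA disk group name matches the one shown later in this post):

# crsctl replace votedisk +DATA
$ crsctl query css votedisk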

Checking the Integrity of Oracle Clusterware Configuration Files

Use the cluvfy utility or the ocrcheck command to check the integrity of the OCR.

$ cluvfy comp ocr -n all -verbose

$ ocrcheck 


Related Link:

Managing Oracle Cluster Registry and Voting Disks

Tuesday, February 25, 2014

Determining the Location of Oracle Clusterware Configuration Files

The two primary configuration file types for Oracle Clusterware are the voting disk and the Oracle Cluster Registry (OCR).

The location of the OCR file can be determined by using the cat /etc/oracle/ocr.loc command. 

To determine the location of the voting disk:
$ crsctl query css votedisk
##  STATE  File Universal Id                File Name  Disk group
--  -----  -----------------                ---------- ----------
 1. ONLINE 8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]
 2. ONLINE 99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]
 3. ONLINE 0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]
Located 3 voting disk(s).


To determine the location of the OCR:
$ ocrcheck -config
Oracle Cluster Registry configuration is :
         Device/File Name         :      +DATA

Controlling Oracle High Availability Services

The crsctl utility is used to invoke certain OHASD functions.
  • To stop Oracle High Availability Services:
    • Stop the Clusterware stack:
      • # crsctl stop cluster
    • Stop Oracle High Availability Services on the local server:
      • # crsctl stop crs

  • To display Oracle High Availability Services automatic startup configuration:
    • # crsctl config crs
If you intend to stop Oracle Clusterware on all or a list of nodes, then use the crsctl stop cluster command, because it prevents certain resources from being relocated to other servers in the cluster before the Oracle Clusterware stack is stopped on a particular server. If you must stop the Oracle High Availability Services on one or more nodes, then wait until the crsctl stop cluster command completes and then run the crsctl stop crs command on any particular nodes, as necessary.

Use the crsctl config crs command to display Oracle High Availability Services automatic startup configuration. 
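
Typical output (the exact message text may vary by version):

# crsctl config crs
CRS-4622: Oracle High Availability Services autostart is enabled.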


To determine the overall health on a specific node:

$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

To check the viability of Cluster Synchronization Services (CSS) across nodes:

$ crsctl check cluster -all
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Controlling Oracle Clusterware


  • To start or stop Oracle Clusterware on the local node (use -all to act on every node, or -n node_name for specific nodes):
    • # crsctl start cluster
    • # crsctl stop cluster

  • To enable or disable Oracle Clusterware on a specific node:
    • # crsctl enable crs
    • # crsctl disable crs

Managing Clusterware with Enterprise Manager


Enterprise Manager Database Control provides facilities to manage Oracle Clusterware. This includes the ability to register and manage resources.

This example provides a typical illustration of the management interface. It shows how resources can be dynamically relocated from one node to another within your cluster. In this case, my_resource is relocated from host02 to host01.
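
The same relocation can also be performed from the command line with crsctl (my_resource is the example resource name from the text):

# crsctl relocate resource my_resource -n host01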

Managing Oracle Clusterware

  • Command-line utilities
    • crsctl manages clusterware-related operations:
      • Starting and stopping Oracle Clusterware
      • Enabling and disabling Oracle Clusterware daemons
      • Registering cluster resources
    • srvctl manages Oracle resource–related operations:
      • Starting and stopping database instances and services
    • oifcfg can be used to define and administer network interfaces.
  • Enterprise Manager
    • Browser-based graphical user interface
    • Enterprise Manager cluster management available in:
      • Database control – within the cluster
      • Grid control – through a centralized management server
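
A few representative commands for each command-line interface (the database name orcl is a hypothetical example):

$ crsctl stat res -t
$ srvctl status database -d orcl
$ oifcfg getif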

Oracle Automatic Storage Management - ASM


  • ASM is a volume manager and file system.
  • ASM operates efficiently in both clustered and nonclustered environments.
  • ASM is installed in the Grid Infrastructure home, separate from the Oracle Database home.

ASM Key Features and Benefits
  • Stripes files rather than logical volumes
  • Provides redundancy on a file basis
  • Enables online disk reconfiguration and dynamic  rebalancing
  • Significantly reduces the time needed to resynchronize after a transient failure by tracking changes while the disk is offline
  • Provides adjustable rebalancing speed
  • Is cluster-aware
  • Supports reading from mirrored copy instead of primary copy for extended clusters
  • Is automatically installed as part of the Grid Infrastructure
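
With the environment set to the Grid Infrastructure home and the ASM instance, disk groups can be inspected from the command line, for example (a minimal sketch; +DATA is the disk group shown elsewhere in this post):

$ asmcmd lsdg
$ asmcmd ls +DATA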

GPnP Architecture Overview


GPnP Service

  • The GPnP service is collectively provided by all the GPnP agents. 
  • It is a distributed method of replicating profiles. 
  • The service is instantiated on each node in the domain as a GPnP agent. 
  • The service is peer-to-peer; there is no master process. This allows high availability because any GPnP agent can crash and new nodes will still be serviced. 
  • GPnP requires standard IP multicast protocol (provided by mDNS), to locate peer services. Using multicast discovery, GPnP locates peers without configuration. This is how a GPnP agent on a new node locates another agent that may have a profile it should use.
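
The local GPnP profile itself can be inspected with the gpnptool utility in the Grid home (a quick check, assuming the grid user's environment points at the Grid home):

$ gpnptool get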

Name Resolution

A name defined within a GPnP domain is resolvable in the following cases:

  • Hosts inside the GPnP domain use normal DNS to resolve the names of hosts outside of the GPnP domain. They contact the regular DNS service and proceed. They may get the address of the DNS server by global configuration or by having been told by DHCP.
  • Within the GPnP domain, host names are resolved using mDNS. This requires an mDNS responder on each node that knows the names and addresses used by this node, and operating system client library support for name resolution using this multicast protocol. Given a name, a client executes gethostbyname, resulting in an mDNS query. If the name exists, the responder on the node that owns the name will respond with the IP address.

The client software may cache the resolution for the given time-to-live value.

  • Machines outside the GPnP domain cannot resolve names in the GPnP domain by using multicast. To resolve these names, they use their regular DNS. The provisioning authority arranges the global DNS to delegate a subdomain (zone) to a known address that is in the GPnP domain. GPnP creates a service called GNS to resolve the GPnP names on that fixed address.
The node on which the GNS server is running listens for DNS requests. On receipt, it translates and forwards them to mDNS, collects the responses, translates them, and sends them back to the outside client. GNS is “virtual” because it is stateless. Any node in the multicast domain may host the server. The only GNS configuration is global:
    • The address on which to listen on the standard DNS port 53
    • The name(s) of the domains to be serviced
There may be as many GNS entities as needed for availability reasons. Oracle-provided GNS may use CRS to ensure availability of a single GNS provider.
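
The delegation can be observed from a host outside the GPnP domain with an ordinary DNS lookup of a name in the delegated subdomain (all names here are hypothetical):

$ nslookup cluster01-scan.cluster01.example.com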

SCAN and Local Listeners

When a client submits a connection request, the SCAN listener listening on a SCAN IP address and the SCAN port is contacted on the client’s behalf.

Because all services on the cluster are registered with the SCAN listener, the SCAN listener replies with the address of the local listener on the least-loaded node where the service is currently being offered. 

Finally, the client establishes a connection to the service through the listener on the node where service is offered. All these actions take place transparently to the client without any explicit configuration required in the client.

During installation, listeners are created on nodes for the SCAN IP addresses. Oracle Net Services routes application requests to the least loaded instance providing the service. 

Because the SCAN addresses resolve to the cluster, rather than to a node address in the cluster, nodes can be added to or removed from the cluster without affecting the SCAN address configuration.
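
The SCAN and SCAN listener configuration created during installation can be displayed with srvctl:

$ srvctl config scan
$ srvctl config scan_listener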

How GPnP Works: Cluster Node Startup

  • IP addresses are negotiated for public interfaces using DHCP:
    • VIPs
    • SCAN VIPs
  • A GPnP agent is started from the node’s Clusterware home.
  • The GPnP agent either gets its profile locally or from one of the peer GPnP agents that responds.
  • Shared storage is configured to match profile requirements.
  • Service startup is specified in the profile, which includes:
    • Grid Naming Service for external name resolution
    • Single-client access name (SCAN) listener 

How GPnP Works: Client Database Connections



In a GPnP environment, the database client no longer has to use the TNS address to contact the listener on a target node. Instead, it can use the EZConnect method to connect to the database. 

When resolving the address listed in the connect string, the DNS forwards the resolution request to the GNS, which returns the SCAN VIP address for the chosen SCAN listener; the connect string also names the desired database service. In EZConnect syntax, this would look like:

scan-name.cluster-name.company.com/ServiceName, where the service name might be the database name. The GNS will respond to the DNS server with the IP address matching the name given; this address is then used by the client to contact the SCAN listener. The SCAN listener uses its connection load balancing system to pick an appropriate listener, whose name it returns to the client in an Oracle Net Redirect message. The client reconnects to the selected listener, resolving the name through a call to the GNS.
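
A minimal client connection using this EZConnect form might look like the following (the service name orcl and the domain names are hypothetical):

$ sqlplus system@scan-name.cluster-name.company.com/orcl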

The SCAN listeners must be known to all the database listener nodes and clients. The database instance nodes cross-register only with known SCAN listeners, also sending them per-service connection metrics. The SCAN known to the database servers may come from the GPnP profile data or be stored in the OCR.


Controlling Oracle Clusterware

The crsctl utility is used to invoke certain OHASD functions.

To start or stop Oracle Clusterware on all nodes:
# crsctl start cluster -all
# crsctl stop cluster -all

To enable or disable Oracle Clusterware for automatic startup on a specific node:
# crsctl enable crs
# crsctl disable crs

To check the status of the Oracle Clusterware stack on the local node:

# crsctl check cluster

Verifying the Status of Oracle Clusterware

$ crsctl check cluster -all
***********************************************************
host01:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
***********************************************************
host02:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
***********************************************************
host03:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
***********************************************************

Viewing the High Availability Services Stack
$ crsctl stat res -init -t
---------------------------------------------------------------
NAME           TARGET  STATE        SERVER        STATE_DETAILS      
---------------------------------------------------------------
Cluster Resources
---------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       host01        Started            
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       host01                                      
ora.crf
      1        ONLINE  ONLINE       host01                                      
ora.crsd
      1        ONLINE  ONLINE       host01                                      
ora.cssd
      1        ONLINE  ONLINE       host01                                      
ora.cssdmonitor
      1        ONLINE  ONLINE       host01                                      
ora.ctssd
      1        ONLINE  ONLINE       host01        OBSERVER          
ora.evmd
      1        ONLINE  ONLINE       host01                                      
...

Oracle Clusterware Initialization



During the installation of Oracle Clusterware, the init.ohasd startup script is copied to /etc/init.d. The wrapper script is responsible for setting up environment variables and then starting the Oracle Clusterware daemons and processes.

The Oracle High Availability Services daemon (ohasd) is responsible for starting in proper order, monitoring, and restarting other local Oracle daemons, including the crsd daemon, which manages clusterwide resources. When init starts ohasd on Clusterware startup, ohasd starts orarootagent, cssdagent, and oraagent. Some of the high availability daemons will be running under the root user with real-time priority, and others will be running under the Clusterware owner with user-mode priorities after they are started. When a command is used to stop Oracle Clusterware, the daemons will be stopped, but the ohasd process will remain running.
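
On Linux distributions that use /etc/inittab, the respawn entry added by the installation typically looks like this (exact runlevels may differ):

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null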




When a cluster node boots, or when Clusterware is started on a node that is already running, the init process starts ohasd. The ohasd process then initiates the startup of the processes in the lower, Oracle High Availability Services (OHASD) stack.
  • The cssdagent process is started, which in turn, starts cssd. The cssd process discovers the voting disk either in ASM or on shared storage, and then joins the cluster. The cssdagent process monitors the cluster and provides I/O fencing. This service formerly was provided by Oracle Process Monitor Daemon (oprocd). A cssdagent failure may result in Oracle Clusterware restarting the node. 
  • The orarootagent is started. This process is a specialized oraagent process that helps crsd start and manage resources owned by root, such as the network and the grid virtual IP address.

  • The oraagent process is started. It is responsible for starting processes that do not need to be run as root. 
  • The oraagent process extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1). 
  • The cssdmonitor is started and is responsible for monitoring the cssd daemon.







Oracle Local Registry

  • The Oracle Local Registry (OLR) is a registry similar to OCR and is located on each node in a cluster, but contains information specific to each node. 
  • It contains manageability information about Oracle Clusterware, including dependencies between various services. 
  • Oracle High Availability Services uses this information. 
  • OLR is located on local storage on each node in a cluster. 
  • Its default location is in the path Grid_home/cdata/host_name.olr, where Grid_home is the Oracle Grid Infrastructure home, and host_name is the host name of the node. 

To check the OLR, execute the ocrcheck -local command on the desired node.

$ ocrcheck -local

To view the contents of the OLR, execute the ocrdump -local command, redirecting the output to stdout:

$ ocrdump -local -stdout

# ocrcheck -local
Status of Oracle Local Registry is as follows :
   Version                  :          3
   Total space (kbytes)     :     262120
   Used space (kbytes)      :       2644
   Available space (kbytes) :     259476
   ID                       :  250248496
   Device/File Name         : /u01/app/11.2.0/grid/cdata/host01.olr
         Device/File integrity check succeeded
         Local registry integrity check succeeded
         Logical corruption check succeeded

CSS Voting Disk Function


  • CSS is the service that determines which nodes in the cluster are available and provides cluster group membership and simple locking services to other processes. 
  • CSS typically determines node availability via communication through a dedicated private network, with a voting disk used as a secondary communication mechanism. This is done by sending heartbeat messages through both the network and the voting disk.
  • The voting disk is a file on a clustered file system that is accessible to all nodes in the cluster. 
  • Its primary purpose is to help in situations where the private network communication fails. The voting disk is then used to communicate the node state information used to determine which nodes go offline. 
  • Without the voting disk, it can be difficult for an isolated node to determine whether it is experiencing a network failure or whether the other nodes are no longer available. 
  • It would then be possible for the cluster to enter a state where multiple subclusters of nodes would have unsynchronized access to the same database files. For example, consider what happens when Node3 can no longer send heartbeats to the other members of the cluster. When the others can no longer see Node3’s heartbeats, they decide to evict that node by using the voting disk. When Node3 reads the removal message or “kill block,” it generally reboots itself to ensure that all outstanding write I/Os are lost. 
  • Oracle Clusterware supports up to 15 redundant voting disks.
  • Note: The voting disk or file is usually known as quorum disk in vendor clusterware.

Oracle Cluster Registry (OCR)






  • Cluster configuration information is maintained in the OCR. 
  • The OCR relies on distributed shared cache architecture for optimizing queries, and clusterwide atomic updates against the cluster registry. 
  • Each node in the cluster maintains an in-memory copy of OCR, along with the CRSD that accesses its OCR cache. Only one of the CRSD processes actually reads from and writes to the OCR file on shared storage. This process is responsible for refreshing its own local cache, as well as the OCR cache on other nodes in the cluster. 
  • For queries against the cluster registry, the OCR clients communicate directly with the local CRS daemon (CRSD) process on the node from which they originate. When clients need to update the OCR, they communicate through their local CRSD process to the CRSD process that is performing input/output (I/O) for writing to the registry on disk. 
  • The main OCR client applications are OUI, SRVCTL, Enterprise Manager (EM), the Database Configuration Assistant (DBCA), the Database Upgrade Assistant (DBUA), Network Configuration Assistant (NETCA), and the ASM Configuration Assistant (ASMCA). 
  • The installation process for Oracle Clusterware gives you the option of automatically mirroring OCR. This creates a second OCR file, which is called the OCR mirror file, to duplicate the original OCR file, which is called the primary OCR file. Although it is recommended to mirror your OCR, you are not forced to do it during installation. 

The Oracle Grid Infrastructure installation defines three locations for the OCR and supports up to five. New installations to raw devices are no longer supported.
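
For example, an additional OCR location can be added, and the configuration verified, as root from one node (the +DATA2 disk group name here is hypothetical):

# ocrconfig -add +DATA2
# ocrcheck -config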

Monday, February 24, 2014

Oracle Clusterware

Oracle Clusterware is:

  • A key part of Oracle Grid Infrastructure 
  • Integrated with Oracle Automatic Storage Management (ASM)
  • The basis for ASM Cluster File System (ACFS)
  • A foundation for Oracle Real Application Clusters (RAC)
  • A generalized cluster infrastructure for all kinds of applications


Oracle Clusterware is a key part of Oracle Grid Infrastructure, which also includes Automatic Storage Management (ASM) and the ASM Cluster File System (ACFS).

In Release 11.2, Oracle Clusterware can use ASM for all the shared files required by the cluster. Oracle Clusterware is also an enabler for the ASM Cluster File System, a generalized cluster file system that can be used for most file-based data such as documents, spreadsheets, and reports.

The combination of Oracle Clusterware, ASM, and ACFS provides administrators with a unified cluster solution that is not only the foundation for the Oracle Real Application Clusters (RAC) database, but can also be applied to all kinds of other applications.

Note: Grid Infrastructure is the collective term that encompasses Oracle Clusterware, ASM, and ACFS. These components are so tightly integrated that they are often collectively referred to as Oracle Grid Infrastructure.


What Is Clusterware?

Clusterware is a term used to describe software that provides interfaces and services that enable and support a cluster.

Different cluster architectures require clusterware that delivers different services. For example, in a simple failover cluster, the clusterware may monitor the availability of applications and perform a failover operation if a cluster node becomes unavailable. In a load balancing cluster, different services are required to support workload concurrency and coordination.

Typically, clusterware includes capabilities that:

  • Allow the cluster to be managed as a single entity (not including OS requirements), if desired
  • Protect the integrity of the cluster so that data is protected and the cluster continues to function even if communication with a cluster node is severed
  • Maintain a registry of resources so that their location is known across the cluster and so that dependencies between resources are maintained
  • Deal with changes to the cluster such as node additions, removals, or failures
  • Provide a common view of resources such as network addresses and files in a file system

What Is a Cluster?



A cluster consists of two or more independent, but interconnected, servers. Several hardware vendors have
provided cluster capability over the years to meet a variety of needs. Some clusters were intended only to provide high availability by allowing work to be transferred to a secondary node if the active node fails. Others were designed to provide scalability by allowing user connections or work to be distributed across the nodes.

Another common feature of a cluster is that it should appear to an application as if it were a single server. Similarly, management of several servers should be as similar to the management of a single server as possible. The cluster management software provides this transparency.

For the nodes to act as if they were a single server, files must be stored in such a way that they can be found by the specific node that needs them. There are several different cluster topologies that address the data access issue, each dependent on the primary goals of the cluster designer.

The interconnect is a physical network used as a means of communication between each node of the cluster. In short, a cluster is a group of independent servers that cooperate as a single system.

Note: The clusters you are going to manipulate in this course all have the same operating system. This is a requirement for RAC clusters.