Friday, November 2, 2012

Clusterware Networking
When designing your cluster architecture you will need to consider the physical networking the cluster will require: the network adapters, high availability architectures, the use of jumbo frames and the assignment of IP addresses. Let’s look at each of these topics in more detail.

Network Adapters

Each server to be used in an Oracle Clusterware configuration requires a minimum of two separate network adapters:
  • The public network adapter
  • The private cluster interconnect
Public Network Adapter
·         The first is the public network adapter.
·         The public network adapter must support TCP/IP and is generally connected to the local network via a switch.
·         Administrators and users connect to the database via the public network adapter.
·         Many servers have more than one public network adapter, and these are often “bonded”. Bonding combines several physical network adapters into one logical network adapter.
·         Bonding provides more throughput, and thus better performance, as well as redundancy should one adapter fail. Performance on the public network is quite important for the overall performance of the database.
·         High-performing networks make for faster file transfers (for example backups or export/import) and faster transfer of large amounts of application data. A minimal bonding configuration sketch follows this list.
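As a rough illustration of bonding, here is a minimal sketch for a RHEL-style Linux host. The device names, IP address and file paths are assumptions for this example only; check your OS and NIC vendor documentation for the exact procedure on your platform:

# /etc/sysconfig/network-scripts/ifcfg-bond0 -- the logical bonded adapter
DEVICE=bond0
IPADDR=192.0.2.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- one physical member of the bond
# (repeat for eth1 and any other member adapters)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none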

Private Network Adapter
·         The second network adapter is required to act as the cluster interconnect.
·         The cluster interconnect should support TCP/IP and UDP (UDP is used on UNIX and Linux platforms; TCP on Windows).
·         The cluster interconnect is a local connection between the individual servers, and should not be part of a public network.
·         The private interconnects carry only local traffic between the clustered servers. Because private interconnect traffic should not mix with public traffic, the private interconnect connections should have their own switches.
·         Cross-over cables are not supported, and you must ensure that your configuration is supported by Oracle. Bonded interfaces are supported for improved throughput.



Once you have configured Oracle Clusterware and installed RAC databases, you may wish to confirm that the RAC databases are using the correct adapter for the private interconnect.

·         You can query the V$CLUSTER_INTERCONNECTS view from any node of the RAC database.
·         You can also use the Oracle oradebug facility to determine this information. You can access oradebug through the Oracle SQL*Plus utility.
·         You can then use the oradebug setmypid and oradebug ipc commands to determine if the correct adapter is being used for the private interconnect. The ipc output is written to a trace file.
·         Here is an example:
/u01> sqlplus / as sysdba
SQL> oradebug setmypid
SQL> oradebug ipc
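To see the same information from within the database, you can query the view directly, as in this brief sketch (the rows returned will reflect your own configuration):

SQL> SELECT name, ip_address, is_public, source
  2  FROM   v$cluster_interconnects;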




No Single Points of Failure Please

It is very important to avoid a single point of failure when it comes to your networking. This means redundant network interfaces, redundant switches and redundant cabling, so that the failure of any single component cannot take the cluster network down.

Administering RAC Network Settings

Oracle Clusterware/RAC administrators will want to be able to monitor and administer network settings related to the cluster. Tools like oifcfg and srvctl are available for these tasks.
Using oifcfg you can:
  • List the interfaces available to the cluster as seen in this example:
oifcfg iflist -p -n
  • Determine the public and private interfaces that have been configured for use as seen in this example:
oifcfg getif
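The getif output lists each stored interface with its subnet, scope and role. The values below are illustrative only; yours will differ:

eth0  192.0.2.0  global  public
eth1  192.168.1.0  global  cluster_interconnect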

You can use the srvctl command to determine the VIP hostname, address, subnet mask and interface name as seen in this example:

srvctl config nodeapps -a
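The exact output format varies by release, but on many versions it prints one line per node with the VIP name, address, netmask and interface. The values below are illustrative only:

VIP exists.: /linux01-vip/192.0.2.20/255.255.255.0/eth0
VIP exists.: /linux02-vip/192.0.2.21/255.255.255.0/eth0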





Changing the Private Interconnect Adapter
Recall that all RAC Clusters have at least two types of network adapters. The first is the public adapter and the second is the private interconnect. You may have cause to change the specification for the interconnect adapter (such as changing the IP address). Here are the instructions to do just that:
1.      Log on to one node of the cluster. Add the new global interface specification using the oifcfg command as seen here:
               oifcfg setif -global eth2/192.0.2.0:cluster_interconnect
2.      Check the changes to make sure they took effect with the oifcfg command as seen here:
               oifcfg getif
3.      Stop the cluster using the crsctl command. Execute this on each node of the cluster:
               crsctl stop crs
4.      On each node, use the ifconfig command to assign the network address to the adapters:
               ifconfig eth2 192.0.2.15 netmask 255.255.255.0 broadcast 192.0.2.255
5.      Remove the previous adapter specification using the oifcfg command as seen in this example:
               oifcfg delif -global eth1/192.168.1.0

Do not execute this removal step until you are sure that the replacement interface has been properly added to your cluster.
6.      Restart the Clusterware using the crsctl command:
               crsctl start crs
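Once Clusterware has restarted, it is worth confirming the change took effect. A brief sketch using the same utilities shown above:

crsctl check crs          # confirm the Clusterware stack is healthy on each node
oifcfg getif              # confirm eth2 is now registered as the cluster_interconnect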










High Availability Networking Architectures
Your networking configuration should be architected for high availability. Ideally, both the private interconnect and the public network would have bonded, redundant network cards and redundant switches. As a reminder, the interface names (and slots) on each node of the cluster need to be the same. Thus, if the private interconnect is on eth1 on node one, it needs to be on eth1 on the remaining nodes.

Jumbo Frames
·         Ethernet packages network messages in “frames”, which can vary in size.
·         The maximum payload a frame can carry is called the MTU, or maximum transmission unit. When a message is larger than the MTU (typically 1500 bytes), it is split across multiple frames.
·         This splitting of messages incurs additional overhead and network traffic, which can lead to RAC performance problems. In Oracle, two main factors influence the maximum size of a message:
§  DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT (MBRC) determines the maximum size of a message for the global cache. For example, a DB_BLOCK_SIZE of 8k and an MBRC of 32 result in a maximum message size of 256k. With a 1500-byte MTU this is approximately 175 separate packets; using Jumbo Frames (9k packets) would reduce this to approximately 30 packets.
§  PARALLEL_EXECUTION_MESSAGE_SIZE (which defaults to 16384 bytes) determines the maximum size of a message used in parallel execution. These messages can range from 2k to over 64k.
One solution to these issues is to configure the cluster interconnect to use Jumbo Frames. When you configure Jumbo Frames, the Ethernet frame size can be as large as 9k. Configuring Jumbo Frames requires some care. The following steps are required (they might vary based on your hardware and operating system):
  • Check Oracle Metalink for more information on implementing Jumbo Frames. Check for information specific to your hardware, operating system and any bugs that might exist.
  • Configure the host's network adapter with a persistent MTU size of 9000. On a Linux system you might use a command such as ifconfig eth1 mtu 9000 (and make the setting persistent in the adapter's configuration file, since ifconfig changes are lost at reboot).
  • Check your vendor NIC configuration requirements. The NIC may well require some configuration.
  • Ensure your switches will support the larger frame size and configure the LAN switches to increase the MTU for Jumbo Frame support.
  • You can use traceroute for some basic configuration testing, as seen in this example where we run traceroute with the don't-fragment flag at a 9000-byte packet size and again at 9001 bytes (with Jumbo Frames working end to end, the first should succeed and the second should fail):
               traceroute -F linux01.myhost.com 9000

               traceroute -F linux01.myhost.com 9001
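A similar end-to-end check can be done with ping. This sketch assumes a Linux ping (the hostname is the same illustrative one used above); the 8972-byte payload allows for the 28 bytes of IP and ICMP headers within the 9000-byte MTU:

ping -c 2 -M do -s 8972 linux01.myhost.com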
Note that Jumbo Frames are not covered by a formal standard. As a result, Jumbo Frame interoperability between switches can be problematic and can require advanced networking skills to troubleshoot. Also keep in mind that the smallest MTU used by any device in a given network path determines the maximum MTU for all traffic travelling along that path. As with all changes, make sure that you completely test your configuration before implementing it in production.

