Friday, February 21, 2014

Building A Windows Cluster Using Windows Server 2008 R2

Building a Windows cluster is the first stage in deploying a high availability SQL Server 2012 AlwaysOn cluster. This document covers the design, installation, and configuration of a Windows cluster on which a SQL Server 2012 HA cluster can be deployed.

NETWORKING

The network design is crucial to the design and implementation of a cluster. A cluster relies heavily on advanced networking features to route traffic hitting a Virtual Network Name (VNN), and its corresponding IP address, to the correct node in the cluster. A further management layer of networking enables the cluster nodes to conduct a voting process, through which the nodes as a group decide which nodes are healthy and which node is the active primary. A misconfiguration in the networking can cause cluster instability and outages.
 

VLANS / CLIENT NETWORKING ZONES

For more information on networking refer to the DBA - GRID Networking Document.
 

SINGLE SITE

A cluster that is located completely within a single site / data centre requires two VLANs: one for internal cluster communications and one for general application connectivity.

INTERNAL CLUSTER COMMUNICATIONS

This is a private, non-routable network used exclusively by the nodes within the cluster. Cluster node heartbeat and the quorum voting process between nodes happen on this network.

PUBLIC NETWORK (APPLICATIONS)

The public network is used by applications on the cluster to provide services. Client applications will connect to the cluster over this network. This network is fully routable in the same manner as a standard server connection.
 

MULTI SITE

A cluster that spans multiple sites / data centres requires four VLANs. The two VLANs described above are duplicated at each site. Networking and routing needs to be set up to allow communications between the client networking zones at each site, e.g. the cluster network at site A requires connectivity with the cluster network at site B.

INTERNAL CLUSTER COMMUNICATIONS

As stated above, the internal cluster network at site A requires routing to the internal cluster network at site B. Connectivity between the two sites must allow traffic on port 3343 (the cluster service uses both TCP and UDP on this port).

PUBLIC NETWORK (APPLICATIONS)

The public networks must route traffic between site A and site B. The ports that are required depend on the application.
 

NETWORK DESIGN

Before building a cluster the networking requirements should be fully mapped out. The IPs assigned should map in a logical fashion across the networks. For each node in the cluster the following networking requirements need to be identified:

SITE         Which site will the cluster node reside in?
PUBLIC IP    What will be the public IP address of this node?            (example: 192.168.100.10)
CLUSTER IP   What will be the cluster internal IP address of this node?  (example: 192.168.0.10)

Also note - the IP address should match between networks, e.g. .10 should refer to the same cluster node in both subnets.

The cluster will also require a virtual IP in each subnet, i.e. a single-site cluster will require one IP and a two-site cluster will require two IPs. Again, these should be logically assigned so they map across network zones.
 

STANDARD FIREWALL RULES

Inter-site public network communications
Public (application / client) communications
Inter-site private cluster communications
Quorum file share witness communications
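
As a sketch, rules of this kind can be created with netsh advfirewall on each node. The rule names below are illustrative, and the ports assume the standard cluster service port (3343) and SMB (TCP 445) for the quorum file share:

# Cluster heartbeat traffic between nodes (UDP 3343)
netsh advfirewall firewall add rule name="Cluster heartbeat" dir=in action=allow protocol=UDP localport=3343
# Cluster join / node-to-node traffic (TCP 3343)
netsh advfirewall firewall add rule name="Cluster join" dir=in action=allow protocol=TCP localport=3343
# SMB access to the quorum file share witness (TCP 445)
netsh advfirewall firewall add rule name="Quorum file share SMB" dir=in action=allow protocol=TCP localport=445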
 

QUORUMS

IS A QUORUM FILE SHARE WITNESS REQUIRED?

A quorum file share witness is required whenever there are an even number of nodes in the cluster. The quorum file share witness is used in the voting process to break any deadlocks that may occur when there are an even number of nodes in a cluster. If there are an odd number of nodes in the cluster the quorum file share witness should not be configured. 
 

CREATING A QUORUM FILE SHARE

The quorum file server is configured with a folder called E:\FileShares. Create a folder under this with the name of the cluster. Share this folder out with default permissions. Make sure to change the share permissions so that Everyone has read / write access.
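
As a minimal sketch, the folder and share can also be created from an elevated PowerShell prompt on the quorum file server; the cluster name SQLUAT05 below is an example:

# Create the quorum folder for the cluster (example cluster name: SQLUAT05)
New-Item -Path "E:\FileShares\SQLUAT05" -ItemType Directory
# Share the folder and grant Everyone change (read / write) access on the share
net share SQLUAT05=E:\FileShares\SQLUAT05 "/GRANT:Everyone,CHANGE"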
 

SERVER BUILDS

NETWORK CONFIGURATION

Each node in the cluster has two network interfaces - an interface for cluster communications (CLUSTER) and another for general server traffic (PUBLIC - e.g. SQL, SMB, RDP). Each of these interfaces is on a separate subnet. In particular, cluster communications are segregated on their own subnet so that the cluster heartbeat does not get interrupted by general network traffic.
 

RENAMING THE NICS

As shown below, we rename the network connections on each cluster node. This makes network identification and cluster troubleshooting much easier. Open the Network Connections page in Control Panel, select each network interface, and press F2 to rename the connection.
[Screenshot: Network Connections showing the renamed CLUSTER and PUBLIC interfaces]
 

NIC ADAPTER ORDER

It's also important to set the NIC adapter order in the network settings. We want the public interface to be used for communications first. If the cluster interface is first, the server will try to use the cluster network for communications first. This will cause general network problems, as the cluster network is a private network without access to general network resources or the internet.
The screen to configure the NIC adapter order can be found by navigating to the Network Connections page in Control Panel. Press the Alt key to bring up the window menu --> Advanced Settings.
[Screenshot: Advanced Settings dialog with the PUBLIC interface first in the adapter order]
 

WINDOWS IP SETTINGS

The CLUSTER network settings should be set up similar to the below. Note - no DNS servers should be set: this is a private network, so no DNS traffic will be sent out over the cluster network and, in any event, DNS traffic cannot be routed to a DNS server.
[Screenshot: CLUSTER interface TCP/IP settings with no DNS servers configured]
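
For reference, the equivalent CLUSTER interface settings can be applied from an elevated prompt. The address below is the example from the network design table, and the /24 subnet mask is an assumption:

# Static IP on the CLUSTER interface - no default gateway is set
netsh interface ipv4 set address name="CLUSTER" static 192.168.0.10 255.255.255.0
# Ensure no DNS servers are configured on this interface
netsh interface ipv4 set dnsservers name="CLUSTER" source=static address=none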
 
The PUBLIC network settings should be as per a normal server build.
[Screenshot: PUBLIC interface TCP/IP settings]

 

PREPARING THE SERVER

UPGRADING WINDOWS SERVER STANDARD EDITION TO ENTERPRISE

Clustering is a Windows Enterprise feature. If the Windows Standard template has been deployed, the Windows edition will require an upgrade. Run the following command from an elevated command prompt to upgrade the Windows edition:
DISM /online /Set-Edition:ServerEnterprise /ProductKey:XXXXX-XXXXX-XXXXX-XXXXX-XXXXX
After running this command Windows requires a reboot. This reboot will take a while as there is reconfiguration on both shutdown and restart.
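Before running the upgrade, the current edition and the editions it can be upgraded to can be confirmed with:

DISM /online /Get-CurrentEdition
DISM /online /Get-TargetEditions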
 

SETUP THE FAILOVER CLUSTER

SETUP FAILOVER CLUSTERING

The Failover Clustering feature needs to be added to each node in the cluster. Accept the defaults to enable the feature.
[Screenshot: Add Features wizard with the Failover Clustering feature selected]
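
Alternatively, the feature can be enabled from an elevated PowerShell prompt on each node (on Windows Server 2008 R2 the ServerManager module must be imported first):

Import-Module ServerManager
# Enable the Failover Clustering feature with default settings
Add-WindowsFeature Failover-Clustering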
 

CREATING THE CLUSTER

On the primary node, expand Features in Server Manager, right-click Failover Cluster Manager, and select Create a Cluster.
[Screenshot: Failover Cluster Manager - Create a Cluster]
 
Add each server node in the following screen:
[Screenshot: Select Servers screen with each cluster node added]
 
Run the validation tests. Select the defaults and run all tests.
The validation tests make sure that the server node hardware and software are valid for the configuration. Any entries that fail should be fixed before continuing with the cluster creation. There will be some warnings, and this is typically OK as we are not using shared disks or SAN storage visible to the nodes.
[Screenshot: cluster validation test results]
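
The validation can also be run from PowerShell; this sketch assumes the example node names used later in this document:

Import-Module FailoverClusters
# Run all default validation tests against both nodes and produce a report
Test-Cluster -Node SQLUAT05A, SQLUAT05B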
 
Set up the cluster name in the next screen. Disable the CLUSTER network, as we will be managing the cluster via the PUBLIC network, and assign the clustered IP address.
[Screenshot: cluster name and clustered IP address configuration]
 
Keep following the prompts to finish the cluster build.
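
The cluster creation can also be scripted. In this sketch the cluster name SQLUAT05 and the virtual IP are examples consistent with the network design above:

Import-Module FailoverClusters
# Create the cluster with a static virtual IP on the PUBLIC subnet
New-Cluster -Name SQLUAT05 -Node SQLUAT05A, SQLUAT05B -StaticAddress 192.168.100.15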
 

UPDATE CLUSTER NETWORKING

Rename each cluster network to accurately reflect the VLAN topology as shown below. In this particular case there will only be two networks in Failover Cluster Manager: SQL-UAT (PUBLIC) and CLUSTER.
[Screenshot: Failover Cluster Manager networks renamed to SQL-UAT (PUBLIC) and CLUSTER]
 
Update the CLUSTER networks and uncheck the box "Allow clients to connect through this network". This reserves the network for cluster communications (e.g. the cluster heartbeat).
[Screenshot: CLUSTER network properties with client connections disabled]
 
The PUBLIC cluster networks should be set up so that the cluster does not use these networks for cluster communications, as shown below:
[Screenshot: PUBLIC network properties]
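
The same changes can be scripted with the FailoverClusters module. The auto-generated network names below are assumptions (the actual names depend on discovery order), and Role 1 marks a network for internal cluster communication only:

Import-Module FailoverClusters
# Rename the auto-generated cluster networks to match the VLAN topology
(Get-ClusterNetwork "Cluster Network 1").Name = "CLUSTER"
(Get-ClusterNetwork "Cluster Network 2").Name = "SQL-UAT (PUBLIC)"
# Role 1 = cluster communication only (clients cannot connect through it)
(Get-ClusterNetwork "CLUSTER").Role = 1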
 
 

ADDING THE QUORUM FILE SHARE WITNESS

Note - this step is only required for clusters with an even number of server nodes. If there are an odd number of server nodes, skip this section.
Clusters must reach a quorum on which server is the primary node of the cluster. Clusters do this by requiring each node to vote on which node is the active node - the node with the majority of the votes is the active node. When there are an even number of nodes in a cluster, a quorum file share can be used as a witness to provide an additional vote, thus breaking deadlocks in the quorum votes.
In this case our two node UAT HA cluster requires a quorum witness to ensure that a node has a majority vote at all times. To configure quorum settings, right-click on the cluster name --> More Actions --> Configure Cluster Quorum Settings:
[Screenshot: Configure Cluster Quorum Settings menu]
 
From the Select Quorum Configuration screen, select the third option - Node and File Share Majority.
[Screenshot: Select Quorum Configuration screen with Node and File Share Majority selected]
 
When prompted, enter the details of the quorum file share (\\<QuorumServer>\Quorums\<ClusterName>). This will create a small folder inside the share with cluster-specific quorum information.
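
Equivalently, the witness can be configured from PowerShell using the same share path:

Import-Module FailoverClusters
# Switch the cluster to Node and File Share Majority using the quorum share
Set-ClusterQuorum -NodeAndFileShareMajority "\\<QuorumServer>\Quorums\<ClusterName>"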
 

POST CLUSTER CREATION TESTS

After the cluster build, the cluster needs to be reviewed to ensure that it is configured properly. Key tests performed are:
  1. Review the Failover Cluster Manager MMC snap-in and ensure there are no errors / warnings
  2. Perform ping tests on the cluster name and make sure it resolves
  3. Check the current owner of the cluster (see the commands after this list)
  4. Perform a cluster failover to the secondary node
    1. Open a PowerShell prompt as an administrator
    2. Run the following command: Get-ClusterGroup | Move-ClusterGroup -Node "SQLUAT05A"
    3. Check the status of the cluster in Failover Cluster Manager
  5. Perform a cluster failback to the primary node
    1. Open a PowerShell prompt as an administrator
    2. Run the following command: Get-ClusterGroup | Move-ClusterGroup -Node "SQLUAT05B"
    3. Check the status of the cluster in Failover Cluster Manager
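
A minimal set of PowerShell checks covering steps 1 to 3 above:

Import-Module FailoverClusters
# Confirm both nodes are up
Get-ClusterNode | Format-Table Name, State
# Check which node currently owns each cluster group
Get-ClusterGroup | Format-Table Name, OwnerNode, State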
