Microsoft Virtual Academy: Windows Server 2012 Tracks

As soon as Windows Server 2012 reached General Availability, MVA released a good number of videos covering the new functionality. So watch and learn for free:

Windows Server 2012: Technical Overview

Server Virtualization Module

Networking Module

Storage Module

Identity and Access Module

Virtual Desktop Infrastructure Module

Web and Application Platform Module

Windows Server 2012: Server Virtualization

Virtual Machine and Scale Mobility Module

Windows Server 2012: Storage

Storage Spaces Module

SMB 3.0 Module

File System Enhancements Module

Windows Server 2012: Networking

Managing Network Infrastructure Module

IP Address Management (IPAM) Module

Hyper-V Network Virtualization Module

Windows Server 2012: Management & Automation

Standard Based Management with Windows Management Framework Module

Multi-Server Management with Server Manager Module

You can also log on to MVA and find the tracks there.

Resolving Cluster Shared Volumes put in redirected access on a Hyper-V Cluster

Recently I stumbled on an issue where the Cluster Shared Volumes on a Hyper-V cluster were put in redirected access. This doesn’t happen by itself, though. Here is how the issue appeared and how it was resolved.

I support a 5-node Hyper-V cluster which has 5 Cluster Shared Volumes (CSVs) and a quorum disk. It all began with a minor interruption in the SAN storage service. Even though the Storage team didn’t detect any issues, the interruption was detected by all servers that had LUNs connected to that storage, so the Hyper-V cluster was not the only one affected. After the storage issue was fixed we noticed several alerts in SCOM related to the cluster’s nodes. The description of the alerts was the following:

Cluster Shared Volume ‘Volume1’ (‘Volume1’) is no longer directly accessible from this cluster node. I/O access will be redirected to the storage device over the network through the node that owns the volume. This may result in degraded performance. If redirected access is turned on for this volume, please turn it off. If redirected access is turned off, please troubleshoot this node’s connectivity to the storage device and I/O will resume to a healthy state once connectivity to the storage device is reestablished.

There was a separate alert for every CSV on the cluster. At first I tried to return one of the CSVs to its normal state in the Failover Cluster Manager console: Cluster Shared Volumes -> right-click one of the volumes -> More Actions -> Turn off redirected access for this Cluster shared volume.

[Screenshot: Cluster Shared Volumes context menu in Failover Cluster Manager]

This didn’t work out. The command started executing but later timed out, and I cancelled it. So I searched Bing for more information about the problem and found the following article:

http://blogs.technet.com/b/askcore/archive/2010/12/16/troubleshooting-redirected-access-on-a-cluster-shared-volume-csv.aspx

The article clearly stated that this issue is caused by a storage connectivity problem. After a more granular investigation I noticed that on one of the nodes in the cluster the LUNs were not present in the Disk Management console. Because that node didn’t have the disk configuration, the cluster was not fully healthy, and in order to preserve its integrity it forced itself to work in redirected access mode. Thanks to that, all the virtual machines on the cluster were still up and running.
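The check I did by hand in Disk Management (looking for a node that no longer sees the LUNs) can be sketched in a few lines of Python. This is only an illustration under assumptions: the node names, the LUN identifiers and the `find_missing_luns` helper are all hypothetical, and the inventory would have to be collected from each node by whatever tooling you have.

```python
# Hypothetical sketch: compare the LUNs each cluster node can see and
# report the nodes that are missing some. Node names and LUN IDs are
# made up for illustration; they are not taken from the real cluster.

def find_missing_luns(luns_by_node):
    """Return {node: sorted list of LUNs that other nodes see but this node does not}."""
    # Any LUN visible to at least one node is expected on every node.
    expected = set().union(*luns_by_node.values())
    missing = {}
    for node, luns in luns_by_node.items():
        gap = expected - set(luns)
        if gap:
            missing[node] = sorted(gap)
    return missing


if __name__ == "__main__":
    inventory = {
        "node1": {"csv1", "csv2", "quorum"},
        "node2": {"csv1", "csv2", "quorum"},
        "node3": set(),  # this node lost its disk configuration
    }
    for node, luns in find_missing_luns(inventory).items():
        print(f"{node} is missing: {', '.join(luns)}")
```

A node that shows up in the result is the first candidate to investigate when the CSVs refuse to leave redirected access.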

In this situation I had two choices to resolve the issue:

  • Restart the server and see if the disk configuration returns
  • Add the LUNs to the server again

I decided to go with the first option because it was easier to execute and I could always fall back on option 2 if option 1 was unsuccessful. I put the faulty node in maintenance mode in Virtual Machine Manager and in Operations Manager. All virtual machines were migrated and I restarted the server. After the server was up and running again, the configuration in Disk Management was back and all CSVs were no longer in redirected access mode. I stopped maintenance mode in VMM and the node was back in the cluster.

I have a suspicion about why exactly this node lost its disk configuration during the storage service interruption: of all 5 nodes in the cluster, only this one had different HBA cards than the other four. But of course this would never have happened if the storage service hadn’t had issues that day.

 

P.S. The screenshot was copied from the mentioned article.