Weird one. Well, weird to me.
I have a 2 node MS cluster. Tried both 2008 R2 and now 2012.
ESXi 5.1 on 2 hosts; 5 others are still on 4.1, but the 4.1 hosts are out of scope for now (pretty sure!).
Two separate Windows cluster nodes: VM-A on Host-A and VM-B on Host-B, both on the 5.1 hosts. The OS drives each live on their own thick-provisioned VMFS5 datastore. The RDMs are set to physical compatibility mode; I have 2 presented and attached.
The SAN is an Oracle Openstorage 7320, a 2-node storage cluster, ALUA type. Each node has a pool of storage it serves as the optimized/primary owner. All paths show correctly in vCenter.
The primary reason for using RDMs is that I need a 4TB file partition for a medium-sized roaming-profile store in a Citrix implementation. I'd prefer not to get crazy with DFS and multiple volumes, because of the high intra-breeding (ha!) of people across departments, locations and functions. It's very challenging to find a logical structure that sticks for long, so I'd rather go with one large store.
Both RDM LUNs are presented to all hosts, and each device has the perennially-reserved flag set to true.
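For reference, here's roughly how I set that flag - a minimal Python sketch that just spits out the esxcli commands I run on each 5.1 host (the NAA IDs below are placeholders, not my real device IDs):

# Sketch: print the esxcli commands to flag the two RDM LUNs as perennially
# reserved on a host. NAA IDs are placeholders for my actual devices.
RDM_DEVICES = [
    "naa.600144f00000000000000000000000a1",  # placeholder: 2GB quorum LUN
    "naa.600144f00000000000000000000000a2",  # placeholder: 4TB profile LUN
]

for naa in RDM_DEVICES:
    # set the flag on the host
    print("esxcli storage core device setconfig -d %s --perennially-reserved=true" % naa)
    # verify - the output should show "Is Perennially Reserved: true"
    print("esxcli storage core device list -d %s" % naa)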
On VM-A I create the raw mapping to the 2GB LUN meant for quorum. The device attaches just fine, compatibility mode set to Physical, SCSI bus sharing set to Physical, placed at 1:0. Then VM-A gets the RDM to the 4TB LUN, same deal, at 2:0. Each mapping file is stored with the VM (I've also tried placing them on a dedicated "RDM mapping" LUN, which didn't help - same problem as described below).
Over on VM-B, I map to the existing drives: physical, physical, 1:0... yadda yadda.
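To sanity-check the mappings on both nodes, I've been dumping the disk/controller config with a quick pyVmomi script - rough sketch below, untested as pasted here; the vCenter hostname, credentials and VM names are placeholders for my environment:

# Rough sketch: list RDM backings and SCSI controller sharing for both cluster
# nodes, to confirm physical compatibility mode and physical bus sharing on the
# shared controllers. Connection details and VM names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.local", user="admin", pwd="secret",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)

for vm in view.view:
    if vm.name not in ("VM-A", "VM-B"):
        continue
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualSCSIController):
            # expect sharedBus == 'physicalSharing' on controllers 1 and 2
            print(vm.name, "controller", dev.busNumber, "sharing:", dev.sharedBus)
        elif isinstance(dev, vim.vm.device.VirtualDisk) and isinstance(
                dev.backing, vim.vm.device.VirtualDisk.RawDiskMappingVer1BackingInfo):
            # expect compatibilityMode == 'physicalMode' for both RDMs
            print(vm.name, dev.deviceInfo.label, dev.backing.compatibilityMode,
                  dev.backing.deviceName)

Disconnect(si)

Both VMs show exactly what I'd expect from that: two RDMs in physical mode, controllers 1 and 2 set to physical sharing.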
Turn on VM-A: drives seen, great. Turn on VM-B: drives there, great.
Create the cluster. Validation passes everything 100%. Set up the cluster, and all seems well.
Then I reboot a node. The drives fail over, and that part goes well. But then I notice they get bumped offline: the quorum disk (the 2GB LUN) comes back online, while the 4TB LUN goes to Failed. Manually restarting it works every time.
So I thought maybe it was a failover thing. With all resources owned by VM-B, I reboot VM-A. The drives all get bumped offline. The quorum drive tends to come back almost immediately; the 4TB disk fails quickly and never comes back on its own. If I shut down instead of rebooting, same behavior. Once VM-A is powered off, I can manually bring any failed disk online and it'll run forever. If I turn VM-A back on while everything is running on VM-B, the drives fail almost immediately during VM-A's POST. Again, I have to bring them online manually.
On Windows 2012, I've tried using Scale-Out File Server, thinking maybe I just need both heads reading/writing at the same time to maintain some kind of connection. It works great, by the way - but it still fails in the same way. 2008 R2: same thing, in exactly the same ways.
Have I missed something obvious here?
We aren't talking about a SCSI timeout setting here. This isn't a timeout; it's almost like a lock conflict happens and the last one in immediately loses. The timeout values are set to 60, but the failure is instant: as soon as I shut down or power on a VM, the other guy flips out almost immediately.
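To try to catch it in the act, I've been grepping the hosts' vmkernel logs for reservation chatter around the moment a VM powers on or off - just a throwaway Python sketch run against a copied log file, nothing clever (the filename and naa prefix are placeholders for my setup):

# Throwaway sketch: scan a copied /var/log/vmkernel.log for SCSI reservation /
# conflict messages mentioning the RDM LUNs. Filename and naa prefix are
# placeholders for my environment.
import re

LOG_FILE = "vmkernel.log"   # copied off the ESXi host
pattern = re.compile(r"reservation|conflict|naa\.600144f0", re.IGNORECASE)

with open(LOG_FILE) as log:
    for line in log:
        if pattern.search(line):
            print(line.rstrip())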
I've had no storage issues at all with any of these hosts on this SAN, and it's pretty damn fast, too. I can sling data and LUNs around, and everything sees and connects to whatever I present, wherever I put it - never an issue with pathing, ownership, zoning, etc.
Anyone ever see anything like this?