We have several hosts running on Cisco UCS C210 M2 servers with the LSI MegaRAID SAS 9261-8i controller and ESXi 4.1 Build 582267. In the vSphere Client Hardware Status tab, the "Battery" status shows Normal and the corresponding "Host Battery Status" alarm is NOT raised on the host, when in fact the BBU has failed and is marked as "needs to be replaced".
Only after enabling SSH, uploading the MegaCLI utility and manually querying the BBU status are we able to conclude that the BBU is bad. The failed BBU, which raises no alarm, in turn puts the virtual disk's write cache policy into Write Through mode, which hurts I/O performance.
Has anyone using the LSI MegaRAID with ESXi come across a BBU failure like this that is not detected by VMware?
Here is what MegaCli reports in this condition
If I query the VMware_HHRCBattery CIM class, there is no indication that the battery is bad:
~$ date;wbemcli ei -noverify 'https://root@esxi-host.com:5989/root/cimv2:VMware_HHRCBattery'
Wed May 9 15:06:53 EDT 2012
Enter password:
esxi-host.com:5989/root/cimv2:VMware_HHRCBattery.CreationClassName="VMware_HHRCBattery",DeviceID="vmwControllerBattery0",SystemCreationClassName="OMC_UnitaryComputerSystem",SystemName="dc5fdc64-9cad-11e0-bfac-e8b7487c201c" CardType=2,TransitioningToState=12,SystemName="dc5fdc64-9cad-11e0-bfac-e8b7487c201c",SystemCreationClassName="OMC_UnitaryComputerSystem",RequestedState=11,OperationalStatus=2,HealthState=5,EnabledState=2,EnabledDefault=2,ElementName="Battery on Controller 0 ",DeviceID="vmwControllerBattery0",CreationClassName="VMware_HHRCBattery",Caption="Battery on Controller 0 ",BatteryStatus=3,RemainingCapacityMaxError=,RemainingCapacity=,MaxRechargeCount=,RechargeCount=,MaxRechargeTime=,ExpectedLife=,TimeToFullCharge=,SmartBatteryVersion=,DesignVoltage=,FullChargeCapacity=,DesignCapacity=,Chemistry=,EstimatedChargeRemaining=,EstimatedRunTime=,TimeOnBattery=,LocationIndicator=,MaxQuiesceTime=,AdditionalAvailability=,IdentifyingDescriptions=,TotalPowerOnHours=,PowerOnHours=,OtherIdentifyingInfo=,ErrorCleared=,ErrorDescription=,LastErrorCode=,StatusInfo=,Availability=,PowerManagementCapabilities=,PowerManagementSupported=,Generation=,Description=,InstanceID=,InstallDate=,Name=,StatusDescriptions=,Status=,PrimaryStatus=,DetailedStatus=,OperatingStatus=,CommunicationStatus=,OtherEnabledState=,TimeOfLastStateChange=,AvailableRequestedStates=,RatedMaxOutputPower=,OutputPowerUnits=,IsACOutput=
I'm assuming this CIM class is what triggers the built-in "Host battery status" alarm. Does anyone know for sure?
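For anyone reading the wbemcli output above, the numeric status fields can be decoded against the standard DMTF CIM value maps. Below is a rough Python sketch of that decoding. The helper name `decode_battery` is made up for illustration, the value maps are partial and copied from the generic CIM schema as I understand it (not from VMware documentation), and the sample string is an abbreviated excerpt of the fields shown above. All three fields decode to healthy values, which is consistent with the vSphere Client showing "Normal".

```python
import re

# Partial value maps from the generic DMTF CIM schema
# (CIM_ManagedSystemElement and CIM_Battery). Assumption: the
# VMware_HHRCBattery class uses these standard meanings unchanged.
VALUE_MAPS = {
    "HealthState": {
        0: "Unknown", 5: "OK", 10: "Degraded/Warning", 15: "Minor failure",
        20: "Major failure", 25: "Critical failure", 30: "Non-recoverable error",
    },
    "OperationalStatus": {
        0: "Unknown", 1: "Other", 2: "OK", 3: "Degraded", 4: "Stressed",
        5: "Predictive Failure", 6: "Error",
    },
    "BatteryStatus": {
        1: "Other", 2: "Unknown", 3: "Fully Charged", 4: "Low", 5: "Critical",
        6: "Charging", 7: "Charging and High", 8: "Charging and Low",
        9: "Charging and Critical", 10: "Undefined", 11: "Partially Charged",
    },
}


def decode_battery(line):
    """Return {property: (code, meaning)} for the status fields found in line."""
    decoded = {}
    for name, table in VALUE_MAPS.items():
        # Match the property only when it starts a field, so that e.g.
        # "Status=" does not match inside "BatteryStatus=".
        match = re.search(r"(?:^|[,\s])" + re.escape(name) + r"=(\d+)", line)
        if match:
            code = int(match.group(1))
            decoded[name] = (code, table.get(code, "unmapped"))
    return decoded


# Abbreviated excerpt of the property list from the wbemcli output above.
sample = "RequestedState=11,OperationalStatus=2,HealthState=5,EnabledState=2,BatteryStatus=3"

for prop, (code, meaning) in decode_battery(sample).items():
    print(f"{prop}={code} ({meaning})")
```

If the provider were reporting the failure, I would expect at least one of these fields to carry a non-OK code (for example a HealthState of 10 or higher), so comparing the decoded values against the MegaCli result is a quick way to show the mismatch.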