We've had transient datastore failures, and are trying to understand our mitigation options.
It's documented, and we've observed, that if an ESXi host loses access to a datastore, virtual machines continue to run on that host for some amount of time before the virtual machine is told the disks are read-only. This time appears to be on the order of 5 to 15 minutes. If the datastore is restored within this grace period, the virtual machine resumes operating normally.
* What factors affect the length of the grace period before the virtual machine is affected?
* Are there any vSphere configuration options that can change the grace period for specific virtual machines?