Dear all,
I have a strange issue on my system, a DL380 Gen8 (not on the VMware HCL). It loses all external connectivity. When I log on through the out-of-band interface (iLO), I can still ping all the VMs and they are running.
However, the external interfaces lose all connectivity (kernel + VMware guest uplinks).
Restarting the management agents doesn't help.
When I look at my kernel log, I see my NVMe card and ntg3 (network driver) complaining:
2017-02-01T01:23:39.574Z cpu9:68121)User: 3089: sfcb-smx: wantCoreDump:sfcb-smx signal:6 exitCode:0 coredump:enabled
2017-02-01T01:23:39.703Z cpu9:68121)UserDump: 3024: sfcb-smx: Dumping cartel 68117 (from world 68121) to file /var/core/sfcb-smx-zdump.002 ...
2017-02-01T01:23:41.992Z cpu9:68121)UserDump: 3172: sfcb-smx: Userworld(sfcb-smx) coredump complete.
2017-02-01T10:20:26.125Z cpu2:69084)nvme:nvmeCoreLogError:370:command failed: 0x43077bd885f0.
2017-02-01T10:22:27.081Z cpu2:68970)nvme:nvmeCoreLogError:370:command failed: 0x43077bd70bf0.
2017-02-01T10:24:28.580Z cpu2:68970)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71370.
2017-02-01T10:26:31.329Z cpu2:69175)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71970.
2017-02-01T10:28:32.559Z cpu2:69175)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71f70.
2017-02-01T10:30:49.130Z cpu2:68998)nvme:nvmeCoreLogError:370:command failed: 0x43077bd72570.
2017-02-01T10:32:50.089Z cpu2:69195)nvme:nvmeCoreLogError:370:command failed: 0x43077bd72b70.
2017-02-01T10:34:53.349Z cpu2:69134)nvme:nvmeCoreLogError:370:command failed: 0x43077bd73170.
2017-02-01T10:36:54.443Z cpu2:69040)nvme:nvmeCoreLogError:370:command failed: 0x43077bd73770
2017-02-01T16:18:36.497Z cpu1:68999)WARNING: ntg3-throttled: Ntg3XmitPktList:372: vmnic0:TX ring full (0)
2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset
2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO
2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO
2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset
2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO
2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO
2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset
2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO
2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO
2017-02-01T15:50:50.684Z cpu9:68980)WARNING: NetPort: 1932: failed to disable port 0x2000005 on vSwitch0: Busy
2017-02-01T15:50:50.684Z cpu9:68980)NetSched: 701: 0x2000002: received a force quiesce for port 0x2000005, dropped 727 pkts
2017-02-01T15:50:50.685Z cpu9:68980)NetPort: 1879: disabled port 0x2000005
2017-02-01T15:50:50.688Z cpu9:68980)Vmxnet3: 17265: Disable Rx queuing; queue size 256 is larger than Vmxnet3RxQueueLimit limit of 64.
2017-02-01T15:50:50.688Z cpu9:68980)Vmxnet3: 17623: Using default queue delivery for vmxnet3 for port 0x2000005
2017-02-01T15:50:50.688Z cpu9:68980)NetPort: 1660: enabled port 0x2000005 with mac 00:50:56:a4:3e:25
2017-02-01T15:50:50.699Z cpu9:68980)NetPort: 1879: disabled port 0x2000005
2017-02-01T15:50:50.701Z cpu9:68980)Vmxnet3: 17265: Disable Rx queuing; queue size 256 is larger than Vmxnet3RxQueueLimit limit of 64.
2017-02-01T15:50:50.701Z cpu9:68980)Vmxnet3: 17623: Using default queue delivery for vmxnet3 for port 0x2000005
2017-02-01T15:50:50.701Z cpu9:68980)NetPort: 1660: enabled port 0x2000005 with mac 00:50:56:a4:3e:25
2017-02-01T15:50:56.216Z cpu0:68971)WARNING: ntg3-throttled: Ntg3XmitPktList:372: vmnic0:TX ring full (0)
When I restart the box, everything works fine again for a while, sometimes 1 day, sometimes 1 week; there is no clear pattern. Does anybody have an idea?
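
In case it helps with triage, here is a rough sketch that tallies the messages per component from a saved excerpt of the kernel log, so it is easier to see which driver starts complaining first. It is only an illustration: `summarize` and `LINE_RE` are names made up for this post (not a VMware tool), and the regular expression assumes every line has the same `timestamp cpuN:world)Component` layout as the excerpt above.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout (taken from the excerpt above), e.g.
# 2017-02-01T10:20:26.125Z cpu2:69084)nvme:nvmeCoreLogError:370:command failed: ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z "  # UTC timestamp
    r"cpu\d+:\d+\)"                                      # cpu and world ID
    r"(?:WARNING: )?"                                    # optional severity prefix
    r"([A-Za-z0-9_-]+)"                                  # component / driver name
)

def summarize(lines):
    """Return (message count per component, first timestamp per component)."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a kernel log line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        component = match.group(2)
        # Fold rate-limited messages ("ntg3-throttled") into the driver itself.
        if component.endswith("-throttled"):
            component = component[: -len("-throttled")]
        counts[component] += 1
        if component not in first_seen or stamp < first_seen[component]:
            first_seen[component] = stamp
    return counts, first_seen
```

It can be fed the lines of a log file copied off the host, for example `summarize(open("vmkernel.log"))`, where the file name is just a placeholder for wherever the excerpt was saved. Sorting `first_seen` by timestamp then shows the order in which the components first logged something.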