H610S - BMC self-test failed
Applies to
- NetApp SolidFire
- NetApp HCI
- NetApp Element software 12.3 or later
Issue
- In some cases, a persistent cluster fault is raised:
BMC Self Test failed. This may impact IPMI based services and a BIOS/BMC update may be recommended.
- The following entries appear in sf-master.info:
2021-11-22T00:14:03.084577Z hci-stg-06 master-1[26236]: [EXPERR-4] [Util] 28031 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-11-23T13:24:28.274373Z hci-stg-06 master-1[26236]: [EXPERR-4] [Util] 28027 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-12-16T15:25:16.921931Z hci-stg-06 master-1[57297]: [EXPERR-4] [Util] 55937 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-12-16T21:18:54.078117Z hci-stg-06 master-1[57297]: [EXPERR-4] [Util] 55937 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
- In addition to the "BMC Self Test failed" cluster fault, any or all of the following may also be present:
- The BMC web GUI is inaccessible
- Node offline events:
The SolidFire Application cannot communicate with node ID <#>
Node Offline nodeID=<#>
- Unable to ping or SSH to the BMC IP address
- Persistent cluster faults related to fans, power supplies, and system sensors. Examples:
Fan1A RPM is failed or missing.
Error checking sensor for Fan1B RPM
Error checking sensor for Inlet Temp
Error checking sensor for Exhaust Temp
- ipmitool commands fail with the following errors:
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Get SEL Info command failed: Invalid command
Error sending Chassis Status command: Invalid command
Get Channel Info command failed: Invalid command
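
When reviewing a large sf-master.info file, the "BMC Self Test failed" entries shown earlier can be picked out with a short script. The following is a minimal sketch, not a NetApp-provided tool: the function name `find_bmc_self_test_events` and the record type `BmcSelfTestEvent` are hypothetical, and the regular expression assumes every entry has the same layout as the sample log lines in this article. The two counters are extracted exactly as logged; this article does not define them, so any reading of them (for example, as a running failure count and a fault threshold) is an inference from their names.

```python
import re
from typing import Iterable, List, NamedTuple


class BmcSelfTestEvent(NamedTuple):
    """One 'BMC Self Test failed' log entry (hypothetical record type)."""
    timestamp: str
    host: str
    self_test_failure_count: int      # value of mBmcSelfTestFailureCount
    failed_self_tests_for_fault: int  # value of cNumFailedSelfTestsForFault


# Assumed layout, based only on the sample lines in this article:
# <timestamp> <host> ... BMC Self Test failed. ... mBmcSelfTestFailureCount=N cNumFailedSelfTestsForFault=M
_ENTRY = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+.*BMC Self Test failed\."
    r".*?mBmcSelfTestFailureCount=(?P<count>\d+)"
    r"\s+cNumFailedSelfTestsForFault=(?P<limit>\d+)"
)


def find_bmc_self_test_events(lines: Iterable[str]) -> List[BmcSelfTestEvent]:
    """Return every BMC self-test failure entry found in the given log lines."""
    events = []
    for line in lines:
        match = _ENTRY.search(line.strip())
        if match:
            events.append(
                BmcSelfTestEvent(
                    timestamp=match.group("ts"),
                    host=match.group("host"),
                    self_test_failure_count=int(match.group("count")),
                    failed_self_tests_for_fault=int(match.group("limit")),
                )
            )
    return events
```

To use it, open your copy of sf-master.info in text mode and pass the file object to `find_bmc_self_test_events`; each returned record carries the timestamp and host name, which makes it easy to see how often and on which node the self-test failure was logged.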