H610S - BMC Self Test failed
Applies to
NetApp H610S
Issue
- In some cases, you will receive a persistent cluster fault:
BMC Self Test failed. This may impact IPMI based services and a BIOS/BMC update may be recommended.
- The following entries can be seen in sf-master.info:
2021-11-22T00:14:03.084577Z hci-stg-06 master-1[26236]: [EXPERR-4] [Util] 28031 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-11-23T13:24:28.274373Z hci-stg-06 master-1[26236]: [EXPERR-4] [Util] 28027 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-12-16T15:25:16.921931Z hci-stg-06 master-1[57297]: [EXPERR-4] [Util] 55937 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
2021-12-16T21:18:54.078117Z hci-stg-06 master-1[57297]: [EXPERR-4] [Util] 55937 GlobalPool-0 serviceshared/IpmiComponentMonitor.cpp:272:CheckHealth|BMC Self Test failed. Postponing Fault. mBmcSelfTestFailureCount=1 cNumFailedSelfTestsForFault=10
- In addition to the "BMC Self Test failed" cluster fault, any or all of the following may also be present:
- The BMC web GUI is inaccessible
- Node offline events:
The SolidFire Application cannot communicate with node ID <#>
Node Offline nodeID=<#>
- Unable to ping or SSH to the BMC IP address
- Persistent cluster faults related to fan, power, and system sensors. Examples:
Fan1A RPM is failed or missing.
Error checking sensor for Fan1B RPM
Error checking sensor for Inlet Temp
Error checking sensor for Exhaust Temp
- ipmitool commands fail with the following errors:
Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Get SEL Info command failed: Invalid command
Error sending Chassis Status command: Invalid command
Get Channel Info command failed: Invalid command
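
When triaging, it can help to check collected logs and command output against the symptoms listed above. The following is a minimal Python sketch, not a NetApp tool. The function names (parse_self_test_line, scan_log, matching_ipmitool_errors) are hypothetical, and it assumes you have already obtained the sf-master.info lines and the ipmitool output as text. The field names mBmcSelfTestFailureCount and cNumFailedSelfTestsForFault are copied from the log entries; reading them as a failure counter and a fault threshold is an assumption based on their names only.

```python
import re

# Matches the "BMC Self Test failed. Postponing Fault." entries quoted above.
# The leading fields are assumed to be a timestamp followed by a host name.
SELF_TEST_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+.*BMC Self Test failed\."
    r".*?mBmcSelfTestFailureCount=(?P<count>\d+)"
    r"\s+cNumFailedSelfTestsForFault=(?P<threshold>\d+)"
)

# ipmitool error strings taken from the list above. The first one is
# shortened to its stable prefix so the device path variants do not matter.
IPMITOOL_ERRORS = (
    "Could not open device at /dev/ipmi0",
    "Get SEL Info command failed: Invalid command",
    "Error sending Chassis Status command: Invalid command",
    "Get Channel Info command failed: Invalid command",
)


def parse_self_test_line(line):
    """Return the parsed fields of one self-test log entry, or None."""
    match = SELF_TEST_RE.search(line)
    if match is None:
        return None
    return {
        "timestamp": match["timestamp"],
        "host": match["host"],
        "count": int(match["count"]),
        "threshold": int(match["threshold"]),
    }


def scan_log(lines):
    """Collect every BMC self-test entry found in an iterable of log lines."""
    return [entry for entry in map(parse_self_test_line, lines) if entry]


def matching_ipmitool_errors(output):
    """List the known ipmitool error strings that appear in the given output."""
    return [error for error in IPMITOOL_ERRORS if error in output]
```

For example, passing the open sf-master.info file object to scan_log returns one dictionary per matching entry, and passing captured ipmitool stderr to matching_ipmitool_errors returns whichever of the four errors it contains.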