AFF A700s node reports SP HBT stopped and an emergency shutdown to recover the BMC
Applies to
- AFF A700s
- BMC (Baseboard Management Controller)
Issue
- After an ONTAP upgrade, each node attempts to update its BMC firmware automatically
- The automatic update fails on one or more nodes:
[cluster-01: servprocd: sp.servprocd.upd.evts:debug]: params: {'reason': 'BMC update - Pre-update checks passed.'}
[cluster-01: servprocd: sp.servprocd.upd.evts:debug]: params: {'reason': 'SP Firmware network update from 1.89 to 1.91 has been triggered.'}
[cluster-01: servprocd: sp.servprocd.upd.unexpt.evts:debug]: params: {'reason': 'BMC update - Update failed after timeout.'}
[cluster-01: servprocd: sp.servprocd.upd.error:error]: SP update error: SP firmware update failure has been detected.
[cluster-01: servprocd: sp.servprocd.upd.unexpt.evts:debug]: params: {'reason': 'BMC update pre-update checks failed.'}
[cluster-01: servprocd: sp.servprocd.upd.error:error]: SP update error: SP firmware update failure has been detected.
- This triggers AutoSupport notifications for SP heartbeat missed and SP heartbeat stopped:
[cluster-01: env_mgr: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[cluster-01: env_mgr: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
- A few days later, the node halts and can no longer be reached remotely through the BMC:
[cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)
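- A console connection to the node shows no BMC logs (system log console, system log console bak, and system log sel are all empty or contain only a single entry)
- Attempting to boot the node from the LOADER prompt results in: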
***************************************************
This platform is not supported in this release.
The system will now halt
***************************************************
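The symptom sequence above can be checked against a saved log with a short script. The following is a minimal sketch in Python, not a NetApp tool: it assumes the EMS messages have been exported to plain text in the same "[node: source: event:severity]:" form as the excerpts in this article. The names SYMPTOM_EVENTS, find_symptoms, and matches_issue are hypothetical and chosen only for illustration.

```python
import re

# EMS event names quoted in this article, in the order the symptoms appear.
SYMPTOM_EVENTS = (
    "sp.servprocd.upd.error",
    "callhome.sp.hbt.missed",
    "callhome.sp.hbt.stopped",
    "monitor.shutdown.emergency",
)

# Assumed line prefix, taken from the excerpts above:
#   [node: source: event.name:severity]: message
EMS_PREFIX = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<source>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:"
)


def find_symptoms(lines):
    """Return {event_name: [node, ...]} for the symptom events seen in lines."""
    found = {}
    for line in lines:
        match = EMS_PREFIX.search(line)
        if match is None:
            continue  # not an EMS-style line (banner text, blank line, ...)
        event = match.group("event")
        if event not in SYMPTOM_EVENTS:
            continue  # an EMS event, but not one this article lists
        nodes = found.setdefault(event, [])
        node = match.group("node").strip()
        if node not in nodes:
            nodes.append(node)
    return found


def matches_issue(found):
    """True only when every symptom event from this article was seen."""
    return all(event in found for event in SYMPTOM_EVENTS)


if __name__ == "__main__":
    sample = [
        "[cluster-01: servprocd: sp.servprocd.upd.error:error]: SP update error: "
        "SP firmware update failure has been detected.",
        "[cluster-01: env_mgr: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED",
        "[cluster-01: env_mgr: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED",
        "[cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: "
        "Environmental Reason Shutdown (System reboot to recover the BMC)",
    ]
    result = find_symptoms(sample)
    for event in SYMPTOM_EVENTS:
        print(event, "->", result.get(event, []))
    print("all symptoms present:", matches_issue(result))
```

A match confirms only that the log contains the same event names as this article. The empty BMC logs and the LOADER message still need to be checked on the console.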