After a panicked node is replaced, the replacement node fails to boot and halts at POST with the error "IPMI:Get midplane FRU 0 inventory:failed"
Applies to
- FAS27x0
- AFF-A220
- FAS8200
- AFF-A300
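Issue
- A node panics with an "NMI watch-dog reset" error. After the node is replaced, the following errors appear and the system halts during the power-on self-test (POST).
Example: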
Initializing System Memory ...
Loading Device Drivers ...
Waiting for SP ...
SP failure. Resetting SP from primary FW. This can take a few minutes Waiting for SP ...
SP failure. Resetting SP from backup FW. This can take a few minutes Waiting for SP ...
Failed to recover SP
IPMI:Get controller FRU inventory:failed
IPMI:Get midplane FRU 0 inventory:failed
- The Service Processor (SP) logs show the following midplane errors.
Example 1:
Record 717: Wed Dec 25 01:38:47.000000 2019 [SysFW.notice]: BIOS Version: 11.5
Record 718: Wed Dec 25 01:38:49.000000 2019 [SysFW.notice]: Waiting for SP ...
Record 719: Wed Dec 25 01:38:49.000000 2019 [SysFW.notice]: IPMI:Read midplane FRU common header:device busy. Retrying
Record 720: Sun Jan 01 00:00:33.660000 2017 [BMC.notice]: Running primary version 11.4
Example 2:
Record 649: Sat Oct 15 13:05:38.834785 2016 [Agent.notice]: 720.089: 150 : Local Serial Exchange Error Internal MLER[4] asserted
Record 1003: Mon Oct 17 08:38:52.362718 2016 [Agent.notice]: 193.618: 151 : Local Invalid Serial Exchange Bus Internal MLER[5] asserted
Record 807: Thu Jan 01 00:00:36.931067 1970 [Agent.notice]: 000.267: 152 : Midplane I2C Local Buffers Not Ready Internal MLER[6] de-asserted
Record 797: Mon Oct 17 08:52:11.001689 2016 [Agent.notice]: 919.800: 148 : Midplane Local Grant Timeout Internal MLER[2] asserted
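
When an SP log is long, it can help to filter it for records that resemble the midplane errors shown above. The following is a minimal sketch, not a NetApp tool: the record layout is inferred only from the examples in this article, and the function name `find_midplane_records` and the signature list are hypothetical and not exhaustive.

```python
import re

# Assumed record layout, inferred from the examples in this article:
# "Record <n>: <timestamp> [<source>.<severity>]: <message>"
RECORD_RE = re.compile(
    r"^Record (?P<num>\d+): (?P<timestamp>.+?) "
    r"\[(?P<source>\w+)\.(?P<severity>\w+)\]: (?P<message>.*)$"
)

# Substrings taken from the example messages; illustrative, not exhaustive.
MIDPLANE_SIGNATURES = (
    "midplane FRU",
    "Serial Exchange",
    "Midplane I2C",
    "Midplane Local Grant Timeout",
)


def find_midplane_records(lines):
    """Return (record number, source, message) for midplane-like records."""
    hits = []
    for line in lines:
        match = RECORD_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        message = match.group("message")
        lowered = message.lower()
        if any(sig.lower() in lowered for sig in MIDPLANE_SIGNATURES):
            hits.append((int(match.group("num")), match.group("source"), message))
    return hits
```

Passing the lines of a saved SP log to `find_midplane_records` returns only the records whose message contains one of the listed substrings. A match is a hint to compare against the examples above, not a diagnosis by itself.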