AFF A250, AFF C250 or FAS500f panic: watchdog nmi because IPMI interface congested
Applies to
- AFF A250, AFF C250
- ASA A250, ASA C250
- FAS500f
- BMC 15.11 or earlier
Issue
- The node panics and reboots:
PANIC: watchdog nmi because IPMI interface congested. in process idle: cpu9
- The BMC event log or SP-LATEST-SYSTEM-EVENT-LOG shows a watchdog interrupt followed by multiple Bus Correctable errors:
292 | 12/02/2022 | 17:14:14 | Watchdog 2 #0x0f | Timer interrupt | Asserted
293 | 12/02/2022 | 17:14:16 | Watchdog 2 #0x0f | Hard reset | Asserted
294 | 12/02/2022 | 17:14:17 | Unknown #0x51 | State Asserted
2a2 | 12/02/2022 | 17:15:00 | Critical Interrupt #0x31 | Bus Correctable error | Asserted
2a3 | 12/02/2022 | 17:15:00 | Critical Interrupt #0x31 | Bus Correctable error | Asserted
2a4 | 12/02/2022 | 17:15:00 | Critical Interrupt #0x31 | Bus Correctable error | Asserted
- The SSRAM log reports NMIsource(WdogBMCFail)
- After the panic reboot, the node may report temperature errors in Cluster::> event log show:
[monitor.temp.unreadable:error]: The controller temperature (HIC2 Temp0) is not readable.
[monitor.temp.unreadable:error]: The controller temperature (HIC2 Temp1) is not readable.
[callhome.chassis.hitemp:error]: Call home for CHASSIS OVER TEMPERATURE
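
The system event log signature described above (a Watchdog 2 timer interrupt followed by multiple Bus Correctable errors) can be checked against a saved copy of the log with a short script. The following is only a rough sketch, not a NetApp tool: the function names, the pipe-delimited parsing, and the threshold of two errors for "multiple" are all assumptions made for illustration.

```python
import re

# Assumed layout of one log line: hex record ID, then pipe-delimited fields.
# Only the record ID is separated out; the rest is searched as plain text.
SEL_LINE = re.compile(r"^\s*(?P<id>[0-9a-fA-F]+)\s*\|(?P<rest>.+)$")


def count_bus_errors_after_watchdog(text):
    """Count 'Bus Correctable error' records that appear after a
    'Watchdog 2 ... Timer interrupt' record in the given log text."""
    seen_watchdog = False
    count = 0
    for line in text.splitlines():
        match = SEL_LINE.match(line)
        if not match:
            continue  # skip anything that is not a pipe-delimited record
        rest = match.group("rest")
        if "Watchdog 2" in rest and "Timer interrupt" in rest:
            seen_watchdog = True
        elif seen_watchdog and "Bus Correctable error" in rest:
            count += 1
    return count


def matches_symptom(text, min_errors=2):
    """Return True if the log shows the pattern from this article.
    min_errors=2 is an assumed reading of 'multiple'."""
    return count_bus_errors_after_watchdog(text) >= min_errors
```

To use it, paste the event log output into a string or read it from a file and pass it to matches_symptom(). A match only indicates that the log resembles the pattern in this article; the panic string and the BMC version still need to be confirmed separately.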