Handling L2 watchdog resets on the AFF A320 platform
Applies to
- AFF A320
Issue
- The node reboots unexpectedly
- The node does not reboot after an unexpected shutdown
The BMC log on the affected node shows the following:
Record 402: Thu May 05 06:20:35.070000 2022 [ASUP.notice]: First notification email | (REBOOT (abnormal)) WARNING | Send failed
Record 403: Thu May 05 06:20:40.640000 2022 [IPMI.notice]: 0076 | 02 | EVT: 6fc302ff | System_Watchdog | Assertion Event, "Power cycle"
Record 404: Thu May 05 06:20:40.640000 2022 [IPMI Event.critical]: L2 watchdog timeout power cycle
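When a BMC log is long, the watchdog records can be filtered out programmatically. The following is a minimal sketch in Python, assuming the log has been saved as text and that every record follows the `Record N: <timestamp> [<tag>]: <message>` layout seen in the excerpt above. The function name `find_watchdog_records` is hypothetical and is not part of any NetApp tool.

```python
import re

# Assumed record layout, inferred only from the excerpt above:
#   Record <N>: <timestamp> [<tag>]: <message>
RECORD_RE = re.compile(r"^Record (\d+): (.+?) \[([^\]]+)\]: (.*)$")

def find_watchdog_records(log_text):
    """Return (record_number, tag, message) for records that mention a watchdog."""
    hits = []
    for line in log_text.splitlines():
        match = RECORD_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not look like a record
        number, _timestamp, tag, message = match.groups()
        if "watchdog" in message.lower():
            hits.append((int(number), tag, message))
    return hits

# Sample input copied from the BMC log excerpt in this article.
SAMPLE_BMC_LOG = """\
Record 402: Thu May 05 06:20:35.070000 2022 [ASUP.notice]: First notification email | (REBOOT (abnormal)) WARNING | Send failed
Record 403: Thu May 05 06:20:40.640000 2022 [IPMI.notice]: 0076 | 02 | EVT: 6fc302ff | System_Watchdog | Assertion Event, "Power cycle"
Record 404: Thu May 05 06:20:40.640000 2022 [IPMI Event.critical]: L2 watchdog timeout power cycle
"""

if __name__ == "__main__":
    for number, tag, message in find_watchdog_records(SAMPLE_BMC_LOG):
        print(f"Record {number} [{tag}]: {message}")
```

The match is a case-insensitive substring test, so it catches both the `System_Watchdog` sensor event and the `L2 watchdog timeout power cycle` message.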
- If the node reboots, the following errors appear in the EMS log file:
Thu May 05 15:33:43 +0800 [netapp: splog_main: mgr.boot.reason_abnormal:EMERGENCY]: System rebooted due to a watchdog reset.
Thu May 05 15:33:43 +0800 [netapp: splog_main: callhome.reboot.watchdog:alert]: Call home for REBOOT (watchdog reset)
- If the node fails to reboot, the BMC may show Attn_Sensor1 as Asserted in the system sensors output:
Power_Event | 0x0 | discrete | | na | na | na | na
System_FW_Status | 0x0 | discrete | 0x2f | na | na | na | na
Wrench_Port_Up | 0x0 | discrete | Enabled | na | na | na | na
Attn_Sensor1 | 0x0 | discrete | Asserted | na | na | na | na
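The sensor check can also be scripted against captured output. The sketch below assumes the `system sensors` output has been saved as text, that fields are separated by `|`, and that the fourth field holds the sensor status, as in the rows above. The helper name `sensor_status` is hypothetical.

```python
def sensor_status(sensor_output, name):
    """Return the status field of the named sensor, or None if it is not listed.

    Assumes pipe-separated rows with the status in the fourth field,
    matching the rows shown in this article.
    """
    for line in sensor_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 4 and fields[0] == name:
            return fields[3]
    return None

# Sample input copied from the sensor listing in this article.
SAMPLE_SENSORS = """\
Power_Event | 0x0 | discrete | | na | na | na | na
System_FW_Status | 0x0 | discrete | 0x2f | na | na | na | na
Wrench_Port_Up | 0x0 | discrete | Enabled | na | na | na | na
Attn_Sensor1 | 0x0 | discrete | Asserted | na | na | na | na
"""

if __name__ == "__main__":
    if sensor_status(SAMPLE_SENSORS, "Attn_Sensor1") == "Asserted":
        print("Attn_Sensor1 is Asserted")
```

An empty status field, as on the `Power_Event` row, is returned as an empty string, which keeps it distinct from a sensor that is missing from the output.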