
AFF A250 node reboots unexpectedly when running BMC firmware 15.7 or earlier

Applies to

  • AFF A250
  • Baseboard Management Controller (BMC) firmware 15.7 or earlier
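
The "15.7 or earlier" condition is easy to get wrong with a plain string comparison, because "15.10" sorts before "15.7" as text. The following is a minimal sketch of a numeric comparison, assuming the BMC firmware version is available as a dotted numeric string; the function names are hypothetical and are not part of any NetApp tool.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '15.7' into a tuple of ints: (15, 7)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected_bmc(version: str, max_affected: str = "15.7") -> bool:
    """Return True when `version` is numerically at or below `max_affected`.

    Assumption: versions are dotted integers. Missing components are
    treated as 0, so '15.7.0' compares equal to '15.7'.
    """
    a, b = parse_version(version), parse_version(max_affected)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a <= b
```

Comparing integer tuples rather than strings is the design choice here: it keeps a version such as 15.10 above 15.7.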

Issue

  • Unexpected node halt:

[node_name: spmgrd: sp.heartbeat.stopped:error]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 600 seconds.
[node_name: spmgrd: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[node_name: spmgrd: callhome.sp.hbt.stopped:alert]: Call home for SP HBT STOPPED
[node_name: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 10 minutes.
[node_name: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)
[node_name: mgwd: mgwd.notify.halt.result:info]: MGWD able to notify CLAM on its HA partner node that this node is undergoing a planned shutdown (reason: E). Error: -

  • The SP-LATEST-SYSTEM-EVENT-LOG or the output of the system log sel command shows multiple Bus Correctable error records together with an IPMI cold reset:
BMC node_name> system log sel
3e1 | 03/08/2023 | 16:09:46 | Critical Interrupt #0x31 | Bus Correctable error | Asserted
3e2 | 03/08/2023 | 16:09:46 | Critical Interrupt #0x31 | Bus Correctable error | Asserted
...
3f1 | OEM record f2 | IPMI cold reset
3f2 | OEM record f2 | Pilot Software reset
  • Or a reset of the BMC by the FPGA:
 1c9 | OEM record f2 | FPGA pull BMC whole reset
 1ca | OEM record f2 | Pilot AC cycle
  • The BMC of this node might be inaccessible, even through the serial console port
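
When reviewing saved log text, the symptoms above can be checked mechanically. The sketch below is illustrative only: it assumes the EMS and SEL text look like the excerpts in this article (event names inside the bracketed EMS prefix, pipe-separated SEL records), and the names Symptoms and scan_logs are hypothetical. It reports what it finds and does not decide whether a node is affected.

```python
from dataclasses import dataclass, field

# Strings copied from the excerpts in this article.
EMS_EVENTS = (
    "sp.heartbeat.stopped",
    "sp.ipmi.lost.shutdown",
    "monitor.shutdown.emergency",
)
BUS_ERROR = "Bus Correctable error"
BMC_RESETS = ("IPMI cold reset", "FPGA pull BMC whole reset")


@dataclass
class Symptoms:
    ems_events: list = field(default_factory=list)  # EMS event names seen
    bus_errors: int = 0                             # SEL bus error records
    bmc_resets: list = field(default_factory=list)  # SEL reset records seen


def scan_logs(ems_text: str, sel_text: str) -> Symptoms:
    """Collect the symptoms listed above from EMS and SEL text."""
    found = Symptoms()
    for name in EMS_EVENTS:
        if name in ems_text:
            found.ems_events.append(name)
    for line in sel_text.splitlines():
        # Assumed layout: SEL records are pipe-separated fields.
        fields = [part.strip() for part in line.split("|")]
        if BUS_ERROR in fields:
            found.bus_errors += 1
        for reset in BMC_RESETS:
            if reset in fields:
                found.bmc_resets.append(reset)
    return found
```

Calling scan_logs with the EMS messages and the SEL output as two strings returns a Symptoms object; a non-zero bus_errors count alongside an entry in bmc_resets corresponds to the SEL pattern described in the list above.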

 

