System shutdown due to SP HBT STOPPED and KCS errors on boot
Applies to
- AFF A250, AFF C250
- ASA A250, ASA C250
- FAS500f
Issue
- The node shuts down because the SP heartbeat (HBT) has stopped:
Sat Aug 19 03:46:24 -0400 [cluster-01: spmgrd: sp.heartbeat.stopped:debug]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 600 seconds.
Sat Aug 19 03:46:24 -0400 [cluster-01: spmgrd: callhome.sp.hbt.missed:debug]: Call home for SP HBT MISSED
Sat Aug 19 03:56:44 -0400 [cluster-01: spmgrd: callhome.sp.hbt.stopped:debug]: Call home for SP HBT STOPPED
Sat Aug 19 03:59:08 -0400 [cluster-01: env_mgr: sp.ipmi.lost.shutdown:EMERGENCY]: SP heartbeat stopped and cannot be recovered. To prevent hardware damage and data loss, the system will shut down in 10 minutes.
Sat Aug 19 04:09:08 -0400 [cluster-01: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (System reboot to recover the BMC)
- The partner node takes over when the node reboots:
Sat Aug 19 04:09:33 -0400 [cluster-02: cf_main: cf.fsm.takeover.on.reboot:debug]: Failover monitor: One node initiated automatic takeover after detecting that its partner node is rebooting.
- The serial console shows the node at the LOADER prompt. Switching to the BMC CLI (Ctrl-G) produces console output similar to the following:
sh: can't create /sys/module/watchdog_hw/parameters/current_wdt_device: nonexistent directory
sh: can't create /sys/module/watchdog_hw/parameters/current_wdt_device: nonexistent directory
KCS cmd(NETFN 0x6, CMD 0x1) failed, ret -2
- Rebooting the node does not help.
- The node still fails to boot and the BMC remains unresponsive.
- Running boot_ontap from the LOADER prompt produces the following output during boot:
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
Could not patch the required SMBIOS 1 field 1 with the FRU data.
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
Copyright(c) 2021 American Megatrends, Inc.
Copyright(c) 2021 American Megatrends, Inc.
ERROR: Class:0; Subclass:20000; Operation: 1002
Boot Loader version 6.5.8 Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2023 NetApp, Inc. All Rights Reserved.
KCS cmd(NETFN 0x6, CMD 0x1) failed, ret -2
Resetting BMC from backup FW...
Waiting 30 seconds for BMC to reboot...
KCS cmd(NETFN 0x6, CMD 0x1) failed, ret -2
Copyright(c) 2021 American Megatrends, Inc.
ERROR: Class:0; Subclass:20000; Operation: 1002
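The KCS failure lines in the console output above can be easier to read once the NETFN and CMD values are mapped to names. The following is a minimal sketch, not a NetApp tool: it assumes the loader reports standard IPMI v2.0 network function and command codes, in which case NETFN 0x6 / CMD 0x1 is the App "Get Device ID" command and NETFN 0xa / CMD 0x10 is the Storage "Get FRU Inventory Area Info" command. The function name `decode_kcs_failures` and the lookup tables are hypothetical, and the meaning of `ret -2` is not stated in this article, so the value is passed through without interpretation.

```python
import re

# Names taken from the IPMI v2.0 specification. Assumption: the loader
# prints standard IPMI NetFn/command codes in its KCS failure messages.
NETFN_NAMES = {0x06: "App", 0x0A: "Storage"}
COMMAND_NAMES = {
    (0x06, 0x01): "Get Device ID",
    (0x0A, 0x10): "Get FRU Inventory Area Info",
}

# Matches lines such as: KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
KCS_FAILURE = re.compile(
    r"KCS cmd\(NETFN (0x[0-9a-fA-F]+), CMD (0x[0-9a-fA-F]+)\) failed, ret (-?\d+)"
)


def decode_kcs_failures(console_text):
    """Return one (netfn_name, command_name, ret) tuple per KCS failure found."""
    decoded = []
    for netfn_hex, cmd_hex, ret in KCS_FAILURE.findall(console_text):
        netfn = int(netfn_hex, 16)
        cmd = int(cmd_hex, 16)
        decoded.append((
            NETFN_NAMES.get(netfn, "unknown"),
            COMMAND_NAMES.get((netfn, cmd), "unknown"),
            int(ret),  # passed through as-is; its meaning is not documented here
        ))
    return decoded


# Sample lines copied from the console output in this article.
sample = """KCS cmd(NETFN 0xa, CMD 0x10) failed, ret -2
Could not patch the required SMBIOS 1 field 1 with the FRU data.
KCS cmd(NETFN 0x6, CMD 0x1) failed, ret -2"""

for netfn_name, command_name, ret in decode_kcs_failures(sample):
    print(f"{netfn_name}: {command_name} failed (ret {ret})")
```

Read this way, both failing commands are basic BMC queries: one asks the BMC to identify itself and the other asks for FRU inventory information. That fits the unresponsive BMC described above and the subsequent message that the SMBIOS field could not be patched with FRU data.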