
A700s reboots unexpectedly on BMC 1.81 or later



  • AFF A700
  • BMC 1.81 or later


  • An AFF A700s node reboots unexpectedly.
  • The service processor resets the node, and the partner node takes over:

[node_name_2: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: hw_assist: Received takeover hw_assist alert from partner(node_name_1), system_down because reset_via_sp.
[node_name_2: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: hw_assist: Received takeover hw_assist alert from partner(node_name_1), system_down because l2_watchdog_reset.

[node_name_2: swi1: mri_ha: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state MIRROR_ONLINE is aborted because of reason Abort Pending.
[node_name_2: gop_eq_thread: ic.linkStatusChange:info]: HA interconnect: Port ic1a link is down.
[node_name_2: cf_fastTimeout: cf.ic.heartBeatFailed:error]: HA interconnect: Heartbeat failed.
[node_name_2: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name_2 by node_name_1 disabled (unsynchronized log).
[node_name_2: rastrace_dump: rastrace.dump.saved:debug]: A RAS trace dump for module IC instance 0 was stored in /etc/log/rastrace/IC_0_20201027_17:15:50:245981.dmp.
[node_name_2: ctrl_hb_port_ic1a: ctrl.rdma.heartBeat:info]: HA interconnect: Missed heartbeat to
[node_name_2: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name_2 by node_name_1 disabled (HA interconnect error. Verify that the partner node is running and that the HA interconnect cabling is correct, if applicable. For further assistance, contact technical support).
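When reviewing a large EMS log for this signature, it can help to pull out the reset reason reported in each hw_assist takeover alert. The following is a minimal sketch, not a NetApp tool: the function name `takeover_reasons` and the regular expression are assumptions based only on the message layout shown above.

```python
import re

# Assumed pattern, derived from the hw_assist messages quoted in this article:
# "... alert from partner(<node>), system_down because <reason>."
HW_ASSIST_RE = re.compile(
    r"cf\.hwassist\.takeoverTrapRecv.*"
    r"alert from partner\(([^)]+)\), system_down because (\w+)"
)


def takeover_reasons(lines):
    """Return (partner_node, reason) for each hw_assist takeover alert."""
    results = []
    for line in lines:
        match = HW_ASSIST_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2)))
    return results


# Sample input copied from the EMS excerpt above.
EMS_SAMPLE = [
    "[node_name_2: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: "
    "hw_assist: Received takeover hw_assist alert from partner(node_name_1), "
    "system_down because reset_via_sp.",
    "[node_name_2: cf_hwassist: cf.hwassist.takeoverTrapRecv:notice]: "
    "hw_assist: Received takeover hw_assist alert from partner(node_name_1), "
    "system_down because l2_watchdog_reset.",
    "[node_name_2: cf_fastTimeout: cf.ic.heartBeatFailed:error]: "
    "HA interconnect: Heartbeat failed.",
]

if __name__ == "__main__":
    for partner, reason in takeover_reasons(EMS_SAMPLE):
        print(partner, reason)
```

In this sketch, a reason of `reset_via_sp` or `l2_watchdog_reset` corresponds to the two alerts shown in the excerpt; other reason strings are passed through unchanged.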

  • The affected node reboots and may operate normally once it recovers from the watchdog reset.
  • The BMC SEL log shows NMI and watchdog messages:

  420 | 03/03/2023 | 17:51:10 | CriticalInt | Software NMI | Asserted
  421 | 03/03/2023 | 17:51:10 | Watchdog2 | Timer interrupt | Asserted
  422 | 03/03/2023 | 17:51:12 | Watchdog2 | Hard reset | Asserted
  423 | 03/03/2023 | 17:51:12 | SysReset | State Asserted | Asserted
  424 | 03/03/2023 | 18:20:22 | Platform Security #0x00 | Transition to Off Line | Asserted
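The SEL pattern above (a Software NMI followed by a Watchdog2 hard reset) can be checked for in saved SEL output with a short script. This is a hedged sketch only: it assumes the pipe-separated, six-column layout shown in the excerpt, and the names `SelEntry`, `parse_sel_line`, and `has_watchdog_reset_signature` are hypothetical helpers, not part of any BMC or ONTAP tooling.

```python
from typing import NamedTuple, Optional


class SelEntry(NamedTuple):
    record_id: int
    date: str
    time: str
    sensor: str
    event: str
    state: str


def parse_sel_line(line: str) -> Optional[SelEntry]:
    """Parse one 'id | date | time | sensor | event | state' line."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 6 or not parts[0].isdigit():
        return None  # blank line or unexpected layout
    return SelEntry(int(parts[0]), *parts[1:])


def has_watchdog_reset_signature(lines) -> bool:
    """True if a Software NMI is later followed by a Watchdog2 hard reset."""
    nmi_seen = False
    for entry in map(parse_sel_line, lines):
        if entry is None:
            continue
        if entry.sensor == "CriticalInt" and entry.event == "Software NMI":
            nmi_seen = True
        elif nmi_seen and entry.sensor == "Watchdog2" and entry.event == "Hard reset":
            return True
    return False


# Sample input copied from the SEL excerpt above.
SEL_SAMPLE = """
  420 | 03/03/2023 | 17:51:10 | CriticalInt | Software NMI | Asserted
  421 | 03/03/2023 | 17:51:10 | Watchdog2 | Timer interrupt | Asserted
  422 | 03/03/2023 | 17:51:12 | Watchdog2 | Hard reset | Asserted
  423 | 03/03/2023 | 17:51:12 | SysReset | State Asserted | Asserted
"""

if __name__ == "__main__":
    print(has_watchdog_reset_signature(SEL_SAMPLE.splitlines()))
```

The check is deliberately order-sensitive: a hard reset that is not preceded by a Software NMI entry does not count as a match in this sketch.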


