跳转到主内容

MetroCluster IP中的节点意外重新启动

Views:
Visibility:
Public
Votes:
0
Category:
metrocluster
Specialty:
metrocluster
Last Updated:

适用场景

  • ONTAP 9
  • MetroCluster IP
  • AF-A700
  • X91166A T6卡

问题描述

  • 节点意外重新启动、并且没有问题描述指示
  • 显示HA配对节点正在获取磁盘预留的SP日志、该预留可能会在接管(管理接管)后发生: 

Apr 18 04:01:40 [NodeA1:clam.node.ooq:EMERGENCY]: Node (name=NodeA2,
ID=1001) is out of "CLAM quorum" (reason=quorum update).

A disk reservation was detected on disk 7a.10.3P3 at 18Apr2023 04:01:44   
Ordinarily, this will only occur if the partner node has taken over.
This node will be shutdown.
HALT: HA partner has taken over disk reservations
Uptime: 47d18h37m13s
System rebooting...

  • 由于检测信号丢失、在重新启动和接管触发之前不久会报告HA互连超时:
Sun Apr 18 20:35:39 +0200 [NodeA1: DR_heartbeat_thread: cf.ic.xferTimedOut:error]: HA interconnect: MCC_DRSOM transfer timed out. Sun Apr 18 20:35:39 +0200 [NodeA1: cf_firmware: cf.ic.xferTimedOut:error]: HA interconnect: OFW transfer timed out. Sun Apr 18 20:35:58 +0200 [NodeA1: cf_main:cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: Takeover initiated after no heartbeat was detected from the partner node.

Sign in to view the entire content of this KB article.

New to NetApp?

Learn more about our award-winning Support

NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.