FAS8300, FAS8700, AFF/ASA C400 onboard ports go down, resulting in takeover

Category:
fas-systems
Specialty:
hw
Applies to

  • FAS8300
  • FAS8700
  • AFF A400, AFF C400
  • ASA A400, ASA C400

Issue

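  • Link down errors are seen on the onboard e0a/e0b/e0c/e0d Ethernet ports of a FAS8300, FAS8700, AFF A400, AFF C400, ASA A400, or ASA C400 system, followed by a node takeover.
    • The 25GbE Ethernet ports e0a and e0b are used for the HA interconnect
    • The 40/100GbE Ethernet ports e0c and e0d are typically used for the cluster interconnect (except on AFF A400 and ASA A400 systems with an X1151A in slot 3)
  • The following errors are seen: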

Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: intr: rlib.ifconfig.linkEvent:notice]: params: {'eventType': 'DOWN', 'ifname': 'e0a'}
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: intr: rlib.ifconfig.linkEvent:notice]: params: {'eventType': 'DOWN', 'ifname': 'e0b'}


Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0c: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0d: Link down, check cable.
Fri Oct 08 00:37:28 -0400 [node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node-01 by node-02 disabled (HA interconnect error. Verify that the partner node is running and that the HA interconnect cabling is correct, if applicable. For further assistance, contact technical support).
Fri Oct 08 00:37:28 -0400 [node-01: cf_firmware: cf.fm.partnerFwTransition:info]: params: {'progresscounter': '0', 'newstate': 'SF_UNKNOWN', 'prevstate': 'SF_UP'}
Fri Oct 08 00:37:30 -0400 [node-01: nvmm_mirror_sync: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_LAYOUT_SYNCING is aborted because of reason NVPM_ERR_MSG_SEND_FAILED.
Fri Oct 08 00:37:30 -0400 [node-01: vifmgr: vifmgr.portdown:notice]: A link down event was received on node node-01, port e0c.
Fri Oct 08 00:37:30 -0400 [node-01: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0c on node node-01 has gone down unexpectedly.


Fri Oct 08 00:37:32 -0400 [node-01: cf_main: cf.fsm.partnerNotResponding:notice]: Failover monitor: partner not responding
Fri Oct 08 00:37:32 -0400 [node-01: cf_main: cf.fsm.takeoverCountdown:info]: Failover monitor: takeover scheduled in 10 seconds

Fri Oct 08 00:37:42 -0400 [node-01: cf_main: cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: Takeover initiated after no heartbeat was detected from the partner node.
Fri Oct 08 00:37:42 -0400 [node-01: cf_main: cf.fsm.stateTransit:info]: Failover monitor: UP --> TAKEOVER

  • The cluster ports going down causes one node to be taken over.

Sep 25 04:34:47 [node02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0c on node node02 has gone down unexpectedly.
Sep 25 04:36:45 [node02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0d on node node02 has gone down unexpectedly.

  • One of the nodes panics with an "out of quorum" message.

PANIC  : Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.)
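
When triaging a collected EMS log, it can help to confirm that the symptom pattern described above is really present: link down events on all four onboard ports at about the same time, followed by a takeover. The following is a minimal sketch, not a NetApp tool. It assumes the long EMS line format shown in the first excerpt above and an English locale for month names. The names `parse_events`, `summarize`, `ONBOARD_PORTS`, the 5-second grouping window, and the placeholder year are all hypothetical choices made for illustration (the log lines carry no year).

```python
import re
from datetime import datetime, timedelta

# Long EMS format as shown in the article, for example:
# Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
EMS_LINE = re.compile(
    r"^\w{3} (?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) [+-]\d{4} "
    r"\[(?P<node>[^:]+): (?P<proc>[^:]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)
PORT_RE = re.compile(r"Ethernet (\w+): Link down")
ONBOARD_PORTS = {"e0a", "e0b", "e0c", "e0d"}  # onboard ports named in the article


def parse_events(lines, year=2000):
    """Yield (timestamp, node, event, message) for lines in the long EMS format.

    The year is a placeholder supplied by the caller (assumption), since the
    log lines do not include one. Lines in other formats are skipped.
    """
    for line in lines:
        m = EMS_LINE.match(line.strip())
        if m:
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            yield ts, m["node"], m["event"], m["msg"]


def summarize(lines, window_seconds=5, year=2000):
    """Report which onboard ports went down and how long until takeover started."""
    first_down = {}     # port name -> timestamp of its first link down event
    takeover_at = None  # timestamp of the first noHeartbeat takeover event
    for ts, _node, event, msg in parse_events(lines, year):
        if event == "netif.linkDown":
            m = PORT_RE.match(msg)
            if m and m.group(1) in ONBOARD_PORTS:
                first_down.setdefault(m.group(1), ts)
        elif event == "cf.fsm.takeover.noHeartbeat" and takeover_at is None:
            takeover_at = ts

    all_down = set(first_down) == ONBOARD_PORTS
    together = all_down and (
        max(first_down.values()) - min(first_down.values())
        <= timedelta(seconds=window_seconds)
    )
    delay = None
    if first_down and takeover_at is not None:
        delay = (takeover_at - min(first_down.values())).total_seconds()
    return {
        "ports_down": sorted(first_down),
        "all_onboard_down_together": together,
        "seconds_to_takeover": delay,
    }


# Lines copied from the excerpt in this article.
SAMPLE = """\
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0c: Link down, check cable.
Fri Oct 08 00:37:27 -0400 [node-01: kernel: netif.linkDown:info]: Ethernet e0d: Link down, check cable.
Fri Oct 08 00:37:32 -0400 [node-01: cf_main: cf.fsm.takeoverCountdown:info]: Failover monitor: takeover scheduled in 10 seconds
Fri Oct 08 00:37:42 -0400 [node-01: cf_main: cf.fsm.takeover.noHeartbeat:alert]: Failover monitor: Takeover initiated after no heartbeat was detected from the partner node.
""".splitlines()

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the sample lines, the summary lists all four onboard ports as down within the same window and reports 15 seconds between the first link down event and the start of the takeover, which matches the timestamps in the excerpt. The shorter log format in the second excerpt is deliberately not handled by this sketch.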
