e0a/e0b link flaps on A300/FAS8200, A200/FAS2600, A220/FAS2700, and C190 may result in takeover

Applies to

AFF A300/FAS8200
AFF A200, FAS2650, FAS2620
AFF A220, AFF C190, FAS2750, FAS2720
ONTAP 9

Issue

Cluster port e0a or e0b (or both ports) flaps or goes down, with both nodes affected at the same time.

Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0b: snmp.link.down:info]: Interface 2 is down.
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0b: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0a: snmp.link.down:info]: Interface 1 is down.
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0a: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0b: snmp.link.down:info]: Interface 2 is down.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0b: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0a: snmp.link.down:info]: Interface 1 is down.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0a: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
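As a rough illustration, the link-down entries above can be checked for both cluster ports dropping in the same second. The Python sketch below groups netif.linkDown events by node and timestamp. The regular expression is inferred only from the sample lines in this article, and the function names are hypothetical helpers, not part of ONTAP.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample lines in this article (an assumption), e.g.:
# Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0b: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
LINK_DOWN = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}) \w+ "
    r"\[(?P<node>[^:]+): [^/]+/(?P<port>e0[ab]): netif\.linkDown:info\]"
)


def ports_down_by_node(lines):
    """Map (node, timestamp) to the set of cluster ports reported down."""
    events = defaultdict(set)
    for line in lines:
        match = LINK_DOWN.match(line.strip())
        if match:
            events[(match["node"], match["ts"])].add(match["port"])
    return dict(events)


def simultaneous_outages(lines):
    """Return sorted (node, timestamp) pairs where e0a and e0b dropped together."""
    grouped = ports_down_by_node(lines)
    return sorted(key for key, ports in grouped.items() if ports == {"e0a", "e0b"})


SAMPLE = """\
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0b: snmp.link.down:info]: Interface 2 is down.
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0b: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Tue Oct 03 11:08:31 CEST [node1: ixgbe/e0a: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0b: netif.linkDown:info]: Ethernet e0b: Link down, check cable.
Tue Oct 03 11:08:32 CEST [node2: ixgbe/e0a: netif.linkDown:info]: Ethernet e0a: Link down, check cable.
"""

if __name__ == "__main__":
    for node, ts in simultaneous_outages(SAMPLE.splitlines()):
        print(f"{ts}: both cluster ports down on {node}")
```

Grouping by the exact timestamp string is a deliberate simplification; events that straddle a second boundary would need a small time window instead.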

Check the cluster port status and the storage failover status:

cluster::> network port show -role cluster
  (network port show)

Node: node1
                                                  Speed(Mbps) Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
--------- ------------ ---------------- ---- ---- ----------- --------
e0a       Cluster      Cluster          down 9000 1000/-      -
e0b       Cluster      Cluster          down 9000 1000/-      -

Node: node2
                                                  Speed(Mbps) Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
--------- ------------ ---------------- ---- ---- ----------- --------
e0a       Cluster      Cluster          down 9000 1000/-      -
e0b       Cluster      Cluster          down 9000 1000/-      -
4 entries were displayed.

cluster::storage failover> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
cluster-01     cluster-02     false    Connected to cluster-02, Partial
                                       giveback, Takeover is not possible:
                                       The version of software running on
                                       each node of the SFO pair is
                                       incompatible, NVRAM log not
                                       synchronized
cluster-02     cluster-01     -        Waiting for cluster applications to
                                       come online on the local node
                                       Offline applications: mgmt, vldb,
                                       vifmgr, bcomd, crs.
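When the port check is repeated across many clusters, the captured text of network port show can be scanned for down ports. The Python sketch below is one possible way to do that, assuming the command output has been saved in its normal multi-line form. The column positions are taken from the sample in this article, and down_cluster_ports is a hypothetical helper name.

```python
def down_cluster_ports(output):
    """Return (node, port) pairs whose Link column reads 'down'.

    Assumes the multi-line layout shown in this article: a 'Node: <name>'
    header followed by rows of seven whitespace-separated fields
    (Port, IPspace, Broadcast Domain, Link, MTU, Admin/Oper, Health Status).
    """
    node = None
    down = []
    for raw in output.splitlines():
        fields = raw.split()
        if len(fields) >= 2 and fields[0] == "Node:":
            node = fields[1]
        elif len(fields) >= 7 and fields[3] in ("up", "down"):
            # Header and separator rows never carry 'up' or 'down' in field 4.
            if fields[3] == "down":
                down.append((node, fields[0]))
    return down


SAMPLE = """\
cluster::> network port show -role cluster
  (network port show)

Node: node1
                                                  Speed(Mbps) Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
--------- ------------ ---------------- ---- ---- ----------- --------
e0a       Cluster      Cluster          down 9000 1000/-      -
e0b       Cluster      Cluster          down 9000 1000/-      -

Node: node2
                                                  Speed(Mbps) Health
Port      IPspace      Broadcast Domain Link MTU  Admin/Oper  Status
--------- ------------ ---------------- ---- ---- ----------- --------
e0a       Cluster      Cluster          down 9000 1000/-      -
e0b       Cluster      Cluster          down 9000 1000/-      -
4 entries were displayed.
"""

if __name__ == "__main__":
    for node, port in down_cluster_ports(SAMPLE):
        print(f"{node}: {port} is down")
```

Splitting on whitespace works here because none of the sample values contain spaces; an IPspace or broadcast domain name with spaces would need fixed-width parsing instead.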

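If the ports do not come back up and Connectivity, Liveliness and Availability Monitor (CLAM) is enabled

One of the nodes panics with an "out of quorum" message.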

PANIC : Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.

The node that panicked is taken over, and the surviving node serves all data.

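If the ports do not come back up and Connectivity, Liveliness and Availability Monitor (CLAM) is not enabled

No storage takeover occurs, and both nodes go out of quorum. Neither node serves data.
See: SU436: [Impact: Critical] Takeover on panic default configuration changed
Similar messages can be found in the EMS log: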
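The two cases above can be summarized as a small lookup. The Python sketch below only restates what this article describes for ports that stay down; the Outcome type and the function name are illustrative, and the code does not query or model ONTAP itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    takeover: bool
    nodes_serving_data: int
    summary: str


def outcome_when_ports_stay_down(clam_enabled: bool) -> Outcome:
    """Restate the article's two scenarios for cluster ports that never recover."""
    if clam_enabled:
        return Outcome(
            takeover=True,
            nodes_serving_data=1,
            summary="One node panics out of quorum and is taken over; "
                    "the surviving node serves all data.",
        )
    return Outcome(
        takeover=False,
        nodes_serving_data=0,
        summary="No storage takeover; both nodes are out of quorum "
                "and neither serves data.",
    )


if __name__ == "__main__":
    for enabled in (True, False):
        print(f"CLAM enabled={enabled}: {outcome_when_ports_stay_down(enabled).summary}")
```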

Jun 08 12:30:09 [xxx-02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node naptp06c-02 has gone down unexpectedly.
Jun 08 12:30:10 [xxxc-02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node naptp06c-02 has gone down unexpectedly.
Jun 08 12:31:00 [xxx-02:monitor.globalStatus.critical:EMERGENCY]: Controller failover of xxx-01 is not possible: partner mailbox disks not accessible or invalid. One or more mirrored aggregates are degraded.
Jun 08 12:31:02 [xxx:callhome.clam.node.ooq:EMERGENCY]: Call home for NODE(S) OUT OF CLUSTER QUORUM.
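To spot this sequence in a saved EMS log, one option is to extract the EMERGENCY events and check whether a cluster port loss is followed by an out-of-quorum call home. The Python sketch below assumes the line layout of the sample above (timestamp, then [source:event:severity]: message). The regular expression and function names are hypothetical, not an ONTAP interface.

```python
import re

# Layout inferred from the sample EMS lines in this article (an assumption):
# Jun 08 12:30:09 [xxx-02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port ...
EMS_LINE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<source>[^:\]]+):(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)


def emergency_events(lines):
    """Return (timestamp, event name) for every EMERGENCY line, in log order."""
    found = []
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if match and match["severity"] == "EMERGENCY":
            found.append((match["ts"], match["event"]))
    return found


def link_loss_led_to_ooq(lines):
    """True if a cluster port loss is later followed by an out-of-quorum call home."""
    names = [event for _, event in emergency_events(lines)]
    if "vifmgr.clus.linkdown" not in names:
        return False
    first = names.index("vifmgr.clus.linkdown")
    return "callhome.clam.node.ooq" in names[first + 1:]


SAMPLE = """\
Jun 08 12:30:09 [xxx-02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node naptp06c-02 has gone down unexpectedly.
Jun 08 12:30:10 [xxxc-02:vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node naptp06c-02 has gone down unexpectedly.
Jun 08 12:31:00 [xxx-02:monitor.globalStatus.critical:EMERGENCY]: Controller failover of xxx-01 is not possible: partner mailbox disks not accessible or invalid. One or more mirrored aggregates are degraded.
Jun 08 12:31:02 [xxx:callhome.clam.node.ooq:EMERGENCY]: Call home for NODE(S) OUT OF CLUSTER QUORUM.
"""

if __name__ == "__main__":
    for ts, event in emergency_events(SAMPLE.splitlines()):
        print(ts, event)
    print("link loss followed by out of quorum:", link_loss_led_to_ooq(SAMPLE.splitlines()))
```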