
ic.HAInterconnectLinkDown occurs frequently

Category:
fas-systems
Specialty:
hw

Applies to

  • ONTAP 9
  • AFF-A400
  • FAS8300
  • FAS8700
  • HA interconnect

Issue

  • "system ha interconnect status show" reports that link 0 and link 1 are down.

Cluster::*> system ha interconnect status show

                  Node: node-1
         Link 0 Status: down
         Link 1 Status: down
      Is Link 0 Active: false
      Is Link 1 Active: false
    IC RDMA Connection: up

                  Node: node-2
         Link 0 Status: down
         Link 1 Status: down
      Is Link 0 Active: false
      Is Link 1 Active: false
    IC RDMA Connection: up
2 entries were displayed.
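
When the command output is captured as text, the check for down links can be scripted. The following is a minimal Python sketch, assuming the output layout shown in this article; the function name find_down_links and the SAMPLE string are illustrative and are not part of ONTAP.

```python
import re

# Sample text copied from the output in this article (assumed layout).
SAMPLE = """\
Node:node-1
Link 0 Status: down
Link 1 Status: down
Is Link 0 Active: false
Is Link 1 Active: false
IC RDMA Connection: up
Node:node-2
Link 0 Status: down
Link 1 Status: down
Is Link 0 Active: false
Is Link 1 Active: false
IC RDMA Connection: up
"""


def find_down_links(output: str) -> dict[str, list[int]]:
    """Map each node name to the link numbers whose status is not 'up'."""
    down: dict[str, list[int]] = {}
    node = None
    for line in output.splitlines():
        node_match = re.match(r"\s*Node:\s*(\S+)", line)
        if node_match:
            node = node_match.group(1)
            down.setdefault(node, [])
            continue
        # "Is Link 0 Active" lines do not match because the pattern is anchored at "Link".
        link_match = re.match(r"\s*Link (\d+) Status:\s*(\S+)", line)
        if link_match and node is not None:
            if link_match.group(2).lower() != "up":
                down[node].append(int(link_match.group(1)))
    return down


print(find_down_links(SAMPLE))
```

A node that maps to an empty list has no down links in the captured output.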

  • The EMS log reports "ic.HAInterconnectLinkDown" once every hour.

[?] Thu Apr 28 02:00:00 +0000 [node-1: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73387 minutes.
[?] Thu Apr 28 03:00:00 +0000 [node-1: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73447 minutes.
[?] Thu Apr 28 04:00:00 +0000 [node-1: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73507 minutes.
[?] Thu Apr 28 05:00:00 +0000 [node-1: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73567 minutes.

[?] Thu Apr 28 02:00:00 +0000 [node-2: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73309 minutes. 
[?] Thu Apr 28 03:00:00 +0000 [node-2: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73429 minutes.
[?] Thu Apr 28 04:00:00 +0000 [node-2: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73489 minutes.
[?] Thu Apr 28 05:00:00 +0000 [node-2: statd: ic.HAInterconnectLinkDown:error]: HA interconnect: External link #0 has been down for 73549 minutes.
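
The "has been down for N minutes" counter can be used to estimate when the link first went down. The sketch below is a hedged Python illustration, assuming the EMS line layout shown above; the function name estimate_down_since is hypothetical, and because the log line carries no year, the caller has to supply one (2022 is used here only as an assumption).

```python
import re
from datetime import datetime, timedelta

# Pattern for the EMS line layout shown in this article (an assumption).
EMS_PATTERN = re.compile(
    r"\w{3} (?P<mon>\w{3}) (?P<day>\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tz>[+-]\d{4}) "
    r"\[(?P<node>[^:]+): .*ic\.HAInterconnectLinkDown.*"
    r"External link #(?P<link>\d+) has been down for (?P<minutes>\d+) minutes"
)


def estimate_down_since(line: str, year: int):
    """Return (node, link, estimated link-down time), or None if the line does not match."""
    m = EMS_PATTERN.search(line)
    if m is None:
        return None
    # The year is not present in the log line, so it is taken from the caller.
    stamp = datetime.strptime(
        f"{year} {m['mon']} {m['day']} {m['time']} {m['tz']}",
        "%Y %b %d %H:%M:%S %z",
    )
    return m["node"], int(m["link"]), stamp - timedelta(minutes=int(m["minutes"]))


line = (
    "[?] Thu Apr 28 02:00:00 +0000 [node-1: statd: ic.HAInterconnectLinkDown:error]: "
    "HA interconnect: External link #0 has been down for 73387 minutes."
)
print(estimate_down_since(line, year=2022))  # year is an assumed value
```

Comparing the estimate with maintenance records for that date may help correlate the link failure with a cabling or hardware change.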

  • On node-1, "sysconfig -a" shows empty SFP information for e0a.

slot 0: 10G/25G Ethernet Controller CX5
  e0a MAC Address:   d0:39:ea:38:fb:bb (auto-unknown-fd-down)
    SFP Vendor:      
    SFP Part Number:      
    SFP Serial Number:      
  e0b MAC Address:   d0:39:ea:38:fb:bc (auto-25g_cr-fd-up)
    SFP Vendor:      Molex
    SFP Part Number:   1111455002
    SFP Serial Number:  XXXXXXXXXXXXXX
  Device Type:     CX5 PSID(NAP0000000006)
  Firmware Version:   16.26.4012
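
Blank SFP fields can be detected in captured "sysconfig -a" text. The following Python sketch assumes the slot layout shown above; the function name ports_missing_sfp is illustrative only, and the sample text reuses the values from this article.

```python
import re

# Slot 0 excerpt from this article; blank SFP fields are kept as in the original.
SAMPLE = """\
slot 0: 10G/25G Ethernet Controller CX5
  e0a MAC Address:   d0:39:ea:38:fb:bb (auto-unknown-fd-down)
    SFP Vendor:
    SFP Part Number:
    SFP Serial Number:
  e0b MAC Address:   d0:39:ea:38:fb:bc (auto-25g_cr-fd-up)
    SFP Vendor:      Molex
    SFP Part Number:   1111455002
    SFP Serial Number:  XXXXXXXXXXXXXX
  Device Type:     CX5 PSID(NAP0000000006)
  Firmware Version:   16.26.4012
"""


def ports_missing_sfp(sysconfig_text: str) -> list[str]:
    """Return port names with a blank SFP Vendor, Part Number, or Serial Number."""
    missing: list[str] = []
    port = None
    for line in sysconfig_text.splitlines():
        port_match = re.match(r"\s*(e\d+\w+) MAC Address:", line)
        if port_match:
            port = port_match.group(1)
            continue
        sfp_match = re.match(r"\s*SFP (Vendor|Part Number|Serial Number):\s*(.*)$", line)
        if sfp_match and port is not None:
            # An empty value means the SFP data was not read for this port.
            if not sfp_match.group(2).strip() and port not in missing:
                missing.append(port)
    return missing


print(ports_missing_sfp(SAMPLE))
```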

Cause

  • The SFP information on the interconnect port is not read correctly
  • Hardware failure of the interconnect cable or the onboard port

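Solution

  1. Inspect and reseat the SFP cable on node-1 e0a and node-2 e0a.
  2. If reseating the cable does not resolve the issue, replace the SFP cable between node-1 e0a and node-2 e0a.
  3. If replacing the SFP cable does not resolve the issue, perform a takeover from node-2.
  4. Perform a CFO giveback and run a local loopback test to determine which node is causing the problem.
  5. For further assistance, contact NetApp Technical Support and reference this article.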

Additional Information


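Internal reference

  • When troubleshooting issues related to HA IC ports and HA cluster ports, LED status alone is not sufficient to determine which side has failed.
  • All of the following steps are important for identifying the faulty side:
  1. Reseat the cable
  2. Replace the cable
  3. Loopback test
  4. Review the ASUP logs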

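Example:

Case# 2009500112

  • During the cable reseat test, the LED on port e0b of node1 was off, while the LED on port e0b of node2 stayed solid green.
  • We therefore concluded that the network port on node1 was faulty.
  • It later turned out, however, that a problem with the network port on node2 was causing the LED on node1's port to go dark.
  • In this example, the sysconfig -a output contains the key information: the port information for node2 is missing the SFP details.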

Node1

 slot 0: 10G/25G Ethernet Controller CX5
  e0a MAC Address:   d0:39:ea:xx:xx:xx (auto-25g_cr-fd-up)
    SFP Vendor:      Molex
    SFP Part Number:   1111455002
    SFP Serial Number:  xxxxxxxxxxxxx
  e0b MAC Address:   d0:39:ea:xx:xx:xx (auto-unknown-fd-down)
    SFP Vendor:      Molex
    SFP Part Number:   1111455002
    SFP Serial Number:  xxxxxxxxxxxxx
  Device Type:     CX5 PSID(NAP0000000006)
  Firmware Version:   16.26.4012

Node2

 slot 0: 10G/25G Ethernet Controller CX5
  e0a MAC Address:   d0:39:ea:xx:xx:xx (auto-25g_cr-fd-up)
    SFP Vendor:      Molex
    SFP Part Number:   1111455002
    SFP Serial Number:  xxxxxxxxxxxxx
  e0b MAC Address:   d0:39:ea:xx:xx:xx (auto-unknown-fd-down)
    SFP Vendor:      
    SFP Part Number:      
    SFP Serial Number:      
  Device Type:     CX5 PSID(NAP0000000006)
  Firmware Version:   16.26.4012
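
The reasoning in this example can be sketched in Python: the e0b link is down on both nodes, so link state alone is ambiguous, but only one side also lacks SFP information. The function name suspect_nodes and the dictionary layout are hypothetical, and the result is only a hint; the reseat, replacement, and loopback steps listed earlier are still needed to confirm the faulty side.

```python
def suspect_nodes(ports: dict[str, dict[str, str]]) -> list[str]:
    """Return nodes whose link is down AND whose SFP vendor field is blank."""
    return [
        node
        for node, info in ports.items()
        if info["link"] == "down" and not info["sfp_vendor"].strip()
    ]


# e0b values taken from the Node1 and Node2 output in this example.
e0b = {
    "node1": {"link": "down", "sfp_vendor": "Molex"},
    "node2": {"link": "down", "sfp_vendor": ""},
}

print(suspect_nodes(e0b))  # prints ['node2']
```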

