
StorageGRID node connection state is Unknown due to a failed network adapter

Applies to

  • NetApp StorageGRID
  • Bare metal-based Storage Nodes

Issue

  • The connection state of a Storage Node is Unknown in the Grid Manager UI:
Select Nodes > select the affected node > Overview
connection-state-unknown.png
  • servermanager.log indicates a network issue:
2021-01-23 12:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
 
  • The base operating system messages log shows i40e errors, and all interfaces of bond0 are down:
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error interrupt
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error info 0x80000090, HMC error data 0x0
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Tx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Rx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it
Jan 23 12:33:41 dc1-sn1 kernel: device eno1 left promiscuous mode
Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.1: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: ignoring delete macvlan error on PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
...
Jan 23 12:39:10 dc1-sn1 journal: Possible network isolation: Node has no contact with other nodes. If this warning persists, use the /usr/sbin/add_node_ip.py command to tell this node the address of another node in the grid. See the Recovery and Maintenance Guide for details.
Jan 23 12:39:10 dc1-sn1 journal: 2021-05-23 13:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
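The symptoms above can be checked for mechanically. The following is a minimal sketch, not a NetApp tool: it scans log lines for the i40e, bond0, and dynip messages quoted in this article and reports whether all three appear together. The label names, the summarize and matches_symptom helpers, and the rule that one hit per category is enough are assumptions made for illustration.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this article.
# The label names on the left are hypothetical, chosen for this sketch.
PATTERNS = {
    "i40e_hmc_error": re.compile(r"i40e \S+: HMC error interrupt"),
    "i40e_pf_reset_failed": re.compile(r"i40e \S+: PF reset failed"),
    "bond_member_down": re.compile(
        r"bond0: link status definitely down for interface (\w+),"
    ),
    "bond_no_active": re.compile(r"bond0: now running without any active interface"),
    "network_isolation": re.compile(r"Possible network isolation"),
}


def summarize(lines):
    """Count pattern hits and collect the bond0 member interfaces that went down."""
    counts = Counter()
    down_interfaces = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[label] += 1
                if label == "bond_member_down":
                    down_interfaces.append(match.group(1))
    return counts, down_interfaces


def matches_symptom(counts):
    """Assumed rule: adapter reset failures, a bond with no active
    interface, and a network isolation warning must all be present."""
    return (
        counts["i40e_pf_reset_failed"] > 0
        and counts["bond_no_active"] > 0
        and counts["network_isolation"] > 0
    )


# Sample lines copied verbatim from the excerpts above.
SAMPLE = [
    "Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error interrupt",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it",
    "Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15",
    "Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15",
    "Jan 23 12:39:10 dc1-sn1 journal: 2021-05-23 13:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.",
]

counts, down_interfaces = summarize(SAMPLE)
print("pattern hits:", dict(counts))
print("bond0 members reported down:", down_interfaces)
print("matches the symptom described here:", matches_symptom(counts))
```

To run the same check against a real log, pass an open file object to summarize in place of SAMPLE. The location of the messages log depends on the host operating system and is not stated in this article.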

 

