
StorageGRID node connection state is Unknown due to a failed network adapter

Category:
storagegrid
Specialty:
sgrid

Applies to

  • NetApp StorageGRID
  • Bare metal-based Storage Nodes

Issue

  • The Storage Node connection state is Unknown in the Grid Manager UI:
Select Nodes > select the affected node > Overview
connection-state-unknown.png
  • servermanager.log indicates a network issue:
2021-01-23 12:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
 
  • The base OS messages log shows i40e errors, and all interfaces of bond0 are down:
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error interrupt
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: HMC error info 0x80000090, HMC error data 0x0
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: unhandled interrupt icr0=0x00010000
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: device will be reset
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.1: VSI seid 393 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Tx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 390 Rx ring 0 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Tx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: i40e 0000:1a:00.0: VSI seid 392 Rx ring 128 disable timeout
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it
Jan 23 12:33:41 dc1-sn1 kernel: device eno1 left promiscuous mode
Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!
Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.1: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: Rebuild AdminQ failed, err I40E_ERR_ADMIN_QUEUE_TIMEOUT aq_err OK
Jan 23 12:34:01 dc1-sn1 kernel: i40e 0000:1a:00.0: ignoring delete macvlan error on PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15
Jan 23 12:34:17 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15
...
Jan 23 12:39:10 dc1-sn1 journal: Possible network isolation: Node has no contact with other nodes. If this warning persists, use the /usr/sbin/add_node_ip.py command to tell this node the address of another node in the grid. See the Recovery and Maintenance Guide for details.
Jan 23 12:39:10 dc1-sn1 journal: 2021-05-23 13:39:10 +0000 | dynip | Possible network isolation: Node has no contact with other nodes.
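The symptoms in the log excerpts above can be checked for with a short script. The following is a minimal sketch, not a NetApp tool: the function name `summarize_nic_failure` and the summary keys are hypothetical, and the regular expressions assume the exact message wording shown in the excerpts. It counts i40e "PF reset failed" messages per PCI address, records which bond members went down, and counts network isolation warnings.

```python
import re
from collections import Counter

# Patterns mirror the message wording in the excerpts above (an assumption:
# other driver or kernel versions may word these messages differently).
PF_RESET_FAILED = re.compile(r"i40e (\S+): PF reset failed")
BOND_MEMBER_DOWN = re.compile(
    r"(bond\d+): link status definitely down for interface (\S+),"
)
BOND_NO_ACTIVE = re.compile(r"(bond\d+): now running without any active interface")
ISOLATION = "Possible network isolation"


def summarize_nic_failure(lines):
    """Summarize adapter and bond failure symptoms found in syslog-style lines."""
    pf_reset_failures = Counter()   # PCI address -> number of failed resets
    bond_members_down = {}          # bond name -> set of member interfaces
    bonds_without_active = set()    # bonds reported with no active interface
    isolation_warnings = 0

    for line in lines:
        match = PF_RESET_FAILED.search(line)
        if match:
            pf_reset_failures[match.group(1)] += 1
        match = BOND_MEMBER_DOWN.search(line)
        if match:
            bond_members_down.setdefault(match.group(1), set()).add(match.group(2))
        match = BOND_NO_ACTIVE.search(line)
        if match:
            bonds_without_active.add(match.group(1))
        if ISOLATION in line:
            isolation_warnings += 1

    return {
        "pf_reset_failures": dict(pf_reset_failures),
        "bond_members_down": {b: sorted(m) for b, m in bond_members_down.items()},
        "bonds_without_active": sorted(bonds_without_active),
        "isolation_warnings": isolation_warnings,
    }


# A few lines copied from the excerpt above, used as sample input.
sample = [
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno1, disabling it",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: now running without any active interface!",
    "Jan 23 12:33:41 dc1-sn1 kernel: bond0: link status definitely down for interface eno2, disabling it",
    "Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.1: PF reset failed, -15",
    "Jan 23 12:33:57 dc1-sn1 kernel: i40e 0000:1a:00.0: PF reset failed, -15",
    "Jan 23 12:39:10 dc1-sn1 journal: Possible network isolation: Node has no contact with other nodes.",
]
print(summarize_nic_failure(sample))
```

A summary that shows failed PF resets together with a bond that has no active interface matches the pattern described in this article. To scan a real log, pass an open file object to the function in place of `sample`.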

 
