
LIFs on RoCE ports are unexpectedly down

Category:
ontap-9
Specialty:
core
Applies to

  • ONTAP 9.13.1 and later
  • NFS over RDMA/RoCE
  • Mellanox/NVIDIA CX5/CX6/CX6-LX 10/25GbE or 40/100GbE NICs

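Issue

  • If a RoCE-capable port already hosts 127 or more NFS data LIFs:
    • LIF failover or migration may cause the LIF to go operationally down unexpectedly
    • LIF creation succeeds, but the LIF is operationally down and an error is logged in vifmgr.log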
clustershell::> network interface create -vserver vs0 -lif vs0_test -service-policy default-data-files -address 10.75.140.127 -netmask 255.255.255.0 -home-node node-02 -home-port e4a

Info: LIF "vs0_test" on Vserver "vs0" was created successfully but could not be successfully configured on either its home port or any of its failover targets. The LIF's operational status will be reported as "down" until one or more failover targets becomes available. Use the "network interface show -vserver vs0 -lif vs0_test -failover" command to review the LIF's current failover configuration.
  • Errors in vifmgr.log

Example:

(03/26/2024 16:41:03): > [Net::LifStackAdapter::installLif] vserverId=3, lifId=1278, address=10.95.86.122, portName=e3a, lifProtocols=0x1
(03/26/2024 16:41:03): > [SkStackMgr::addLif] PARAM lifId 1278, portName e3a, address 10.95.86.122, ipspaceId 4294967295, vserverId 3, lifUuid 98cd9a48-ea28-11ee-ad09-d039eaa9ecf3, isMccRequest false, lifProtocols 0x001, serviceMask 0x000000013D000804, homeNode perfqa-vino-03
(03/26/2024 16:41:03): > [NbladeWriter::addLif] PARAM: lifId: 1278, address 10.95.86.122, netmask 255.255.255.0, ipspaceId: 4294967295, vserverId: 3, portName: e3a, isMccRequest: false, protocolMask: 00000001, serviceMask: 0x000000013D000804, homeNode^I: perfqa-vino-03(ccdcca33-ea25-11ee-ad09-d039eaa9ecf3)
(03/26/2024 16:41:03): > [NbladeWriter::nitroPcpRpcCall] procNum=3, isIdemp=false
(03/26/2024 16:41:03): > [DelayTracker::add_sample] ENTRY: object=nblade, delay_ms=53
(03/26/2024 16:41:03): < [DelayTracker::add_sample] EXIT: object=nblade, state=NORMAL
(03/26/2024 16:41:03): < [NbladeWriter::nitroPcpRpcCall] elapsed time: 0s)
(03/26/2024 16:41:03): [NbladeWriter::ScopedNitroRequest::sendRequest] RPC for procedure 3 completed, but returned error: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): < [NbladeWriter::ScopedNitroRequest::sendRequest]
(03/26/2024 16:41:03): < [NbladeWriter::addLif] retval: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): [SkStackMgr::addLif] Unexpected error adding the LIF to the stack: NbladeWriter Error type unknown: 12046
(03/26/2024 16:41:03): < [SkStackMgr::addLif] complete, returning Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): [Net::LifStackAdapter::installLif] Failed to add the requested LIF: Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): [Net::AbortableHandle::commit] Caught an unexpected exception: Unexpected error "NbladeWriter Error type unknown: 12046" encountered as a result of adding the LIF.
(03/26/2024 16:41:03): ERR{ commit() at src/framework/objects/base/AbortableHandle.cc:65 }
  • The port is on a NIC with RoCE offload capability (for example, Mellanox/NVIDIA CX5/CX6/CX6-LX)

Example:

::> network port show -node node-02 -fields rdma-protocols
node     port rdma-protocols
-------- ---- --------------
node-02  e0M  -
node-02  e1a  roce
node-02  e1b  roce
node-02  e3a  roce
node-02  e3b  roce
node-02  e3c  roce
node-02  e3d  roce
7 entries were displayed.
  • RDMA is enabled on the NFS server (the default in ONTAP 9.10.1 and later)

Note: To verify this, run vserver nfs show -fields rdma
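
The conditions listed above can be checked offline against saved CLI output. The following is a minimal Python sketch, not a NetApp tool. It assumes the port text has the three-column layout of the network port show -fields rdma-protocols example, and that the LIF text has four whitespace-separated columns (vserver, lif, curr-node, curr-port) already limited to NFS data LIFs. The function names, that LIF column layout, and the use of the 12046 error string as a marker are illustrative assumptions taken from the examples in this article.

```python
import re
from collections import Counter

# Threshold taken from this article: the issue is described for ports that
# already host 127 or more NFS data LIFs.
ROCE_LIF_THRESHOLD = 127

# Error string copied from the vifmgr.log example in this article.
NBLADE_ERROR = re.compile(r"NbladeWriter Error type unknown: 12046")


def _is_noise(line):
    """True for blank lines, dashed separator rows, and 'N entries were displayed.'"""
    stripped = line.strip()
    return (not stripped
            or set(stripped) <= {"-", " "}
            or stripped.endswith("displayed."))


def roce_ports(port_show_text):
    """Return the set of (node, port) whose rdma-protocols column lists 'roce'."""
    ports = set()
    for line in port_show_text.splitlines():
        fields = line.split()
        if _is_noise(line) or len(fields) != 3:
            continue
        if "roce" in fields[2].lower().split(","):
            ports.add((fields[0], fields[1]))
    return ports


def lifs_per_port(lif_show_text):
    """Count LIFs per (node, port) from rows of: vserver lif curr-node curr-port."""
    counts = Counter()
    for line in lif_show_text.splitlines():
        fields = line.split()
        if _is_noise(line) or len(fields) != 4 or fields[0] == "vserver":
            continue
        counts[(fields[2], fields[3])] += 1
    return counts


def ports_at_risk(port_show_text, lif_show_text, threshold=ROCE_LIF_THRESHOLD):
    """Return sorted (node, port, lif_count) for RoCE ports at or above the threshold."""
    roce = roce_ports(port_show_text)
    counts = lifs_per_port(lif_show_text)
    return sorted(
        (node, port, n)
        for (node, port), n in counts.items()
        if (node, port) in roce and n >= threshold
    )


def has_lif_install_error(log_text):
    """True if the log text contains the NbladeWriter error shown in the example."""
    return bool(NBLADE_ERROR.search(log_text))
```

A port reported by ports_at_risk, combined with a match from has_lif_install_error on the node's vifmgr.log, would be consistent with the symptoms described here. Neither check confirms the RDMA setting on the NFS server, which still needs to be verified separately.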

 

 

