
FCP targets and paths lost on multiple blade servers in an enclosure

Category:
ontap-9
Specialty:
san 2009400056

Applies to

  • ONTAP 9
  • HPE Synergy
  • Brocade Fabric OS 9.1
  • VMware ESXi

Issue

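  • Sporadic and intermittent loss of paths and targets on different ESXi hosts (blade servers in an HPE Synergy enclosure)
  • ONTAP confirms that the affected initiators are not logged in. The affected initiators and LIFs may change every few minutes.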

::*> fcp ping-igroup show -vserver SVM -igroup * -ext-status wwpn-not-logged_in
  (vserver fcp ping-igroup show)
          Igroup                                 Logical   Node    Ping      Extended
Vserver   Name           WWPN                    Interface Name    Status    Status
--------- -------------- ----------------------- --------- ------- --------- ------------------
SVM
          SYNERGYESXGRP1 20:00:xx:xx:xx:xx:xx:29 SVM_fc07  NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP2 20:00:xx:xx:xx:xx:xx:09 SVM_fc07  NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc02  NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc04  NODEA12 reachable wwpn-not-logged_in
          SYNERGYESXGRP7 20:00:xx:xx:xx:xx:xx:31 SVM_fc02  NODEA11 reachable wwpn-not-logged_in
          SYNERGYESXGRP8 20:00:xx:xx:xx:xx:xx:31 SVM_fc04  NODEA12 reachable wwpn-not-logged_in
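
Because the affected initiators and LIFs can change every few minutes, it may help to capture this output repeatedly and compare which node and LIF pairs are affected each time. The following is a minimal sketch, not a NetApp tool. The function name `not_logged_in_by_lif` is hypothetical, and the parser assumes the whitespace-separated six-field rows shown in the sample above.

```python
from collections import defaultdict

# Rows copied from the sample output above (WWPNs are masked in the source).
SAMPLE = """\
SVM
      SYNERGYESXGRP1 20:00:xx:xx:xx:xx:xx:29 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
      SYNERGYESXGRP2 20:00:xx:xx:xx:xx:xx:09 SVM_fc07 NODEA12 reachable wwpn-not-logged_in
      SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
      SYNERGYESXGRP4 20:00:xx:xx:xx:xx:xx:01 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
      SYNERGYESXGRP7 20:00:xx:xx:xx:xx:xx:31 SVM_fc02 NODEA11 reachable wwpn-not-logged_in
      SYNERGYESXGRP8 20:00:xx:xx:xx:xx:xx:31 SVM_fc04 NODEA12 reachable wwpn-not-logged_in
"""


def not_logged_in_by_lif(text):
    """Group (igroup, WWPN) pairs reported as not logged in by (node, LIF).

    Assumption: each data row has exactly six whitespace-separated fields:
    igroup, WWPN, LIF, node, ping status, extended status.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[5] != "wwpn-not-logged_in":
            continue  # skip headers, the vserver line, and other statuses
        igroup, wwpn, lif, node, _ping, _ext = fields
        groups[(node, lif)].append((igroup, wwpn))
    return dict(groups)


if __name__ == "__main__":
    for (node, lif), entries in sorted(not_logged_in_by_lif(SAMPLE).items()):
        print(node, lif, len(entries))
```

Running the same function on captures taken a few minutes apart and comparing the keys shows whether the problem moves between LIFs and nodes, as described above.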

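  • Sometimes the initiators are reported as logged in, but the affected hosts still miss targets and have dead paths
  • WQE failures with Ext_Status 0x16 are reported in EMS (event log show)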

fcp.io.status: STIO Adapter:2a IO WQE failure, Handle 0x5, Type 8, S_ID: 10902, VPI: 259, OX_ID: 24C, Status 0x3 Ext_Status 0x16
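
To relate a WQE failure to a fabric location, the S_ID can be split into its three address bytes. The sketch below is illustrative only: `parse_wqe_failure` is a hypothetical helper, and it assumes S_ID and OX_ID are printed in hexadecimal without a prefix (consistent with the `SID=0x10902` form in the hung-command messages). A 24-bit Fibre Channel address consists of a domain byte, an area byte, and a port byte; whether the area byte corresponds to a physical switch port depends on the fabric addressing mode.

```python
import re

# Pattern for the "IO WQE failure" message format shown above.
WQE_RE = re.compile(
    r"Adapter:(?P<adapter>\w+) IO WQE failure.*?"
    r"S_ID: (?P<s_id>[0-9A-Fa-f]+),.*?"
    r"OX_ID: (?P<ox_id>[0-9A-Fa-f]+), "
    r"Status (?P<status>0x[0-9A-Fa-f]+) Ext_Status (?P<ext_status>0x[0-9A-Fa-f]+)"
)


def parse_wqe_failure(message):
    """Extract adapter, source address and status codes from one EMS message.

    Returns None if the message does not look like a WQE failure.
    Assumption: S_ID and OX_ID are hexadecimal values without a 0x prefix.
    """
    m = WQE_RE.search(message)
    if m is None:
        return None
    s_id = int(m.group("s_id"), 16)
    return {
        "adapter": m.group("adapter"),
        "s_id": s_id,
        "domain": (s_id >> 16) & 0xFF,  # switch domain ID
        "area": (s_id >> 8) & 0xFF,     # area byte
        "port": s_id & 0xFF,            # port byte
        "ox_id": int(m.group("ox_id"), 16),
        "status": int(m.group("status"), 16),
        "ext_status": int(m.group("ext_status"), 16),
    }


if __name__ == "__main__":
    msg = ("fcp.io.status: STIO Adapter:2a IO WQE failure, Handle 0x5, Type 8, "
           "S_ID: 10902, VPI: 259, OX_ID: 24C, Status 0x3 Ext_Status 0x16")
    print(parse_wqe_failure(msg))
```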

  • FC target adapter resets following Command termination hung SRAM dumps (possibly, but not necessarily, on multiple storage controllers)

::> event log show -severity debug -event *fcp.io.status*hung*|*SRAM*
Time         Node        Severity    Event
------------------- ---------------- ------------- ---------------------------
12/21/2022 12:44:48 NODEA12      DEBUG      scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917a41c60 (state=0xa, flags=0x2,ctio_sent=2/2, RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff, SID=0x10902)
12/21/2022 12:44:48 NODEA12      DEBUG     fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917a41c60(state=10, flags=0x2, ctio_sent=2/2,RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff,SID=0x10902, Cmd[28], req_q_free:0)
12/21/2022 11:56:38 NODEA12      DEBUG      fcp.io.status: STIO Adapter:1a, found hung cmd:0xfffff8090d1a4010(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x14ef, OX_ID=0x264, RX_ID=0xffff,SID=0x10902, Cmd[2A], req_q_free:0)
12/21/2022 11:55:51 NODEA11      DEBUG      scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917b392f8 (state=0xa, flags=0x2,ctio_sent=2/3, RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff, SID=0x10c03)
12/21/2022 11:55:46 NODEA11      DEBUG      fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917b392f8(state=7, flags=0x0, ctio_sent=1/2,RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff,SID=0x10c03, Cmd[8A], req_q_free:0)
5 entries were displayed.
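
When many of these events are logged, a tally per node, adapter, and source ID can show whether the hung commands cluster on a few initiators. The sketch below is a hedged example rather than a supported utility: `tally_hung_commands` is a hypothetical name, and the regular expression assumes the `event log show` line layout seen in the output above.

```python
import re
from collections import Counter

# Matches only the "found hung cmd" lines; SRAM dump lines are ignored.
HUNG_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<node>\S+)\s+DEBUG\s+"
    r"fcp\.io\.status: STIO Adapter:(?P<adapter>\w+), found hung cmd:"
    r".*?SID=(?P<sid>0x[0-9A-Fa-f]+)"
)

# Lines copied verbatim from the event log output above.
SAMPLE_LOG = """\
12/21/2022 12:44:48 NODEA12      DEBUG      scsitarget.fcp.dump: FCP target SRAM dump generated for adapter 2a, fct_tpd_check_hung_commands: Command termination hung. cmd:0xfffff80917a41c60 (state=0xa, flags=0x2,ctio_sent=2/2, RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff, SID=0x10902)
12/21/2022 12:44:48 NODEA12      DEBUG     fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917a41c60(state=10, flags=0x2, ctio_sent=2/2,RecvExAddr=0x1aec, OX_ID=0x72, RX_ID=0xffff,SID=0x10902, Cmd[28], req_q_free:0)
12/21/2022 11:56:38 NODEA12      DEBUG      fcp.io.status: STIO Adapter:1a, found hung cmd:0xfffff8090d1a4010(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x14ef, OX_ID=0x264, RX_ID=0xffff,SID=0x10902, Cmd[2A], req_q_free:0)
12/21/2022 11:55:46 NODEA11      DEBUG      fcp.io.status: STIO Adapter:2a, found hung cmd:0xfffff80917b392f8(state=7, flags=0x0, ctio_sent=1/2,RecvExAddr=0x146e, OX_ID=0x178, RX_ID=0xffff,SID=0x10c03, Cmd[8A], req_q_free:0)
"""


def tally_hung_commands(log_text):
    """Count 'found hung cmd' events keyed by (node, adapter, SID)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = HUNG_RE.match(line.strip())
        if m:
            key = (m.group("node"), m.group("adapter"), m.group("sid").lower())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for key, n in tally_hung_commands(SAMPLE_LOG).most_common():
        print(key, n)
```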

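  • On the Brocade switches, c3timeout counters increase on TX for host ports and on RX for storage ports, which indicates end-device congestion. To check:
  1. Run statsclear
  2. Wait 15 minutes
  3. Run porterrshow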

Example

FID128:admin> porterrshow 8-13
        frames        enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy   c3timeout      pcs  uncor
         tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                   tx     rx    err    err
  8:   4.8m   8.4m      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  9:   1.1k   1.1k      0      0      0      0      0      0      0    717      0      0      0      0      0    717      0      0      0
 10:      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
 11:      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
 12:   3.9m   6.8m      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
 13:    976   1.2k      0      0      0      0      0      0      0    840      0      0      0      0      0    840      0      0      0
FID128:admin> porterrshow 20-23
        frames        enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy   c3timeout      pcs  uncor
         tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                   tx     rx    err    err
 20:   6.7m   7.6m      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
 21:   3.9m   2.5m      0      0      0      0      0      0      0    518      0      0      0      0      0      0    259      0      0
 22:   3.9m   2.5m      0      0      0      0      0      0      0    974      0      0      0      0      0      0    487      0      0
 23:   6.4m   3.0m      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
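
The porterrshow check above can be scripted when many ports need reviewing. The following is a rough sketch under stated assumptions, not a Brocade utility: `parse_count` and `c3timeout_ports` are hypothetical helpers, the column order (19 counters per port, with c3timeout tx and rx as the 16th and 17th) is taken from the sample output above and may differ on other Fabric OS versions, and the k/m/g suffixes are assumed to mean thousands, millions, and billions.

```python
def parse_count(token):
    """Convert a counter such as '717', '1.1k' or '4.8m' to a float.

    Assumption: k, m and g suffixes are decimal multipliers.
    """
    multipliers = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}
    suffix = token[-1].lower()
    if suffix in multipliers:
        return float(token[:-1]) * multipliers[suffix]
    return float(token)


def c3timeout_ports(porterrshow_text):
    """Return {port: (c3timeout_tx, c3timeout_rx)} for ports with a nonzero value."""
    flagged = {}
    for line in porterrshow_text.splitlines():
        fields = line.split()
        # Data rows: '<port>:' followed by 19 counters (layout assumed from the sample).
        if len(fields) != 20 or not fields[0].endswith(":"):
            continue
        port = int(fields[0].rstrip(":"))
        counters = [parse_count(tok) for tok in fields[1:]]
        tx, rx = counters[15], counters[16]
        if tx or rx:
            flagged[port] = (int(tx), int(rx))
    return flagged


if __name__ == "__main__":
    sample = """\
FID128:admin> porterrshow 8-13
  8:   4.8m   8.4m   0   0   0   0   0   0   0   0     0   0   0   0   0   0     0   0   0
  9:   1.1k   1.1k   0   0   0   0   0   0   0   717   0   0   0   0   0   717   0   0   0
  21:  3.9m   2.5m   0   0   0   0   0   0   0   518   0   0   0   0   0   0   259   0   0
"""
    for port, (tx, rx) in sorted(c3timeout_ports(sample).items()):
        print(f"port {port}: c3timeout tx={tx} rx={rx}")
```

Which flagged ports are host ports and which are storage ports has to come from the fabric's own port assignments; the script only reports the counters.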

