
Disk operation errors observed on the host due to a network fault on the path from the switch to the target

Category:
ontap-9
Specialty:
san
Applies to

  • ONTAP 9
  • Brocade switches
  • AIX hosts

Issue

  • Disk operation error and Path Failed errors are observed in the errpt log on the AIX host:

errpt:
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
DCB47997   0912191024 T H hdisk33     DISK OPERATION ERROR
F31FFAC3   0912191024 I H hdisk33     PATH HAS RECOVERED
DE3B8540   0912190924 P H hdisk33     PATH HAS FAILED
F31FFAC3   0912190924 I H hdisk38     PATH HAS RECOVERED
DCB47997   0912190824 T H hdisk38     DISK OPERATION ERROR
DE3B8540   0912190824 P H hdisk38     PATH HAS FAILED
DCB47997   0912190824 T H hdisk31     DISK OPERATION ERROR
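
To line these host events up against storage-side logs, it can help to decode the errpt timestamps and group the entries per disk. The following is a minimal sketch, not a supported tool. It assumes the summary format shown above, the AIX `MMDDhhmmYY` timestamp layout, and that the two-digit year belongs to the 2000s. The function names `parse_errpt` and `group_by_disk` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

ERRPT_SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
DCB47997   0912191024 T H hdisk33     DISK OPERATION ERROR
F31FFAC3   0912191024 I H hdisk33     PATH HAS RECOVERED
DE3B8540   0912190924 P H hdisk33     PATH HAS FAILED
F31FFAC3   0912190924 I H hdisk38     PATH HAS RECOVERED
DCB47997   0912190824 T H hdisk38     DISK OPERATION ERROR
DE3B8540   0912190824 P H hdisk38     PATH HAS FAILED
DCB47997   0912190824 T H hdisk31     DISK OPERATION ERROR
"""


def parse_errpt(text):
    """Return (datetime, resource, description) tuples from errpt summary output."""
    events = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        # Skip the header and any line that is not a complete record.
        if len(fields) < 6 or fields[0] == "IDENTIFIER":
            continue
        _ident, stamp, _type, _cls, resource, description = fields
        # Assumption: timestamp is MMDDhhmmYY, e.g. 0912191024 -> Sep 12 19:10, year 24.
        when = datetime.strptime(stamp, "%m%d%H%M%y")
        events.append((when, resource, description.strip()))
    return events


def group_by_disk(events):
    """Group events per hdisk, oldest first."""
    by_disk = defaultdict(list)
    for when, resource, description in sorted(events):
        by_disk[resource].append((when, description))
    return dict(by_disk)


if __name__ == "__main__":
    for disk, entries in sorted(group_by_disk(parse_errpt(ERRPT_SAMPLE)).items()):
        for when, description in entries:
            print(disk, when.strftime("%Y-%m-%d %H:%M"), description)
```

The decoded times are in the host's local time zone, so any comparison with storage or switch logs has to account for time zone and clock differences between the devices.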

 

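  • The issue recovered on its own, without any action being taken on any device.
  • On the storage side, CRC and ITW errors are reported, and their timestamps correlate with the disk operation errors on the host.
    • ITW (invalid transmission word) errors are reported when frames are discarded.
    • CRC (cyclic redundancy check) errors are reported when data or any frame is corrupted.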

ITW&CRC.png

  • In addition, WQE errors with extended status 1d are reported in EMS during the issue.

Logs:

Thu Sep 12 18:59:35 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F240, VPI: 275, OX_ID: 1ECE, Status 0x3 Ext_Status 0x2
Thu Sep 12 19:01:06 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F8C0, VPI: 275, OX_ID: 9AD, Status 0x3 Ext_Status 0x1d
Thu Sep 12 19:02:06 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F240, VPI: 275, OX_ID: 2E76, Status 0x3 Ext_Status 0x1d
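
When many of these events are logged, a quick tally per initiator address (S_ID) and extended status shows which initiators are affected. The sketch below assumes the `fcp.io.status` message layout seen in the three sample lines above. The regular expression and the function name `tally_wqe_failures` are illustrative assumptions, and the meaning of each `Ext_Status` value is deliberately not interpreted here.

```python
import re
from collections import Counter

EMS_SAMPLE = """\
Thu Sep 12 18:59:35 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F240, VPI: 275, OX_ID: 1ECE, Status 0x3 Ext_Status 0x2
Thu Sep 12 19:01:06 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F8C0, VPI: 275, OX_ID: 9AD, Status 0x3 Ext_Status 0x1d
Thu Sep 12 19:02:06 +0530 [NetApp-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:2b IO WQE failure, Handle 0x1, Type 8, S_ID: 66F240, VPI: 275, OX_ID: 2E76, Status 0x3 Ext_Status 0x1d
"""

# Assumption: field order and spelling match the sample lines above.
WQE_RE = re.compile(
    r"IO WQE failure.*?S_ID: (?P<s_id>[0-9A-Fa-f]+).*?"
    r"Status (?P<status>0x[0-9A-Fa-f]+) Ext_Status (?P<ext>0x[0-9A-Fa-f]+)"
)


def tally_wqe_failures(text):
    """Count WQE failures per (S_ID, Ext_Status) pair; S_ID is lowercased."""
    counts = Counter()
    for line in text.splitlines():
        match = WQE_RE.search(line)
        if match:
            counts[(match.group("s_id").lower(), match.group("ext").lower())] += 1
    return counts


if __name__ == "__main__":
    for (s_id, ext), count in sorted(tally_wqe_failures(EMS_SAMPLE).items()):
        print(f"S_ID {s_id}  Ext_Status {ext}  count {count}")
```

Lowercasing the S_ID makes it directly comparable with the port addresses that a Brocade switch prints in lowercase.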


 

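  • On the switch side, no physical-layer issues or errors are reported on the switch ports that the storage and the host are connected to.
  • According to the sfpshow output, the SFP statistics are within the optimal range: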

=============
Slot 12/Port 18:
=============

RX Power:   -0.6   dBm (880.40uW)
TX Power:   0.4    dBm (1087.60 uW)
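
The two units in each sfpshow line describe the same reading, related by P(µW) = 1000 × 10^(dBm/10). As a small sanity check, the sketch below converts the microwatt figures back to dBm and confirms that they round to the printed values. The helper names are hypothetical, and the check says nothing about which power levels a given SFP model considers acceptable.

```python
import math


def dbm_to_uw(dbm):
    """Convert optical power from dBm to microwatts (0 dBm = 1 mW = 1000 uW)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def uw_to_dbm(uw):
    """Convert optical power from microwatts to dBm."""
    return 10.0 * math.log10(uw / 1000.0)


# Readings copied from the sfpshow output above: (dBm as printed, uW as printed).
READINGS = {"RX": (-0.6, 880.40), "TX": (0.4, 1087.60)}

if __name__ == "__main__":
    for name, (printed_dbm, printed_uw) in READINGS.items():
        recomputed = round(uw_to_dbm(printed_uw), 1)
        print(f"{name}: {printed_uw} uW -> {recomputed} dBm (printed {printed_dbm} dBm)")
```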

 

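  • No port flaps are observed in the fabriclog output.
  • The ISL ports were checked, and no errors were found in errdump.
  • No MAPS alerts were triggered for the affected ports.
  • All errors reported in porterrshow are historical, because the port statistics have not been cleared in the past 6 months.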

fabos/bin/switchshow :
Index Slot Port Address Media  Speed  State   Proto
============================================================
242   12   18   66f240  id     N32    Online  FC  F-Port  10:00:00:10:9b:9e:xx:xx
368   12   32   66f8c0  id     N32    Online  FC  F-Port  10:00:00:10:9b:9e:xx:xx

/fabos/cliexec/porterrshow :
      frames          enc     crc     crc     too     too     bad     enc     disc    link    loss    loss    frjt    fbsy    c3timeout       pcs     uncor
      tx      rx      in      err     g_eof   shrt    long    eof     out     c3      fail    sync    sig                     tx      rx      err     err
242:  341.0g  71.2g   0       0       0       0       0       0       0       89      0       0       0       0       0       0       0       0       0
368:  341.0g  71.2g   0       0       0       0       0       0       0       81      0       0       0       0       0       0       0       0       0
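
Reading the porterrshow rows by eye is error-prone because the header spans two lines. The sketch below attaches a name to each of the 19 counters and lists the non-zero error counters per port. The column names are an assumption built by joining the two header lines (for example `crc` over `err` becomes `crc_err`), and the parser only handles rows with exactly that column count.

```python
# Assumed column names, derived from the two-line porterrshow header above.
PORTERRSHOW_COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof", "too_shrt",
    "too_long", "bad_eof", "enc_out", "disc_c3", "link_fail", "loss_sync",
    "loss_sig", "frjt", "fbsy", "c3timeout_tx", "c3timeout_rx", "pcs_err",
    "uncor_err",
]

PORTERRSHOW_ROWS = """\
242:  341.0g   71.2g   0     0     0     0     0     0     0    89     0     0     0     0     0     0     0     0     0
368:  341.0g   71.2g   0     0     0     0     0     0     0    81     0     0     0     0     0     0     0     0     0
"""


def parse_porterrshow_rows(text):
    """Map port index -> {counter name: value string} for well-formed rows."""
    ports = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        values = rest.split()
        # Ignore header lines and rows whose column count does not match.
        if not sep or len(values) != len(PORTERRSHOW_COLUMNS):
            continue
        # Values stay strings because counters can carry k/m/g suffixes.
        ports[int(label.strip())] = dict(zip(PORTERRSHOW_COLUMNS, values))
    return ports


def nonzero_error_counters(counters):
    """Return the error counters that are not zero, ignoring frame totals."""
    totals = {"frames_tx", "frames_rx"}
    return {k: v for k, v in counters.items() if k not in totals and v != "0"}


if __name__ == "__main__":
    for port, counters in sorted(parse_porterrshow_rows(PORTERRSHOW_ROWS).items()):
        print(port, nonzero_error_counters(counters))
```

For both ports the only non-zero error counter is the class 3 discard count. The CRC and encoding counters are all zero.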

 

 

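  • No ITW or CRC errors are reported on the host-facing or target-facing switch ports, which indicates that the frames are corrupted after they leave the switch and before they reach the target.
  • The target HBA receives the frames in order, but one or more frames in the sequence are corrupted (CRC). The lower layers on the NetApp controller discard the corrupted frame, and the FCP/SCSI layer on the NetApp controller detects the missing frame or frames as WQE errors.
  • Because the frames are corrupted and the host receives no response or acknowledgment for them from the target, the host starts reporting disk operation error.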
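
The mechanism described above can be illustrated with a toy model: a receiver drops any frame whose checksum does not match, and the upper layer then sees a gap in the sequence. This is only a sketch. It uses Python's `zlib.crc32` as a stand-in checksum, and the dictionary "frames" and the function names are hypothetical. It does not model real Fibre Channel framing or how ONTAP raises WQE errors.

```python
import zlib


def make_frame(seq, payload):
    """Build a toy frame carrying a CRC-32 computed over its payload."""
    return {"seq": seq, "payload": payload, "crc": zlib.crc32(payload)}


def corrupt(frame):
    """Flip one payload bit in transit, leaving the stored CRC untouched."""
    damaged = bytearray(frame["payload"])
    damaged[0] ^= 0x01
    return {**frame, "payload": bytes(damaged)}


def receive(frames, expected_count):
    """Drop frames that fail the CRC check and report missing sequence numbers."""
    accepted = [f for f in frames if zlib.crc32(f["payload"]) == f["crc"]]
    seen = {f["seq"] for f in accepted}
    missing = [seq for seq in range(expected_count) if seq not in seen]
    return accepted, missing


# Four frames are sent; the third one is damaged on the link to the receiver.
frames = [make_frame(i, f"block-{i}".encode()) for i in range(4)]
frames[2] = corrupt(frames[2])
accepted, missing = receive(frames, expected_count=4)

if __name__ == "__main__":
    print("accepted:", [f["seq"] for f in accepted])
    print("missing:", missing)
```

The sender's side of the link sees nothing wrong in this model, which mirrors the clean switch-port counters: the damage is only visible at the receiver.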

 

  • As a temporary workaround, you can disable the port so that no frames are sent over that path in case the host does not fail over the path on its own.

