
"Disk redundancy failed" caused by a transceiver issue

Visibility: Internal
Category: metrocluster
Specialty: metrocluster

Applies to

  • MetroCluster IP
  • Cisco back-end switches

Issue

  1. Error messages:

Tue Sep 03 04:51:31 +0200 [ClusterA-02: wafl_exempt09: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_POLL_TIMEOUT'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: wafl_exempt09: nvmm.mirror.aborting:debug]: mirror of sysid 2, partner_type DR PARTNER and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_MIRROR_POLL_TIMEOUT.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: ems.engine.suppressed:debug]: Event 'rdma.rlib.event.error' suppressed 11 times in last 263 seconds.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: rdma.rlib.event.error:debug]: QP wafl event error: client disconnect.
Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'DR_PARTNER'}
Tue Sep 03 04:51:31 +0200 [ClusterA-02: DR_heartbeat_thread: cf.ic.xferTimedOut:error]: HA interconnect: MCC_DRSOM transfer timed out.

These are followed by a successful reconnection, for example:

Tue Sep 03 04:51:32 +0200 [ClusterA-02: iw_cm_wq: rdma.rlib.connected:debug]: wafl:DR:A QP is now connected.
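To gauge how long the DR mirror stays offline before the reconnection, the timestamps of the two events can be compared. The following is a minimal sketch, not a NetApp tool: it assumes the EMS line layout shown in the excerpts above, English month abbreviations, and a caller-supplied year (the log lines carry none). The names `LINE`, `parse_ts`, and `mirror_gaps` are hypothetical.

```python
import re
from datetime import datetime

# Timestamp, then "[node: process: event:severity]", as in the excerpts above.
LINE = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[[^:\]]+: [^:\]]+: (?P<event>[\w.]+):\w+\]"
)

def parse_ts(ts, year):
    # The log lines omit the year, so the caller supplies it (assumption).
    # ts[4:] drops the weekday, e.g. "Tue ".
    return datetime.strptime(f"{year} {ts[4:]}", "%Y %b %d %H:%M:%S %z")

def mirror_gaps(lines, year):
    """Seconds between each nvmm.mirror.offlined and the next rdma.rlib.connected."""
    gaps = []
    offlined_at = None
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue  # not an EMS line in the expected layout
        when = parse_ts(m["ts"], year)
        if m["event"] == "nvmm.mirror.offlined":
            offlined_at = when
        elif m["event"] == "rdma.rlib.connected" and offlined_at is not None:
            gaps.append((when - offlined_at).total_seconds())
            offlined_at = None
    return gaps

sample = [
    "Tue Sep 03 04:51:31 +0200 [ClusterA-02: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'DR_PARTNER'}",
    "Tue Sep 03 04:51:32 +0200 [ClusterA-02: iw_cm_wq: rdma.rlib.connected:debug]: wafl:DR:A QP is now connected.",
]
print(mirror_gaps(sample, year=2024))  # year is a placeholder; prints [1.0]
```

Short, repeated gaps of this kind would be consistent with an intermittent link rather than a sustained outage, although the sketch itself draws no such conclusion.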

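  2. A large number of error messages, interleaved with successful retries, affecting multiple different disks (all of them remote disks):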

Tue Sep 03 04:51:34 +0200 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[2] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b: cdb 0x28:356b73b3:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[2] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b : cdb 0x28:356b6555:000d....
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b73b3:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b6555:000d.
Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device 0v.i1.0L17: request successful after retry #1/#0: cdb 0x28:356b73b3:000d (1967)
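When the log contains many such lines, tallying events per disk makes it easier to see whether the errors are spread across several remote disks. The sketch below is illustrative only and assumes the EMS line layout shown in the excerpts above; `EMS_LINE`, `DISK`, and `tally` are hypothetical names, not part of any NetApp tooling.

```python
import re
from collections import Counter

# "[node: process: event:severity]: message", as in the excerpts above.
EMS_LINE = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<proc>[^:\]]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: (?P<msg>.*)"
)
# Disk name as it appears in "Disk device <name>: ..." messages.
DISK = re.compile(r"Disk device (\S+?):")

def tally(lines):
    """Count events overall and per disk device named in the message."""
    events = Counter()
    per_disk = {}
    for line in lines:
        m = EMS_LINE.search(line)
        if not m:
            continue  # skip lines that do not follow the expected layout
        events[m["event"]] += 1
        d = DISK.search(m["msg"])
        if d:
            per_disk.setdefault(d.group(1), Counter())[m["event"]] += 1
    return events, per_disk

SAMPLE = [
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: doneq0: scsi.mcc.adt.ioTransportError:error]: mcc_adt[2] - Transport error during execution of command: HA status 0x13: CAM transport status 0x1b: cdb 0x28:356b73b3:000d.",
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.abortedByHost:error]: Disk device 0m.i1.2L17: Command aborted by host adapter: HA status 0x13: cdb 0x28:356b73b3:000d.",
    "Tue Sep 03 04:51:34 +0200 [ClusterA-02: scsi_cmdblk_strthr_admin: scsi.cmd.retrySuccess:debug]: Disk device 0v.i1.0L17: request successful after retry #1/#0: cdb 0x28:356b73b3:000d (1967)",
]

events, per_disk = tally(SAMPLE)
for disk, counts in sorted(per_disk.items()):
    print(disk, dict(counts))
```

Transport errors logged by `mcc_adt` carry no disk name in the message, so they appear only in the overall `events` count.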
