"Send queue of QP wafl is full" condition due to high-availability interconnect log transfer

Applies to

  • AFF8040, AFF A300, AFF A220, AFF A200, AFF C190
  • FAS8200, FAS2750, FAS2720, FAS2650, FAS2620
  • ONTAP 9
  • Cloud Volumes ONTAP (CVO)
  • Amazon FSx for NetApp ONTAP

Issue

  • High CPU utilization is driven by the WAFL_TALENK domain.
  • The EMS event log shows the following error messages:

Sun Nov 01 02:14:22 KST [node: wafl_exempt10: rdma.rlib.queue.full:notice]: Send queue of QP wafl is full.
Sun Nov 01 02:14:22 KST [node: wafl_exempt10: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state MIRROR_ONLINE is aborted because of reason Abort Pending.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: ems.engine.suppressed:debug]: Event 'ic.rdma.qpDisconnected' suppressed 18 times in last 33933 seconds.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: ic.rdma.qpDisconnected:debug]: wafl is disconnected.
Sun Nov 01 02:14:22 KST [node: nvram_sync: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA Partner Mirror Offlined'}
Sun Nov 01 02:14:22 KST [node: rastrace_dump: rastrace.dump.saved:debug]: A RAS trace dump for module IC instance 0 was stored in /etc/log/rastrace/IC_0_20201101_02:14:22:671353.dmp.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: ems.engine.suppressed:debug]: Event 'ic.rdma.qpConnected' suppressed 18 times in last 33933 seconds.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: ic.rdma.qpConnected:debug]: wafl is connected.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: rdma.rlib.connected:debug]: wafl QP is now connected.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: rdma.rlib.connected:debug]: raid QP is now connected.
Sun Nov 01 02:14:22 KST [node: ib_cm_wq: rdma.rlib.connected:debug]: misc QP is now connected.
Sun Nov 01 02:14:23 KST [node: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of node disabled (unsynchronized log).
Sun Nov 01 02:14:23 KST [node: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node by node disabled (unsynchronized log).
Sun Nov 01 02:14:31 KST [node: ctlg_flxlg_mirror: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state MIRROR_SYNCING_OTHER is aborted because of reason Abort Pending.
Sun Nov 01 02:14:34 KST [node: wafl_exempt14: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state MIRROR_SYNCING_CP1_START is aborted because of reason Abort Pending.
Sun Nov 01 02:14:56 KST [node: nvram_sync: nvmm.mirror.onlined:debug]: params: {'mirror': 'HA Partner Mirror Onlined'}
Sun Nov 01 02:14:57 KST [node: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node by node enabled
Sun Nov 01 02:15:00 KST [node: monitor: monitor.globalStatus.critical:EMERGENCY]: Controller failover of node is not possible: unsynchronized log.
Sun Nov 01 02:15:01 KST [node: cf_main: cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of node enabled
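When reviewing a longer EMS log for this pattern, it can help to measure how long the HA partner mirror stayed offline after each event. The following is a minimal Python sketch, not a NetApp tool. The line layout is inferred only from the excerpt above, the function and field names are hypothetical, and the year (2020) is an assumption taken from the RAS trace dump file name, because the EMS timestamps do not carry a year.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt (an assumption, not a documented format):
# "<weekday> <month> <day> <time> <tz> [<node>: <thread>: <event>:<severity>]: <message>"
EMS_LINE = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"\[(?P<node>[^:]+): (?P<thread>[^:]+): (?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)$"
)


def parse_ems_line(line, year):
    """Split one EMS line into fields; return None if it does not match."""
    match = EMS_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # The log omits the year, so the caller has to supply it.
    record["time"] = datetime.strptime(
        f"{year} {record['ts']}", "%Y %a %b %d %H:%M:%S"
    )
    return record


def mirror_offline_windows(lines, year):
    """Pair each nvmm.mirror.offlined event with the next nvmm.mirror.onlined."""
    windows = []
    offlined_at = None
    for line in lines:
        record = parse_ems_line(line, year)
        if record is None:
            continue
        if record["event"] == "nvmm.mirror.offlined":
            offlined_at = record["time"]
        elif record["event"] == "nvmm.mirror.onlined" and offlined_at is not None:
            seconds = (record["time"] - offlined_at).total_seconds()
            windows.append((offlined_at, record["time"], seconds))
            offlined_at = None
    return windows


# Three lines copied from the excerpt above.
sample = [
    "Sun Nov 01 02:14:22 KST [node: wafl_exempt10: rdma.rlib.queue.full:notice]: Send queue of QP wafl is full.",
    "Sun Nov 01 02:14:22 KST [node: nvram_sync: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA Partner Mirror Offlined'}",
    "Sun Nov 01 02:14:56 KST [node: nvram_sync: nvmm.mirror.onlined:debug]: params: {'mirror': 'HA Partner Mirror Onlined'}",
]

for start, end, seconds in mirror_offline_windows(sample, year=2020):
    print(f"HA partner mirror offline from {start:%H:%M:%S} to {end:%H:%M:%S} ({seconds:.0f} s)")
# prints: HA partner mirror offline from 02:14:22 to 02:14:56 (34 s)
```

In the excerpt, the cf.fsm.takeoverOfPartnerDisabled and cf.fsm.takeoverByPartnerDisabled events (02:14:23) fall inside this 34-second window.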

Note: This article does not apply to MetroCluster systems that use the FC-VI interconnect.

 
