
AFF A320: Takeover disabled (unsynchronized log), with pause frames and Rx bus overruns on cluster/HA ports

Category:
ontap-9
Specialty:
hw

Applies to

  • AFF A320
  • Cisco cluster switches
  • ONTAP 9
  • Priority Flow Control (PFC)

Issue

  • The following alerts appear frequently in the event/EMS logs:

Tue Oct 31 10:17:51 +0530 [Node-01: irq191: e0d: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}
Tue Oct 31 10:17:51 +0530 [Node-01: irq191: e0d: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_STREAM'}
Tue Oct 31 10:17:51 +0530 [Node-01: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'RAID', 'error': 'NVMM_ERR_STREAM'}
Tue Oct 31 10:17:51 +0530 [Node-01: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'MISC', 'error': 'NVMM_ERR_STREAM'}
Tue Oct 31 10:17:51 +0530 [Node-01: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}
Tue Oct 31 10:17:52 +0530 [Node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:debug]: Failover monitor: takeover of Node-01 by Node-02 disabled (unsynchronized log).
Tue Oct 31 10:17:54 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCING to NVMM_MIRROR_LAYOUT_SYNCED and took 1 msecs.
Tue Oct 31 10:17:54 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCED to NVMM_MIRROR_SYNCING_START and took 0 msecs.
Tue Oct 31 10:17:54 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_SYNCING_START is aborted because of reason NVMM_ERR_STREAM_MAP.
Tue Oct 31 10:17:54 +0530 [Node-01: nvmm_error: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_OFFLINE is aborted because of reason NVMM_ABORT_SYNCING_MIRROR.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_13: rdma.rlib.connected:debug]: wafl:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_13: rdma.rlib.connected:debug]: raid:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_13: rdma.rlib.connected:debug]: misc:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_12: rdma.rlib.connected:debug]: wafl:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_12: rdma.rlib.connected:debug]: raid:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: ib_cm_12: rdma.rlib.connected:debug]: misc:HA:A QP is now connected.
Tue Oct 31 10:17:55 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_SYNCING_START to NVMM_MIRROR_CP1_START and took 27 msecs.
Tue Oct 31 10:17:55 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_CP1_START to NVMM_MIRROR_WAFL_INIT and took 8 msecs.
Tue Oct 31 10:17:55 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_WAFL_INIT to NVMM_MIRROR_CP2_FINISH and took 15 msecs.
Tue Oct 31 10:17:56 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_CP2_FINISH to NVMM_MIRROR_WAFL_HEADER and took 426 msecs.
Tue Oct 31 10:17:56 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_WAFL_HEADER to NVMM_MIRROR_SYNCING_OTHER and took 12 msecs.
Tue Oct 31 10:17:56 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_SYNCING_OTHER to NVMM_MIRROR_ONLINE and took 133 msecs.
Tue Oct 31 10:17:56 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.onlined:debug]: params: {'mirror': 'HA_PARTNER'}
Tue Oct 31 10:17:57 +0530 [Node-01: cf_main: cf.fsm.takeoverByPartnerEnabled:debug]: Failover monitor: takeover of Node-01 by Node-02 enabled
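
To gauge how long takeover stays disabled during each occurrence, the EMS lines above can be paired up per node. The following Python is a minimal sketch, assuming the log has been saved as plain text in the format shown. The function names and the `year` parameter are hypothetical (the log lines carry no year, so a placeholder is used only to build timestamps for subtraction).

```python
import re
from datetime import datetime

# Captures the bracketed header, e.g. "Node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:debug"
BRACKET = re.compile(r"\[([^\]]+)\]")


def parse_ems_line(line, year=2000):
    """Return (timestamp, node, event) for one EMS line, or None if it does not match."""
    match = BRACKET.search(line)
    fields = line.split()
    if match is None or len(fields) < 5:
        return None
    parts = [p.strip() for p in match.group(1).split(":")]
    if len(parts) < 3:
        return None
    # fields: weekday, month, day, HH:MM:SS, UTC offset. The year is a placeholder (assumption).
    try:
        ts = datetime.strptime(
            f"{year} {fields[1]} {fields[2]} {fields[3]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None
    # First token is the node name; the token before the severity is the event name.
    return ts, parts[0], parts[-2]


def takeover_disabled_windows(lines, year=2000):
    """Pair each takeoverByPartnerDisabled event with the next ...Enabled event per node."""
    open_since = {}
    windows = []
    for line in lines:
        parsed = parse_ems_line(line, year)
        if parsed is None:
            continue
        ts, node, event = parsed
        if event == "cf.fsm.takeoverByPartnerDisabled":
            open_since.setdefault(node, ts)
        elif event == "cf.fsm.takeoverByPartnerEnabled" and node in open_since:
            start = open_since.pop(node)
            windows.append((node, start, (ts - start).total_seconds()))
    return windows


sample = [
    "Tue Oct 31 10:17:51 +0530 [Node-01: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}",
    "Tue Oct 31 10:17:52 +0530 [Node-01: cf_main: cf.fsm.takeoverByPartnerDisabled:debug]: Failover monitor: takeover of Node-01 by Node-02 disabled (unsynchronized log).",
    "Tue Oct 31 10:17:56 +0530 [Node-01: nvmm_mirror_sync: nvmm.mirror.onlined:debug]: params: {'mirror': 'HA_PARTNER'}",
    "Tue Oct 31 10:17:57 +0530 [Node-01: cf_main: cf.fsm.takeoverByPartnerEnabled:debug]: Failover monitor: takeover of Node-01 by Node-02 enabled",
]

for node, start, seconds in takeover_disabled_windows(sample):
    print(f"{node}: takeover disabled for {seconds:.0f} s starting {start:%b %d %H:%M:%S}")
```

For the excerpt above, this reports a single 5-second window on Node-01 (10:17:52 to 10:17:57). Counting such windows over a day gives a rough measure of how often the NVRAM mirror is dropping.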

  • The ifstat output shows pause frames and bus overruns:

::> system node run -node <node_name> -command ifstat <port>

  -- interface  e0a  (17 days, 9 hours, 10 minutes, 28 seconds) --        
  
  RECEIVE
  Total frames:   42258m | Frames/second:   28138  | Total bytes:     198t
  Bytes/second:    132m | Total errors:     0  | Errors/minute:     0 
  Total discards:   2906k | Discards/minute:   116  | Multi/broadcast: 16226k
  Non-primary u/c:    0  | Errored frames:    0  | Unsupported Op:    0 
  CRC errors:      0  | Runt frames:      0  | Fragment:       0 
  Long frames:      0  | Jabber:        0  | Length errors:     0 
  Alignment errors:   0  | No buffer:       0  | Pause:         0 
  Jumbo:       23587m | Error symbol:     0  | Bus overruns:    2906k
  Queue drops:      0  | LRO segments:   24581m | LRO bytes:      197t
  LRO6 segments:     0  | LRO6 bytes:      0  | Bad UDP cksum:     0 
  Bad UDP6 cksum:    0  | Bad TCP cksum:     0  | Bad TCP6 cksum:    0 
  Mcast v6 solicit:   0  | Lagg errors:      0  | Lacp errors:      0 
  Lacp PDU errors:    0 
  TRANSMIT
  Total frames:   34805m | Frames/second:   23176  | Total bytes:     123t
  Bytes/second:   82286k | Total errors:     0  | Errors/minute:     0 
  Total discards:    0  | Queue overflow:    0  | Multi/broadcast:  3130k
  Collisions:      0  | Pause:        523k | Jumbo:       32575m
  Cfg Up to Downs:    4  | TSO segments:    1744m | TSO bytes:      101t
  TSO6 segments:     0  | TSO6 bytes:      0  | HW UDP cksums:   1748k
  HW UDP6 cksums:    0  | HW TCP cksums:   25077m | HW TCP6 cksums:    0 
  Mcast v6 solicit:   0  | Lagg drops:      0  | Lagg no buffer:    0 
  Lagg no entries:    0 
  DEVICE
  Mcast addresses:    7  | Rx MBuf Sz:     9216 
  LINK INFO
  Speed:        100G | Duplex:       full | Flowcontrol:     none
  Media state:    active | Up to downs:      3 | HW assist:     5655 
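
When several ports or nodes have to be checked, the counters of interest can be pulled out of saved ifstat output programmatically. The sketch below is illustrative only: it assumes the `Name: value | Name: value` layout shown above, keeps values as the raw strings ONTAP prints (no assumption is made about what the `k`/`m`/`t` suffixes multiply by), and uses hypothetical function names.

```python
# Section headings as they appear in the ifstat output above.
SECTIONS = {"RECEIVE", "TRANSMIT", "DEVICE", "LINK INFO"}


def parse_ifstat(text):
    """Return {section: {counter name: raw value string}} for one ifstat dump."""
    counters = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line in SECTIONS:
            section = line
            counters[section] = {}
            continue
        if section is None or ":" not in line:
            continue  # skips the interface banner and blank lines
        for cell in line.split("|"):
            name, sep, value = cell.partition(":")
            if sep:
                counters[section][name.strip()] = value.strip()
    return counters


def summarize(counters):
    """Collect the counters that this article's symptoms refer to."""
    rx = counters.get("RECEIVE", {})
    tx = counters.get("TRANSMIT", {})
    link = counters.get("LINK INFO", {})
    return {
        "rx_discards": rx.get("Total discards"),
        "rx_bus_overruns": rx.get("Bus overruns"),
        "rx_pause": rx.get("Pause"),
        "tx_pause": tx.get("Pause"),
        "flowcontrol": link.get("Flowcontrol"),
    }


sample = """
  RECEIVE
  Total discards:   2906k | Discards/minute:   116  | Multi/broadcast: 16226k
  Alignment errors:   0  | No buffer:       0  | Pause:         0
  Jumbo:       23587m | Error symbol:     0  | Bus overruns:    2906k
  TRANSMIT
  Collisions:      0  | Pause:        523k | Jumbo:       32575m
  LINK INFO
  Speed:        100G | Duplex:       full | Flowcontrol:     none
"""

summary = summarize(parse_ifstat(sample))
for key, value in summary.items():
    print(f"{key}: {value}")
```

In the output shown in this article, the receive discards and bus overruns are both 2906k, the port has transmitted 523k pause frames and received none, and Flowcontrol is reported as none.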

  • PFC on the switch ports connected to the controller's cluster/HA ports shows an operational (Oper) state of "Off":

SW-01# show interface priority-flow-control
slot  1
=======
============================================================
Port         Mode Oper(VL bmap)  RxPPP    TxPPP    
============================================================
Ethernet1/1     Auto Off       0      0      
Ethernet1/2     Auto Off       0      0      
Ethernet1/3     Auto Off       0      0      
Ethernet1/4     Auto Off       0      0      
Ethernet1/5     Auto Off       0      0      

SW-01# show interface priority-flow-control detail                                
Ethernet1/4                                                    
Admin Mode: Auto                                                 
Oper Mode: Off                                                  
VL bitmap:                                                    
Total Rx PFC Frames: 0                                              
Total Tx PFC Frames: 0                                              
---------------------------------------------------------------------------------------------------------------------
    |  Priority0  |  Priority1  |  Priority2  |  Priority3  |  Priority4  |  Priority5  |  Priority6  |  Priority7  |
---------------------------------------------------------------------------------------------------------------------
Rx  |0            |0            |0            |0            |0            |0            |0            |0            |
---------------------------------------------------------------------------------------------------------------------
Tx  |0            |0            |0            |0            |0            |0            |0            |0            |
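
To check many switch ports at once, the summary output of `show interface priority-flow-control` can be scanned for ports whose Oper column reads Off. This is a rough sketch, assuming the column layout shown above (port, mode, then On or Off); the function name is hypothetical, and the pattern only inspects the first three columns so that any text following the Oper state is ignored.

```python
import re

# Port name, admin mode, then the operational state (On or Off).
ROW = re.compile(
    r"^(?P<port>Ethernet\S+)\s+(?P<mode>\S+)\s+(?P<oper>On|Off)\b", re.IGNORECASE
)


def pfc_oper_off_ports(output):
    """Return the ports whose PFC operational state is Off, in the order listed."""
    off = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group("oper").lower() == "off":
            off.append(match.group("port"))
    return off


sample = """
Port         Mode Oper(VL bmap)  RxPPP    TxPPP
============================================================
Ethernet1/1     Auto Off       0      0
Ethernet1/2     Auto Off       0      0
Ethernet1/3     Auto Off       0      0
Ethernet1/4     Auto Off       0      0
Ethernet1/5     Auto Off       0      0
"""

print(pfc_oper_off_ports(sample))
```

For the summary output in this article, all five listed ports (Ethernet1/1 through Ethernet1/5) are returned. Header and separator lines are skipped because they do not start with a port name.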
 
