
Continuous irdma messages on FAS2820 cause HA failure

Category:
fas-systems
Specialty:
hw

Applies to

  • FAS2820
  • Switchless cluster

Issue

  • The HA (high-availability) interconnect is down, and the following messages are logged continuously
  • In the EMS log:

nvmm_mirror_sync: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_LAYOUT_SYNCING is aborted because of reason NVPM_ERR_MSG_SEND_FAILED.    
nvmm_error: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_OFFLINE is aborted because of reason NVMM_ABORT_SYNCING_MIRROR.
cfdisk_config: cf.diskinventory.sendFailed:debug]: params: {'reason': 'HA Interconnect down', 'errorCode': '0'}   
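
When many of these EMS entries accumulate, it can help to pull out just the mirror state and abort reason from each one. The following is a minimal Python sketch, not a NetApp tool: the function name `mirror_aborts` and the regular expression are illustrative assumptions derived only from the two `nvmm.mirror.aborting` lines shown above, and other ONTAP releases may word the message differently.

```python
import re

# Assumption: the message wording matches the nvmm.mirror.aborting samples above.
ABORT_RE = re.compile(
    r"nvmm\.mirror\.aborting:\w+\]: mirror of sysid (?P<sysid>\d+), "
    r"partner_type (?P<partner>.+?) and mirror state (?P<state>\w+) "
    r"is aborted because of reason (?P<reason>\w+)"
)

def mirror_aborts(lines):
    """Return (mirror state, abort reason) pairs in the order they appear."""
    pairs = []
    for line in lines:
        m = ABORT_RE.search(line)
        if m:
            pairs.append((m.group("state"), m.group("reason")))
    return pairs
```

Feeding the function the lines of an exported EMS log (for example, a hypothetical `ems.log` read with `open(...).read().splitlines()`) gives a compact list of abort reasons to compare against the symptoms in this article.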

  • In SP-LATEST-CONSOLE-LOGS (AutoSupport):

e0b:irdma_process_aeq:390 ERR AEQ: abnormal ae_id = 0x50a (Connection error: The max number of retries has been reached), is_qp = 1, qp_id = 4431, ae_source = 5
e0a:irdma_process_aeq:390 ERR AEQ: abnormal ae_id = 0x103 (Invalid memory key (L-Key/R-Key)), is_qp = 1, qp_id = 4241, ae_source = 9
e0b:irdma_process_aeq:390 ERR AEQ: abnormal ae_id = 0x208 (QP error: Invalid operation detected by the remote peer), is_qp = 1
e0a:irdma_process_aeq:390 ERR AEQ: abnormal ae_id = 0x208 (QP error: Invalid operation detected by the remote peer), is_qp = 1, qp_id = 4549, ae_source = 5

e0a:irdma_process_aeq:380 ERR abnormal ae_id = 0x50a bool qp=1 qp_id = 36  ae_source=5
e0b:irdma_process_aeq:380 ERR abnormal ae_id = 0x50a bool qp=1 qp_id = 47  ae_source=5
e0a:irdma_process_aeq:380 ERR abnormal ae_id = 0x50a bool qp=1 qp_id = 46  ae_source=5
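
To judge whether the irdma messages are truly continuous and which ports they affect, the console log can be tallied by port and `ae_id`. The sketch below is a hedged illustration, not part of ONTAP: `count_aeq_errors` is a hypothetical helper, and its pattern assumes only the two message layouts shown above (with and without the `AEQ:` prefix).

```python
import re
from collections import Counter

# Assumption: covers only the two irdma_process_aeq layouts shown above.
AEQ_RE = re.compile(
    r"(?P<port>e\d+[a-z]):irdma_process_aeq:\d+ ERR (?:AEQ: )?"
    r"abnormal ae_id = (?P<ae_id>0x[0-9a-fA-F]+)"
)

def count_aeq_errors(lines):
    """Return a Counter keyed by (port, ae_id) for irdma AEQ error messages."""
    counts = Counter()
    for line in lines:
        # finditer rather than match: console output sometimes runs two
        # messages together on one line, and both should be counted.
        for m in AEQ_RE.finditer(line):
            counts[(m.group("port"), m.group("ae_id").lower())] += 1
    return counts
```

Calling `count_aeq_errors(log_text.splitlines()).most_common()` lists the most frequent port and event combinations first, which makes a steady stream of `0x50a` retry-limit errors on both `e0a` and `e0b` easy to spot.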

  • In HA-INTERCONNECT-STATUS (AutoSupport):

Link Status    
Link 0 Status    up
Link 1 Status    up
IC RDMA Connection    down
Is Link 0 Active    true
Is Link 1 Active    true
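
The distinctive pattern in this output is that both physical links report `up` while the RDMA connection reports `down`. A minimal Python sketch of that check follows. The helper names are hypothetical, and it assumes each row is a key and a value separated by a tab or by two or more spaces, as in the excerpt above; the real AutoSupport section may be laid out differently.

```python
import re

def parse_ic_status(text):
    """Parse 'key    value' rows into a dict; rows without a value are skipped."""
    status = {}
    for line in text.splitlines():
        # Assumption: key and value are separated by a tab or 2+ spaces.
        parts = re.split(r"\s{2,}|\t", line.strip(), maxsplit=1)
        if len(parts) == 2:
            status[parts[0]] = parts[1]
    return status

def links_up_but_rdma_down(status):
    """True when every 'Link N Status' is up but 'IC RDMA Connection' is down."""
    link_states = [v for k, v in status.items()
                   if re.fullmatch(r"Link \d+ Status", k)]
    return (bool(link_states)
            and all(v == "up" for v in link_states)
            and status.get("IC RDMA Connection") == "down")
```

A `True` result from `links_up_but_rdma_down(parse_ic_status(section_text))` corresponds to the symptom described here: the links are healthy at the physical level, yet the interconnect cannot establish its RDMA session.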

 

