AIQUM event: MetroCluster IP inter-site connectivity status is Down
Applies to
- Active IQ Unified Manager
- MetroCluster IP
- ONTAP 9
Issue
- AIQUM alert: MetroCluster IP inter-site connectivity status is Down
- EMS log:
Mon Mar 10 16:51:39 +0800 [node01: mccip_mirror_congestion_mgr_p: mcc.network.congestion:notice]: Network congestion detected. Action taken: Increased ic_timeout to 1200 msec.
Mon Mar 10 16:51:51 +0800 [node01: wafl_exempt22: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_POLL_TIMEOUT'}
Mon Mar 10 16:51:51 +0800 [node01: wafl_exempt22: nvmm.mirror.aborting:debug]: mirror of sysid 2, partner_type DR PARTNER and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_MIRROR_POLL_TIMEOUT.
Mon Mar 10 16:51:51 +0800 [node01: nvmm_error: mirror.stream.qp.error:debug]: params: {'mirror': 'DR PARTNER', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}
Mon Mar 10 16:51:51 +0800 [node01: nvmm_error: ems.engine.suppressed:debug]: Event 'rdma.rlib.event.error' suppressed 6 times in last 2229 seconds.
Mon Mar 10 16:51:51 +0800 [node01: nvmm_error: rdma.rlib.event.error:debug]: QP wafl event error: client disconnect.
- Network switch port flapping:
Mar 8 06:38:15 2025 00290494 Ethernet1/2 100G UP SUCCESS(0x0)
Mar 8 06:33:39 2025 00854475 Ethernet1/10 100G UP SUCCESS(0x0)
Mar 8 06:23:04 2025 00226171 Ethernet1/10 ---- DOWN Link down debounce timer stopped and link is down
Mar 8 06:23:04 2025 00224929 Ethernet1/2 ---- DOWN Link down debounce timer stopped and link is down
Mar 8 06:23:04 2025 00123554 Ethernet1/10 ---- DOWN Link down debounce timer started(0x40e50006)
Mar 8 06:23:04 2025 00121922 Ethernet1/2 ---- DOWN Link down debounce timer started(0x40e50006)
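
To judge how long each switch port stayed down, the link events above can be paired per interface. The following is a minimal Python sketch, not a vendor tool. It assumes the field layout seen in the sample lines (timestamp, an unlabeled numeric field that is treated as opaque, interface, speed, state, reason). The names `parse_events` and `link_outages` are hypothetical helpers introduced only for illustration.

```python
import re
from datetime import datetime

# Layout inferred from the sample lines (an assumption): timestamp, an opaque
# numeric field, interface name, speed or "----", link state, free-text reason.
LINE_RE = re.compile(
    r"^(?P<ts>[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+"
    r"\d+\s+(?P<port>Ethernet\S+)\s+\S+\s+(?P<state>UP|DOWN)\b"
)


def parse_events(lines):
    """Return (timestamp, port, state) tuples sorted oldest first."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that does not look like a link event
        ts_text = " ".join(m.group("ts").split())  # normalize spacing
        ts = datetime.strptime(ts_text, "%b %d %H:%M:%S %Y")
        events.append((ts, m.group("port"), m.group("state")))
    events.sort(key=lambda e: e[0])
    return events


def link_outages(events):
    """Pair the first DOWN of each port with the next UP on the same port."""
    down_since = {}
    outages = []
    for ts, port, state in events:
        if state == "DOWN":
            # Keep the earliest DOWN; repeated DOWN lines belong to one outage.
            down_since.setdefault(port, ts)
        elif port in down_since:
            start = down_since.pop(port)
            outages.append((port, start, ts, (ts - start).total_seconds()))
    return outages


SAMPLE = """\
Mar 8 06:38:15 2025 00290494 Ethernet1/2 100G UP SUCCESS(0x0)
Mar 8 06:33:39 2025 00854475 Ethernet1/10 100G UP SUCCESS(0x0)
Mar 8 06:23:04 2025 00226171 Ethernet1/10 ---- DOWN Link down debounce timer stopped and link is down
Mar 8 06:23:04 2025 00224929 Ethernet1/2 ---- DOWN Link down debounce timer stopped and link is down
Mar 8 06:23:04 2025 00123554 Ethernet1/10 ---- DOWN Link down debounce timer started(0x40e50006)
Mar 8 06:23:04 2025 00121922 Ethernet1/2 ---- DOWN Link down debounce timer started(0x40e50006)
"""

if __name__ == "__main__":
    for port, start, end, seconds in link_outages(parse_events(SAMPLE.splitlines())):
        print(f"{port}: down {start} -> up {end} ({seconds:.0f} s)")
```

Run against the sample, the sketch pairs each port's first DOWN event with its next UP event, which makes it easier to compare outage windows on the switch with the timestamps of the EMS and AIQUM events. Lines that do not match the assumed layout are skipped rather than guessed at.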