CVO Azure: Unable to perform controller failover on a node: repeated unsynchronized log
Applies to
- Cloud Volumes ONTAP (CVO)
- Microsoft Azure
- Unsynchronized log
- Input/output (IO)
Issue
Mon Oct 02 04:24:50 -0400 [cvo-01: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of cvo-02 disabled (unsynchronized log).
Mon Oct 02 04:24:53 -0400 [cvo-01: cf_main: cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of cvo-02 enabled
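The two events above show takeover being disabled and then re-enabled a few seconds later. As a minimal sketch for measuring how long each such window lasts in an exported EMS log, the following pairs the `takeoverOfPartnerDisabled` and `takeoverOfPartnerEnabled` events per node. The regular expression, the function names, and the `year` parameter are assumptions made for illustration (EMS timestamps in this excerpt carry no year), and parsing the day and month names assumes an English locale.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the two sample events in this article.
EVENT_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:]+): cf_main: "
    r"cf\.fsm\.takeoverOfPartner(?P<state>Disabled|Enabled):"
)


def parse_ts(text, year):
    # The EMS timestamp has no year, so the caller supplies one (assumption).
    return datetime.strptime(f"{year} {text}", "%Y %a %b %d %H:%M:%S %z")


def takeover_gaps(lines, year=2023):
    """Return (node, disabled_at, seconds_until_enabled) for each window."""
    pending = {}  # node -> time of the first unmatched "Disabled" event
    gaps = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m is None:
            continue
        ts = parse_ts(m["ts"], year)
        node = m["node"]
        if m["state"] == "Disabled":
            pending.setdefault(node, ts)
        elif node in pending:
            start = pending.pop(node)
            gaps.append((node, start, (ts - start).total_seconds()))
    return gaps


sample = [
    "Mon Oct 02 04:24:50 -0400 [cvo-01: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: "
    "Failover monitor: takeover of cvo-02 disabled (unsynchronized log).",
    "Mon Oct 02 04:24:53 -0400 [cvo-01: cf_main: cf.fsm.takeoverOfPartnerEnabled:notice]: "
    "Failover monitor: takeover of cvo-02 enabled",
]

for node, start, seconds in takeover_gaps(sample):
    print(f"{node}: takeover disabled at {start.isoformat()} for {seconds:.0f} s")
```

Many short windows reported by the same node would match the repeated unsynchronized-log pattern described in this article.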
EMS.LOG.GZ reports that the underlying queue pairs (QPs) received poll timeout errors (NVMM_ERR_POLL_TIMEOUT).
Tue Sep 05 17:50:07 -0400 [cvo-02: mcc_cfd_rnic: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_POLL_TIMEOUT.
Tue Sep 05 17:50:07 -0400 [cvo-02: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'RAID', 'error': 'NVMM_ERR_POLL_TIMEOUT'}
Tue Sep 05 17:50:07 -0400 [cvo-02: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'MISC', 'error': 'NVMM_ERR_STREAM'}
Tue Sep 05 17:50:07 -0400 [cvo-02: nvmm_error: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}
Tue Sep 05 17:50:07 -0400 [cvo-02: nvmm_error: mirror.stream.qp.error:debug]: params: {'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_STREAM'}
Tue Sep 05 17:50:07 -0400 [cvo-02: nvmm_error: ems.engine.suppressed:debug]: Event 'rdma.rlib.event.error' suppressed 2 times in last 2754 seconds.
Tue Sep 05 17:50:07 -0400 [cvo-02: nvmm_error: rdma.rlib.event.error:debug]: QP wafl event error: client disconnect.
Tue Sep 05 17:50:07 -0400 [cvo-02: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}
Tue Sep 05 17:50:07 -0400 [cvo-02: rastrace_dump: rastrace.dump.saved:debug]: A RAS trace dump for module IC instance 0 was stored in /etc/log/rastrace/IC_0_20230905_17:50:07:464435.dmp.
Tue Sep 05 17:50:07 -0400 [cvo-02: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of cvo-02 by cvo-01 disabled (unsynchronized log).
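To see at a glance which queue pairs reported which errors in an excerpt like the one above, the `mirror.stream.qp.error` events can be tallied. This is a rough sketch, not a NetApp tool: it assumes the `params:` payload is always a Python-style dictionary literal, as it is in these sample lines, and the `MARKER` constant and function name are hypothetical.

```python
import ast
from collections import Counter

# Assumed marker that precedes the dictionary payload in the sample lines.
MARKER = "mirror.stream.qp.error:debug]: params: "


def qp_error_counts(lines):
    """Count (qp_name, error) pairs found in mirror.stream.qp.error events."""
    counts = Counter()
    for line in lines:
        _, found, payload = line.partition(MARKER)
        if not found:
            continue
        try:
            params = ast.literal_eval(payload.strip())
        except (ValueError, SyntaxError):
            continue  # skip payloads that are not a plain literal
        if isinstance(params, dict) and "qp_name" in params and "error" in params:
            counts[(params["qp_name"], params["error"])] += 1
    return counts


prefix = "Tue Sep 05 17:50:07 -0400 [cvo-02: "
sample = [
    prefix + "mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: "
    "{'mirror': 'HA Partner', 'qp_name': 'RAID', 'error': 'NVMM_ERR_POLL_TIMEOUT'}",
    prefix + "mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: "
    "{'mirror': 'HA Partner', 'qp_name': 'MISC', 'error': 'NVMM_ERR_STREAM'}",
    prefix + "nvmm_error: mirror.stream.qp.error:debug]: params: "
    "{'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_MIRROR_COMPLETION'}",
    prefix + "nvmm_error: mirror.stream.qp.error:debug]: params: "
    "{'mirror': 'HA Partner', 'qp_name': 'WAFL', 'error': 'NVMM_ERR_STREAM'}",
    prefix + "nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}",
]

for (qp, error), n in sorted(qp_error_counts(sample).items()):
    print(f"{qp:5} {error:28} {n}")
```

`ast.literal_eval` is used instead of `eval` so that log content is only ever parsed as a literal and never executed.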
The SKTRACE.GZ log indicates that RAID IO takes a long time to complete.
2023-09-06T19:15:32Z 3435898944123351 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: qp wafl wr_id 300000000000000000 num_of_req 2 op_type 1 op_size 256 took time 2321466 usecs pending_size:2184520
2023-09-06T19:15:32Z 3435898944131159 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: qp wafl wr_id 310000000000000000 num_of_req 2 op_type 1 op_size 344 took time 2320541 usecs pending_size:2184176
2023-09-06T19:15:32Z 3435898944168065 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: qp wafl wr_id 312000000000000000 num_of_req 2 op_type 1 op_size 4160 took time 2313630 usecs pending_size:2180016
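The `took time` field in the entries above is reported in microseconds, so each of these completions took roughly 2.3 seconds. A small sketch for pulling slow completions out of a decompressed SKTRACE file follows. The regular expression is inferred from the three sample lines only, and the 1-second threshold is an arbitrary illustrative value, not a NetApp-documented limit.

```python
import re

# Assumed layout of rlib_update_comp_info warnings, based on the sample lines.
COMP_RE = re.compile(
    r"rlib_update_comp_info: qp (?P<qp>\S+) wr_id \S+ .*?"
    r"op_size (?P<size>\d+) took time (?P<usecs>\d+) usecs"
)


def slow_completions(lines, threshold_s=1.0):
    """Return (qp, op_size, seconds) for completions at or above threshold_s."""
    slow = []
    for line in lines:
        m = COMP_RE.search(line)
        if m is None:
            continue
        seconds = int(m["usecs"]) / 1_000_000  # microseconds to seconds
        if seconds >= threshold_s:
            slow.append((m["qp"], int(m["size"]), seconds))
    return slow


sample = [
    "2023-09-06T19:15:32Z 3435898944123351 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: "
    "qp wafl wr_id 300000000000000000 num_of_req 2 op_type 1 op_size 256 "
    "took time 2321466 usecs pending_size:2184520",
    "2023-09-06T19:15:32Z 3435898944131159 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: "
    "qp wafl wr_id 310000000000000000 num_of_req 2 op_type 1 op_size 344 "
    "took time 2320541 usecs pending_size:2184176",
    "2023-09-06T19:15:32Z 3435898944168065 [14:0] BSD_RLIB_WARN: rlib_update_comp_info: "
    "qp wafl wr_id 312000000000000000 num_of_req 2 op_type 1 op_size 4160 "
    "took time 2313630 usecs pending_size:2180016",
]

for qp, size, seconds in slow_completions(sample):
    print(f"qp={qp} op_size={size} took {seconds:.3f} s")
```

Lowering or raising `threshold_s` changes only which entries are listed; the parsed values themselves come straight from the log.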