CONAP-400019: "NVRAM log not synchronized" SFO error when a single node reports "Send queue of QP WAFL is full"
Issue
- Both nodes in an HA pair occasionally report the "NVRAM log not synchronized" message. Example:
::> storage failover show -field reason
node reason
----------- ----------------------------
node_name-1 "NVRAM log not synchronized"
node_name-2 "NVRAM log not synchronized"
2 entries were displayed.
- The ONTAP Event Management System (EMS) reports:
Fri Feb 14 12:55:32 +0100 [node_name-1: wafl_exempt03: rdma.rlib.queue.full:notice]: Send queue of QP WAFL is full.
...
Fri Feb 14 12:55:33 +0100 [node_name-1: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of node_name-2 disabled (unsynchronized log).
Fri Feb 14 12:55:33 +0100 [node_name-1: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name-1 by node_name-2 disabled (unsynchronized log).
The condition clears after a few seconds:
Fri Feb 14 12:55:50 +0100 [node_name-1: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node_name-1 by node_name-2 enabled
- Equivalent messages on the partner node:
Fri Feb 14 12:55:34 +0100 [node_name-2: cf_main: cf.fsm.takeoverOfPartnerDisabled:error]: Failover monitor: takeover of node_name-1 disabled (unsynchronized log).
Fri Feb 14 12:55:34 +0100 [node_name-2: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name-2 by node_name-1 disabled (unsynchronized log).
...
Fri Feb 14 12:55:43 +0100 [node_name-2: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node_name-2 by node_name-1 enabled
Fri Feb 14 12:55:51 +0100 [node_name-2: cf_main: cf.fsm.takeoverOfPartnerEnabled:notice]: Failover monitor: takeover of node_name-1 enabled
- No errors or problems are found on the interconnect ports and connections.
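
To judge how long takeover stayed disabled during such an episode, the EMS excerpts above can be paired up per node. The following is a minimal sketch, not a NetApp tool: it assumes the log lines have exactly the shape shown in this article, and the names `parse_ems`, `takeover_disabled_windows`, and `SAMPLE` are hypothetical helpers introduced here for illustration. The log lines carry no year, so the `year` parameter is an arbitrary placeholder that only matters for building a valid timestamp.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpts in this article:
# "Fri Feb 14 12:55:33 +0100 [node: process: event.name:severity]: message"
EMS_LINE = re.compile(
    r"^\w{3} (?P<mon>\w{3}) +(?P<day>\d{1,2}) (?P<clock>\d{2}:\d{2}:\d{2}) "
    r"(?P<tz>[+-]\d{4}) "
    r"\[(?P<node>[^:\]]+): (?P<proc>[^:\]]+): (?P<event>[^:\]]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)


def parse_ems(line, year=2000):
    """Parse one EMS line into a dict, or return None if it does not match.

    The year is not present in the log line; the default is a placeholder.
    """
    m = EMS_LINE.match(line.strip())
    if m is None:
        return None
    stamp = f"{year} {m['mon']} {m['day']} {m['clock']} {m['tz']}"
    when = datetime.strptime(stamp, "%Y %b %d %H:%M:%S %z")
    return {"time": when, "node": m["node"], "event": m["event"], "msg": m["msg"]}


def takeover_disabled_windows(lines):
    """Pair each takeoverByPartnerDisabled event with the next
    takeoverByPartnerEnabled event reported by the same node.

    Returns a list of (node, seconds_disabled) tuples in completion order.
    """
    opened = {}   # node -> time of the first unmatched "disabled" event
    windows = []
    for line in lines:
        ev = parse_ems(line)
        if ev is None:
            continue  # skip "..." separators and anything unrecognized
        if ev["event"] == "cf.fsm.takeoverByPartnerDisabled":
            opened.setdefault(ev["node"], ev["time"])
        elif ev["event"] == "cf.fsm.takeoverByPartnerEnabled" and ev["node"] in opened:
            start = opened.pop(ev["node"])
            windows.append((ev["node"], (ev["time"] - start).total_seconds()))
    return windows


# Lines copied from the EMS excerpts in this article.
SAMPLE = [
    "Fri Feb 14 12:55:32 +0100 [node_name-1: wafl_exempt03: rdma.rlib.queue.full:notice]: Send queue of QP WAFL is full.",
    "...",
    "Fri Feb 14 12:55:33 +0100 [node_name-1: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name-1 by node_name-2 disabled (unsynchronized log).",
    "Fri Feb 14 12:55:50 +0100 [node_name-1: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node_name-1 by node_name-2 enabled",
    "Fri Feb 14 12:55:34 +0100 [node_name-2: cf_main: cf.fsm.takeoverByPartnerDisabled:error]: Failover monitor: takeover of node_name-2 by node_name-1 disabled (unsynchronized log).",
    "Fri Feb 14 12:55:43 +0100 [node_name-2: cf_main: cf.fsm.takeoverByPartnerEnabled:notice]: Failover monitor: takeover of node_name-2 by node_name-1 enabled",
]

for node, seconds in takeover_disabled_windows(SAMPLE):
    print(f"{node}: takeover by partner disabled for {seconds:.0f} s")
```

Pairing on the reporting node rather than on the message text keeps the sketch independent of the exact wording of the failover monitor messages. Lines that do not match the assumed shape are skipped instead of raising an error, so separator lines such as "..." can be passed through unchanged.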