Nodes go out of quorum during a cluster switch reboot because of improper cluster network cabling
Applies to
- ONTAP 9
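- Cluster network switches
- Connectivity, Liveliness and Availability Monitor (CLAM)
Issue
- Data traffic connectivity is briefly lost during a cluster switch upgrade or reboot.
- The node's data LIFs fail over to another node.
- The EMS log shows nodes going out of quorum: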
[Node-02: kltp: clam.node.ooq:EMERGENCY]: Node (name=Node-01, ID=1000) is out of "CLAM quorum" (reason=node in minority).
[Node-02: kltp: clam.node.ooq:EMERGENCY]: Node (name=Node-03, ID=1002) is out of "CLAM quorum" (reason=node in minority).
- Both cluster ports on the affected node go down at the same time:
[Node-01: vifmgr: vifmgr.portdown:notice]: A link down event was received on node Node-01, port e0a.
[Node-01: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.
[Node-01: vifmgr: vifmgr.portdown:notice]: A link down event was received on node Node-01, port e0b.
[Node-01: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node Node-01 has gone down unexpectedly.
- The node may hit a CLAM panic and reboot:
[Node-01: gop_eq_thread: sk.panic:alert]: Panic String: Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.) in SK process gop_eq_thread on release 9.10.1P6 (C)
- Cluster applications on the affected nodes go offline.
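
The symptoms above can be checked against a saved EMS log with a short script. The following is a minimal Python sketch, not a NetApp tool: the function name `summarize` and the `SAMPLE` list are hypothetical, the regular expressions assume the exact message layout quoted above (other ONTAP releases may word these events differently), and a node is flagged as soon as two distinct cluster ports report `vifmgr.clus.linkdown`, without comparing timestamps.

```python
import re
from collections import defaultdict

# Assumption: these patterns mirror the EMS lines quoted in this article.
LINKDOWN = re.compile(
    r"vifmgr\.clus\.linkdown:\w+\]: The cluster port (\S+) on node (\S+) has gone down"
)
OOQ = re.compile(
    r'clam\.node\.ooq:\w+\]: Node \(name=([^,]+), ID=\d+\) is out of "CLAM quorum"'
)


def summarize(lines):
    """Return (nodes with two or more cluster ports down, nodes reported out of quorum)."""
    ports_down = defaultdict(set)  # node name -> set of cluster ports that went down
    out_of_quorum = set()
    for line in lines:
        match = LINKDOWN.search(line)
        if match:
            port, node = match.groups()
            ports_down[node].add(port)
        match = OOQ.search(line)
        if match:
            out_of_quorum.add(match.group(1))
    # Simplification: two distinct ports down is treated as "all cluster links lost".
    isolated = {node for node, ports in ports_down.items() if len(ports) >= 2}
    return isolated, out_of_quorum


# Sample input copied from the EMS excerpts in this article.
SAMPLE = [
    '[Node-02: kltp: clam.node.ooq:EMERGENCY]: Node (name=Node-01, ID=1000) is out of "CLAM quorum" (reason=node in minority).',
    '[Node-02: kltp: clam.node.ooq:EMERGENCY]: Node (name=Node-03, ID=1002) is out of "CLAM quorum" (reason=node in minority).',
    "[Node-01: vifmgr: vifmgr.portdown:notice]: A link down event was received on node Node-01, port e0a.",
    "[Node-01: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0a on node Node-01 has gone down unexpectedly.",
    "[Node-01: vifmgr: vifmgr.portdown:notice]: A link down event was received on node Node-01, port e0b.",
    "[Node-01: vifmgr: vifmgr.clus.linkdown:EMERGENCY]: The cluster port e0b on node Node-01 has gone down unexpectedly.",
]

if __name__ == "__main__":
    isolated, ooq = summarize(SAMPLE)
    print("Nodes with both cluster ports down:", sorted(isolated))
    print("Nodes reported out of quorum:", sorted(ooq))
```

A node that appears in both result sets matches the pattern described here: it lost every cluster link at once and was then reported out of quorum by its peers.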