Node goes out of quorum due to cluster network disruption
Applies to
- ONTAP 9
- Broadcom BES-53248 switches purchased from NetApp
Issue
- A panic occurs on one of the nodes.
PANIC : Received PANIC packet from partner, receiving message is (Coredump and takeover initiated because Connectivity, Liveliness and Availability Monitor (CLAM) has determined this node is out of quorum.)
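- The cluster ports on the panicked node (FAS8200) are connected to the BES-53248 switch at 10G speed.
- The cluster ports on the FAS8700 nodes are connected to the BES-53248 switch at 100G speed.
- Numerous vifmgr.cluscheck.ctdpktloss alerts are observed before the panic: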
[?] Mon Aug 22 12:49:18 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus1 (node PERPIS-02) to cluster lif PERPIS-01_clus1 (node PERPIS-01).
[?] Mon Aug 22 12:49:40 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-13_clus1 (node PERPIS-10).
[?] Mon Aug 22 12:50:46 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-12_clus2 (node PERPIS-09).
[?] Mon Aug 22 12:51:08 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-11_clus1 (node PERPIS-08).
[?] Mon Aug 22 12:51:30 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-11_clus2 (node PERPIS-08).
[?] Mon Aug 22 12:51:52 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-10_clus1 (node PERPIS-07).
[?] Mon Aug 22 12:52:36 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-04_clus1 (node PERPIS-04).
[?] Mon Aug 22 12:53:42 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-03_clus2 (node PERPIS-03).
[?] Mon Aug 22 12:54:26 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-06_clus2 (node PERPIS-06).
[?] Mon Aug 22 12:54:49 +0800 [PERPIS-02: vifmgr: vifmgr.cluscheck.ctdpktloss:alert]: Continued packet loss when pinging from cluster lif PERPIS-02_clus2 (node PERPIS-02) to cluster lif PERPIS-05_clus1 (node PERPIS-05).
- The switch ports associated with the FAS8200 nodes show high OutDropPkts and InDropPkts counts:
Port OutOctets OutUcastPkts OutMcastPkts OutBcastPkts OutDropPkts Tx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1 135350867555820 30454105411 10640954 1174996 8160 0
0/2 1115535865614836 153716964951 10640974 1155040 244502 0
0/3 555253727068245 114123567719 10640914 1171842 19740 0
0/4 110609817174084 27294584348 10640974 1175446 7 0
0/5 324792975891383 52311649051 10640974 1175338 14560 0
0/6 4499554844182 7764219226 10640974 1176497 0 0
Port InOctets InUcastPkts InMcastPkts InBcastPkts InDropPkts Rx Error
--------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
0/1 184812820859850 31934228208 760131 26093 0 0
0/2 236905412197034 81900354370 760111 46049 1 0
0/3 828708576199875 126752545891 760107 29233 167068 0
0/4 259526455231231 38692873350 760111 25643 124437 0
0/5 200243275302091 40098210067 760110 25751 0 0
0/6 82902559324186 12919563869 760110 24592 0 0
- The default QoS configuration is applied on the BES-53248 switch through RCF 1.6.
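
The two symptoms listed above (repeated vifmgr.cluscheck.ctdpktloss alerts and nonzero drop counters on the switch ports) can be tallied from saved output with a short script. The following is a minimal sketch, not a NetApp or Broadcom tool: the function names are hypothetical, and the regular expression and column layout are assumptions derived only from the sample lines shown in this article.

```python
import re
from collections import Counter

# Assumption: alert lines follow the format of the samples in this article.
ALERT_RE = re.compile(
    r"vifmgr\.cluscheck\.ctdpktloss:alert\]: Continued packet loss when pinging "
    r"from cluster lif (\S+) \(node (\S+)\) to cluster lif (\S+) \(node (\S+)\)"
)


def count_alerts_by_source_lif(log_text):
    """Return a Counter of ctdpktloss alerts keyed by the source cluster LIF."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ALERT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def ports_with_drops(table_text, drop_column):
    """Return {port: drops} for rows whose drop counter is nonzero.

    drop_column is the header name to inspect, for example "OutDropPkts"
    or "InDropPkts". Tables whose header lacks that column are skipped.
    """
    result = {}
    col = None
    for line in table_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Port":
            # Header row: remember where the requested column sits.
            # "Tx Error" / "Rx Error" split into two tokens, but they come
            # after the drop column, so its index is unaffected.
            col = fields.index(drop_column) if drop_column in fields else None
            continue
        if col is None or set(fields[0]) <= {"-"}:
            # No matching header seen yet, or this is the dashed separator row.
            continue
        if len(fields) > col and fields[col].isdigit():
            drops = int(fields[col])
            if drops > 0:
                result[fields[0]] = drops
    return result
```

Passing the saved EMS log text to count_alerts_by_source_lif shows which cluster LIF reports the most loss, and passing the saved switch counter output to ports_with_drops with "OutDropPkts" or "InDropPkts" lists the ports worth comparing against the FAS8200 connections.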