Cluster network degraded alerts and takeover not possible on AFF A800
Applies to
- A800
- X1146A T62100-CR
Issue
- Daily "Cluster Network Degraded" alerts are received
[cluster-01: vifmgr: callhome.clus.net.degraded:alert]: Call home for CLUSTER NETWORK DEGRADED: Total Packet Loss - Ping failures detected between cluster-01_clus2 ( 169.254.32.8 ) on cluster-01 and cluster-02_clus1 ( 169.254.99.167 ) on cluster-02
- The cluster also raises alerts that takeover is disabled because the NVRAM logs are unsynchronized
[cluster-01: statd: cf.takeover.disabled:alert]: HA mode, but takeover of partner is disabled due to reason : unsynchronized log.
- The EMS log shows the following messages
[cluster-01: nvmm_mirror_sync: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_LAYOUT_SYNCING is aborted because of reason NVPM_ERR_MSG_SEND_FAILED.
[cluster-01: nvmm_error: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_OFFLINE is aborted because of reason NVMM_ABORT_SYNCING_MIRROR.
[cluster-01: nvmm_helper: nvpm.state.changed:debug]: Node 1's NVPM state changed from "2" to "2".
- These alerts start firing after the following message is logged
[cluster-01: intr: netif.fatal.err:alert]: The network device in slot 1 encountered fatal error e1a/e1b.
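
The ordering described above (a netif.fatal.err event first, then the degraded-network, takeover-disabled and mirror-aborting events) can be checked against a saved log with a short script. The following is a minimal sketch, assuming the log lines use the bracketed `[node: process: event:severity]: message` layout shown in this article; real EMS output may carry timestamps or other columns and would need a different pattern. The names `parse_ems` and `symptoms_after_trigger` are hypothetical helpers, not part of any NetApp tool.

```python
import re

# Matches the bracketed layout used in this article (an assumption about the input):
# [node: process: event.name:severity]: message
EMS_RE = re.compile(
    r"^\[(?P<node>[^:\]]+):\s*(?P<process>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)$"
)

# Event names taken from the messages quoted in this article.
TRIGGER = "netif.fatal.err"
SYMPTOMS = {
    "callhome.clus.net.degraded",
    "cf.takeover.disabled",
    "nvmm.mirror.aborting",
}


def parse_ems(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = EMS_RE.match(line.strip())
    return match.groupdict() if match else None


def symptoms_after_trigger(lines):
    """List symptom event names that appear after the first trigger event."""
    seen_trigger = False
    found = []
    for line in lines:
        record = parse_ems(line)
        if record is None:
            continue  # skip lines that are not in the expected layout
        if record["event"] == TRIGGER:
            seen_trigger = True
        elif seen_trigger and record["event"] in SYMPTOMS:
            found.append(record["event"])
    return found
```

Passing the lines of a saved log file to `symptoms_after_trigger` returns an empty list when no symptom event follows a netif.fatal.err entry, which suggests the log does not match the pattern described in this article.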