MetroCluster IP: multiple disks failed on the remote site
Applies to
- ONTAP 9
- MetroCluster
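Issue
- Flow control has been disabled on the MetroCluster IP ports of the cluster switches.
- Multi-disk failure event: an HA Group Notification from Cluster1-1a (FILESYSTEM DISK NOT RESPONDING) ERROR is reported.
- The following errors are seen in the cluster: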
The NV mirror goes offline a few seconds before the cluster network degraded alert:
Mon Sep 11 15:03:37 +1000 [Cluster1-1a: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'HA_PARTNER'}
Mon Sep 11 15:03:37 +1000 [Cluster1-1a: nvmm_error: nvmm.mirror.offlined:debug]: params: {'mirror': 'DR_PARTNER'}
Mon Sep 11 15:03:45 +1000 [Cluster1-1a: vifmgr: vifmgr.port.monitor.failed:debug]: The "link_flapping" health check for port e0c (node Cluster1-1a) has failed. The port is operating in a degraded state.
Mon Sep 11 15:03:45 +1000 [Cluster1-1a: vifmgr: callhome.clus.net.degraded:debug]: Call home for CLUSTER NETWORK DEGRADED: Frequent Link Flapping - Cluster port e0c on node Cluster1-1a has experienced multiple link down notification
The NV mirror state changes back to online after some time:
Mon Sep 11 15:15:44 +1000 [Cluster1-1a: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 2, partner_type DR PARTNER, changed state from NVMM_MIRROR_SYNCING_OTHER to NVMM_MIRROR_ONLINE and took 1684 msecs.
Mon Sep 11 15:17:09 +1000 [Cluster1-1a: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 2, partner_type DR PARTNER, changed state from NVMM_MIRROR_SYNCING_OTHER to NVMM_MIRROR_ONLINE and took 1605 msecs.
Mon Sep 11 15:12:53 +1000 [Cluster1-1b: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 2, partner_type DR PARTNER, changed state from NVMM_MIRROR_SYNCING_OTHER to NVMM_MIRROR_ONLINE and took 1540 msecs.
Mon Sep 11 15:12:55 +1000 [Cluster1-1b: nvmm_mirror_sync: nvmm.mirror.state.change:debug]: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_SYNCING_OTHER to NVMM_MIRROR_ONLINE and took 1545 msecs
- Some or all of the remote mirrored plexes are offline, and their drives are marked as failed.
Plex /Cluster1-1a_ssd_aggr1/plex1 (offline, failed, inactive, pool1)
RAID group /Cluster1-1a_ssd_aggr1/plex1/rg0 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 3630753/ -
parity FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
data FAILED N/A 3630753/ -
Raid group is missing 11 disks.
Plex /Cluster1-1a_root/plex12 (offline, failed, inactive, pool1)
RAID group /Cluster1-1a_root/plex12/rg0 (partial)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity FAILED N/A 63849/ -
parity FAILED N/A 63849/ -
data FAILED N/A 63849/ -
data FAILED N/A 63849/ -
data FAILED N/A 63849/ -
Raid group is missing 5 disks.
Site A: Cluster2
Nodes:
Cluster2-1a - no issues
Cluster2-1b - no issues
Site B: Cluster1
Nodes:
Cluster1-1a ---> all remote disks are failed/missing
Cluster1-1b ---> no issues
- There are no underlying hardware issues with the storage or the switches.
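The timing relationship in the EMS messages above (NV mirror offline first, cluster network degraded alert shortly after) can be checked with a short script when reviewing a larger log. The following is a minimal sketch, not a NetApp tool: the function names are hypothetical, the regular expression is inferred only from the log lines shown in this article, and the year 2023 is an assumption because the EMS lines do not include a year.

```python
import re
from datetime import datetime

# Pattern inferred from the EMS lines quoted in this article (an assumption,
# not an official format definition).
EMS_LINE = re.compile(
    r"\w{3} (?P<mon>\w{3}) +(?P<day>\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<tz>[+-]\d{4}) "
    r"\[(?P<node>[^:\]]+): (?P<source>[^:\]]+): (?P<event>[\w.]+):(?P<severity>\w+)\]: "
    r"(?P<message>.*)"
)


def parse_ems_line(line, year):
    """Return a dict for one EMS line, or None if the line does not match."""
    m = EMS_LINE.search(line)
    if m is None:
        return None
    # The log omits the year, so the caller has to supply it.
    stamp = f"{year} {m['mon']} {m['day']} {m['time']} {m['tz']}"
    return {
        "time": datetime.strptime(stamp, "%Y %b %d %H:%M:%S %z"),
        "node": m["node"],
        "event": m["event"],
        "message": m["message"],
    }


def first_event_time(records, node, event):
    """Earliest timestamp of a given event on a given node, or None."""
    times = [r["time"] for r in records if r["node"] == node and r["event"] == event]
    return min(times) if times else None


def offline_to_alert_gap(lines, node, year):
    """Seconds from the first NV mirror offline event to the first
    cluster-network-degraded call home on the same node."""
    records = [r for r in (parse_ems_line(line, year) for line in lines) if r]
    offline = first_event_time(records, node, "nvmm.mirror.offlined")
    alert = first_event_time(records, node, "callhome.clus.net.degraded")
    if offline is None or alert is None:
        return None
    return (alert - offline).total_seconds()


# Lines copied from the EMS output in this article.
SAMPLE = [
    "Mon Sep 11 15:03:37 +1000 [Cluster1-1a: nvmm_error: nvmm.mirror.offlined:debug]: "
    "params: {'mirror': 'HA_PARTNER'}",
    "Mon Sep 11 15:03:37 +1000 [Cluster1-1a: nvmm_error: nvmm.mirror.offlined:debug]: "
    "params: {'mirror': 'DR_PARTNER'}",
    "Mon Sep 11 15:03:45 +1000 [Cluster1-1a: vifmgr: callhome.clus.net.degraded:debug]: "
    "Call home for CLUSTER NETWORK DEGRADED: Frequent Link Flapping - Cluster port e0c "
    "on node Cluster1-1a has experienced multiple link down notification",
]

if __name__ == "__main__":
    # 2023 is assumed; adjust to the year the log was collected.
    print(offline_to_alert_gap(SAMPLE, "Cluster1-1a", 2023))
```

For the sample lines, the gap between 15:03:37 and 15:03:45 is 8 seconds, which matches the "a few seconds before" observation. Lines that do not match the inferred pattern are skipped rather than raising an error, so pasted CLI output mixed into the log does not break the check.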