SyncMirror Plex Failed - AutoSupport Message
Applies to
- MetroCluster
- ONTAP 9
- SyncMirror
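Event Summary
The AutoSupport message SYNCMIRROR PLEX FAILED indicates that a SyncMirror plex has failed and that the SyncMirror relationship is in a degraded state.
Validation
Determine which aggregates report a failed plex: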
storage aggregate show
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_siteA_02
           953.8GB   46.22GB   95% online       1 siteA-02         raid_dp,
                                                                   mirrored,
                                                                   normal
aggr1_siteA_02
            2.79TB    2.78TB    0% online       2 siteA-02         raid_dp,
                                                                   mirror
                                                                   degraded
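When many aggregates are listed, the check above can be scripted. The following is a minimal Python sketch, not a NetApp tool: it assumes the output of storage aggregate show has been captured as text and that it uses the wrapped layout shown above (aggregate name on its own line, RAID status continued over several lines). The function name degraded_aggregates and the variable SAMPLE_OUTPUT are hypothetical names chosen for illustration.

```python
# Sketch only: assumes the wrapped "storage aggregate show" layout shown above.
SAMPLE_OUTPUT = """\
Aggregate Size Available Used% State #Vols Nodes RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_siteA_02
953.8GB 46.22GB 95% online 1 siteA-02 raid_dp,
mirrored,
normal
aggr1_siteA_02
2.79TB 2.78TB 0% online 2 siteA-02 raid_dp,
mirror
degraded
"""


def degraded_aggregates(output: str) -> list[str]:
    """Return names of aggregates whose RAID status contains 'mirror degraded'."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    status_parts: dict[str, list[str]] = {}
    current = None
    for i, tokens in enumerate(rows):
        # Skip the column header and the dashed separator line.
        if tokens[0] == "Aggregate" or set(tokens[0]) == {"-"}:
            continue
        following = rows[i + 1] if i + 1 < len(rows) else []
        if len(tokens) == 1 and len(following) >= 7:
            # A lone token followed by a full data row is an aggregate name.
            current = tokens[0]
            status_parts[current] = []
        elif current is not None:
            if len(tokens) >= 7:
                # Data row: the last field is the first RAID status word.
                status_parts[current].append(tokens[-1])
            else:
                # Continuation line of the wrapped RAID status column.
                status_parts[current].extend(tokens)
    degraded = []
    for name, parts in status_parts.items():
        status = " ".join(part.rstrip(",") for part in parts)
        if "mirror degraded" in status:
            degraded.append(name)
    return degraded


if __name__ == "__main__":
    print(degraded_aggregates(SAMPLE_OUTPUT))
```

The parser splits on whitespace, so it does not depend on exact column alignment, but it would need adjusting if short aggregate names appear on the same line as their data.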
Solution
- Identify the aggregate and the failed plex:
siteA::>storage aggregate show
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0_siteA_02
           953.8GB   46.22GB   95% online       1 siteA-02         raid_dp,
                                                                   mirrored,
                                                                   normal
aggr1_siteA_02
            2.79TB    2.78TB    0% online       2 siteA-02         raid_dp,
                                                                   mirror
                                                                   degraded
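A RAID status of mirror degraded indicates that a plex is not operational.
- Determine the cause of the plex failure:
- Disk
- Disk shelf
- Switch ISL failure
- Site failure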
SiteA::>storage aggregate show-status -aggregate aggr2_siteA_02
Owner Node: SiteA-02
 Aggregate: aggr2_siteA_02 (online, raid_dp, mirror degraded) (block checksums)
  Plex: /aggr2_siteA_02/plex0 (online, normal, active, pool0)
   RAID Group /aggr2_siteA_02/plex0/rg0 (normal, block checksums)
                                                              Usable Physical
     Position Disk                        Pool Type     RPM     Size     Size Status
     -------- --------------------------- ---- ----- ------ -------- -------- ----------
     dparity  3.11.4                      0    SAS    10000   1.09TB   1.09TB (normal)
     parity   3.11.5                      0    SAS    10000   1.09TB   1.09TB (normal)
     data     3.11.6                      0    SAS    10000   1.09TB   1.09TB (normal)
     data     3.11.18                     0    SAS    10000   1.09TB   1.09TB (normal)
     data     3.11.19                     0    SAS    10000   1.09TB   1.09TB (normal)
  Plex: /aggr2_siteA_02/plex1 (offline, failed, inactive, pool1)
   RAID Group /aggr2_siteA_02/plex1/rg0 (partial, none checksums)
                                                              Usable Physical
     Position Disk                        Pool Type     RPM     Size     Size Status
     -------- --------------------------- ---- ----- ------ -------- -------- ----------
     dparity  FAILED                      -    -          -   1.09TB       0B (failed)
     parity   FAILED                      -    -          -   1.09TB       0B (failed)
     data     FAILED                      -    -          -   1.09TB       0B (failed)
     data     FAILED                      -    -          -   1.09TB       0B (failed)
     data     FAILED                      -    -          -   1.09TB       0B (failed)
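To pick out the failed plex from a long show-status listing, the Plex: lines can be filtered programmatically. The following is a minimal Python sketch under stated assumptions: the output has been saved as text, and each plex appears on a line of the form "Plex: <path> (<state>, ...)" as in the listing above. The names failed_plexes, PLEX_LINE, and SAMPLE_STATUS are hypothetical and used only for illustration.

```python
import re

# Sketch only: matches plex lines in the form shown in the listing above,
# e.g. "Plex: /aggr2_siteA_02/plex1 (offline, failed, inactive, pool1)".
PLEX_LINE = re.compile(r"^\s*Plex:\s+(\S+)\s+\(([^)]*)\)")

SAMPLE_STATUS = """\
Owner Node: SiteA-02
Aggregate: aggr2_siteA_02 (online, raid_dp, mirror degraded) (block checksums)
Plex: /aggr2_siteA_02/plex0 (online, normal, active, pool0)
RAID Group /aggr2_siteA_02/plex0/rg0 (normal, block checksums)
Plex: /aggr2_siteA_02/plex1 (offline, failed, inactive, pool1)
RAID Group /aggr2_siteA_02/plex1/rg0 (partial, none checksums)
"""


def failed_plexes(output: str) -> list[str]:
    """Return the paths of plexes reported as failed or offline."""
    failed = []
    for line in output.splitlines():
        match = PLEX_LINE.match(line)
        if match is None:
            continue  # not a plex line (aggregate, RAID group, or disk row)
        path = match.group(1)
        states = [state.strip() for state in match.group(2).split(",")]
        if "failed" in states or "offline" in states:
            failed.append(path)
    return failed


if __name__ == "__main__":
    print(failed_plexes(SAMPLE_STATUS))
```

The last path component of each result (plex1 in the sample) is the plex name that a later plex delete command would take.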
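Resolving the cause may return the disks to normal operation and bring the plex back to an operational state. If the plex becomes operational again, resynchronization should start automatically and no further action is required. You can monitor the resynchronization with:
SiteA::>storage aggregate plex show
- If the cause cannot be corrected (for example, a true disk hardware failure, a site failure, a power surge, or a similar fault) and not enough disks can be returned to service to bring the plex online, the plex cannot be repaired and must be destroyed and recreated.
Note: Destroying and recreating a plex requires a full mirror baseline. Make sure the pool has enough spares to recreate the mirror.
To destroy and recreate the mirror, run the following commands: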
storage aggregate plex delete -aggregate <aggr_name> -plex <degraded_plex_name>
storage aggregate mirror -aggregate <aggr_name>
Additional Information
If you need assistance with plex troubleshooting or any other help, contact NetApp Technical Support.