AWS or GCP CVO rebooted due to multiple missing disks

Applies to

Cloud Volumes ONTAP (CVO)
Amazon Web Services (AWS)
Google Cloud Platform (GCP)

Issue

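An AWS / GCP CVO node reboots after the surviving HA partner sends the AutoSupport message HA Group Notification (MULTIPLE DISKS MISSING) ERROR or HA Group Notification (FILESYSTEM DISK MISSING) ERROR.
One node in the CVO HA cluster unexpectedly takes over its partner node.
The EMS logs on the surviving node show that it has lost access to the mirrored Pool1 disks attached to the failed node: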

Mon Jun 03 16:23:02 +0000 [CVO-01: monitor: monitor.globalStatus.critical:EMERGENCY]: This node has taken over CVO-02. One or more mirrored aggregates are degraded.

Mon Jun 03 16:22:35 +0000 [CVO-01: dmgr_thread: raid.disk.missing:info]: Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] is missing from the system
Mon Jun 03 16:22:35 +0000 [CVO-01: config_thread: raid.config.filesystem.disk.missing:info]: File system Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] is missing.

Note: The errors above are reported for every disk owned by the impacted node, CVO-02.
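To see which disks the surviving node reported as missing, the EMS output can be summarized with a short script. The following is a minimal sketch, assuming one EMS event per line in the format of the samples above. The names `EMS_LINE`, `DISK_PATH`, and `missing_disks` are hypothetical helpers for illustration, not part of ONTAP, and the regular expression is inferred only from these sample lines.

```python
import re
from collections import defaultdict

# Layout inferred from the sample lines in this article (an assumption):
#   <timestamp> [<node>: <process>: <event.name>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]: (?P<message>.*)$"
)
DISK_PATH = re.compile(r"Disk (?P<path>/\S+)")

SAMPLE_EMS = [
    "Mon Jun 03 16:23:02 +0000 [CVO-01: monitor: monitor.globalStatus.critical:EMERGENCY]: "
    "This node has taken over CVO-02. One or more mirrored aggregates are degraded.",
    "Mon Jun 03 16:22:35 +0000 [CVO-01: dmgr_thread: raid.disk.missing:info]: "
    "Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] UID [00000000V9NeubcHXfRG] "
    "is missing from the system",
    "Mon Jun 03 16:22:35 +0000 [CVO-01: config_thread: raid.config.filesystem.disk.missing:info]: "
    "File system Disk /aggr1/plex1/rg0/0d.10 S/N [00000000V9NeubcHXfRG] "
    "UID [00000000V9NeubcHXfRG] is missing.",
]


def missing_disks(lines):
    """Map each disk path to the set of EMS event names that reported it missing."""
    found = defaultdict(set)
    for line in lines:
        match = EMS_LINE.match(line.strip())
        # Skip lines that are not EMS events or are not *.disk.missing events.
        if not match or not match["event"].endswith("disk.missing"):
            continue
        disk = DISK_PATH.search(match["message"])
        if disk:
            found[disk["path"]].add(match["event"])
    return dict(found)


if __name__ == "__main__":
    for path, events in missing_disks(SAMPLE_EMS).items():
        print(path, sorted(events))
```

If every disk listed belongs to the partner node's Pool1 plex, that is consistent with the symptom described in this article.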

The storage failover show output reports Previous giveback failed in module: raid, as shown below:

::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
CVO-01         CVO-02         false    Previous giveback failed in module: raid
CVO-02         CVO-01         -        Waiting for giveback
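When this check has to be repeated across several HA pairs, captured storage failover show output can be scanned for the failed-giveback state. The following is a rough sketch, assuming the output was saved in the CLI's usual multi-line layout (one node per row after the dashed separator, with indented lines continuing a wrapped State Description). `parse_failover_show` and `giveback_failed` are hypothetical names used only for this illustration.

```python
SAMPLE_OUTPUT = """\
::> storage failover show
                              Takeover
Node           Partner        Possible State Description
-------------- -------------- -------- -------------------------------------
CVO-01         CVO-02         false    Previous giveback failed in module: raid
CVO-02         CVO-01         -        Waiting for giveback
"""


def parse_failover_show(text):
    """Return one dict per node row found after the dashed separator line."""
    rows = []
    in_body = False
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("---"):
            in_body = True
            continue
        if not in_body or not line:
            continue
        if raw[:1].isspace() and rows:
            # Assumed: an indented line continues the previous State Description.
            rows[-1]["state"] += " " + line.strip()
            continue
        parts = line.split(None, 3)
        if len(parts) == 4:
            node, partner, possible, state = parts
            rows.append({
                "node": node,
                "partner": partner,
                "takeover_possible": possible,
                "state": state,
            })
    return rows


def giveback_failed(rows):
    """Select rows whose state reports a failed giveback."""
    return [row for row in rows if "giveback failed" in row["state"].lower()]


if __name__ == "__main__":
    for row in giveback_failed(parse_failover_show(SAMPLE_OUTPUT)):
        print(f'{row["node"]}: {row["state"]}')
```

The parser splits on whitespace rather than fixed column offsets, so it tolerates different node-name lengths but would misread a node name that contains spaces.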

EMS logs (the following errors may repeat until the RAID resync completes):

Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: gb.cfo.abort.raid.fm:error]: Aggregate local:aggr8 is being resynced; canceling giveback.
Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.rsrc.givebackVeto:alert]: Failover monitor: raid: giveback canceled due to active state.
Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.fsm.autoGivebackVetoed:error]: Failover monitor: Automatic giveback has been deferred due to long running operations
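Because these messages can repeat many times, it may help to reduce them to the aggregates that are actually blocking giveback. The following is a small sketch, assuming one EMS event per line and the gb.cfo.abort.raid.fm message wording shown above. `RESYNC` and `resyncing_aggregates` are hypothetical names, and the pattern is derived only from the sample message.

```python
import re
from collections import Counter

# Matches the gb.cfo.abort.raid.fm message text as it appears in this article.
RESYNC = re.compile(
    r"gb\.cfo\.abort\.raid\.fm:\w+\]: Aggregate (?P<aggr>\S+) is being resynced"
)

SAMPLE_EMS = [
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: gb.cfo.abort.raid.fm:error]: "
    "Aggregate local:aggr8 is being resynced; canceling giveback.",
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.rsrc.givebackVeto:alert]: "
    "Failover monitor: raid: giveback canceled due to active state.",
    "Sat Jul 19 04:15:20 +0000 [CVO-01: cf_main: cf.fsm.autoGivebackVetoed:error]: "
    "Failover monitor: Automatic giveback has been deferred due to long running operations",
]


def resyncing_aggregates(lines):
    """Count giveback aborts per aggregate that was still resyncing."""
    return Counter(
        match["aggr"] for line in lines if (match := RESYNC.search(line))
    )


if __name__ == "__main__":
    for aggr, aborts in resyncing_aggregates(SAMPLE_EMS).items():
        print(f"{aggr}: giveback aborted {aborts} time(s) while resyncing")
```

An aggregate that stops appearing in newer log entries has presumably finished resyncing, which is the point at which giveback is no longer vetoed for that reason.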

Shortly after this event, the following AutoSupport alerts may be generated as residual symptoms of the missing disks:

HA Group Notification (SYNCMIRROR PLEX FAILED) ALERT

NODEOQ: HA Group Notification from CVO-02 (NODE(S) OUT OF CLUSTER QUORUM) EMERGENCY

After the node reboots, it re-establishes connectivity to the provisioned AWS / GCP disks and the giveback completes successfully.