Takeover is not possible and an NVMM_ERR_IO error is reported
Applies to
- FAS8300
- ONTAP 9
Issue
- The EMS log reports the following errors:
Cluster::*> event log show
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
12/3/2021 20:26:37 Cluster-02
DEBUG nvmm.mirror.aborting: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_SYNCING_START is aborted because of reason NVMM_ERR_IO.
12/3/2021 20:26:34 Cluster-01
DEBUG rdma.rlib.connected: raid:HA:P QP is now connected.
12/3/2021 20:26:34 Cluster-02
DEBUG rdma.rlib.connected: raid:HA:A QP is now connected.
12/3/2021 20:26:30 Cluster-02
DEBUG mirror.stream.qp.error: mirror="HA Partner", qp_name="WAFL", error="NVMM_ERR_POLL_TIMEOUT"
12/3/2021 20:25:22 Cluster-02
DEBUG nvmm.mirror.state.change: mirror of sysid 1, partner_type HA Partner, changed state from NVMM_MIRROR_LAYOUT_SYNCING to NVMM_MIRROR_LAYOUT_SYNCED and took 1 msecs.
- Takeover cannot be performed because the NVRAM log is not synchronized.
Cluster::> storage failover show
Takeover
Node Partner Possible State Description
-------------- -------------- -------- -------------------------------------
Cluster-01
Cluster-02 false Connected to Cluster-02,
Takeover is not possible: NVRAM log
not synchronized
Cluster-02
Cluster-01 true Connected to Cluster-01
2 entries were displayed.
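- A check of the HA interconnect cabling shows an incorrect status: the speed of ports e0a/e0b is incorrect. In the output below, e0a has negotiated 25G but e0b has negotiated only 10G.
Both ports should be 25GbE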
slot 0: 10G/25G Ethernet Controller CX5
e0a MAC Address: d0:39:ea:33:f8:51 (auto-25g_cr-fd-up)
SFP Vendor: Amphenol
SFP Part Number: NDCCGF-N103
SFP Serial Number: APF21149368814
e0b MAC Address: d0:39:ea:33:f8:52 (auto-10g_cr-fd-up)
SFP Vendor: Amphenol
SFP Part Number: NDCCGF-N103
SFP Serial Number: APF21149368913
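
The port-speed check above can be sketched as a small script. This is only an illustrative sketch, not a NetApp tool: the function name `find_mismatched_ports` is hypothetical, and it assumes the port lines look exactly like the listing above, with the negotiated speed embedded in the parenthesized status string (for example `auto-25g_cr-fd-up`).

```python
import re

# Matches port lines such as:
#   e0a MAC Address: d0:39:ea:33:f8:51 (auto-25g_cr-fd-up)
# Assumption: the format is the same as in the listing shown in this article.
PORT_LINE = re.compile(r"^\s*(e\d+[a-z])\s+MAC Address:\s+\S+\s+\(([^)]*)\)")
# Pulls the numeric speed out of a status string like "auto-25g_cr-fd-up".
SPEED = re.compile(r"(\d+)g", re.IGNORECASE)


def find_mismatched_ports(output, expected_gbps=25):
    """Return (port, speed_gbps) pairs whose negotiated speed differs from expected_gbps."""
    mismatched = []
    for line in output.splitlines():
        port_match = PORT_LINE.match(line)
        if not port_match:
            continue  # not a port line (slot header, SFP details, ...)
        port, status = port_match.groups()
        speed_match = SPEED.search(status)
        if speed_match is None:
            continue  # no speed in the status string; skip rather than guess
        speed = int(speed_match.group(1))
        if speed != expected_gbps:
            mismatched.append((port, speed))
    return mismatched


# Sample taken from the output shown in this article.
SAMPLE = """\
slot 0: 10G/25G Ethernet Controller CX5
e0a MAC Address: d0:39:ea:33:f8:51 (auto-25g_cr-fd-up)
SFP Vendor: Amphenol
e0b MAC Address: d0:39:ea:33:f8:52 (auto-10g_cr-fd-up)
SFP Vendor: Amphenol
"""

print(find_mismatched_ports(SAMPLE))  # [('e0b', 10)]
```

With the sample above, only e0b is reported, which matches the symptom described in this article: one HA interconnect port linked at 10G instead of 25G.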