Node unresponsive: "Node is accessible over HA-IC, but cluster access fails"
Applies to
- ONTAP 9.9.1
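- SnapMirror Business Continuity (SM-BC)
Issue
- One node in a two-node cluster stops responding and is not serving data
- Cluster commands complete, but cannot return the correct state of the aggregates on the affected node: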
::> aggr show
Info: Failed to get the information for aggregate aggr1. Reason: ZSM - failed, status code = 572, extra = RPC: Unable to receive [from mgwd on node "node1" (VSID -1) to kernel at 127.0.0.1], took 0.001s, max 110s [127.0.0.1:000].
::> lun show
This table is currently empty.
Warning: the LUN inventory is not available for the following volumes:
Volume "vol1" in Vserver "svm1". Reason: RPC: Unable to receive [from mgwd on node "node1" (VSID: -1) to kernel at 127.0.0.1].
- Running nodeshell commands against the affected node causes the console to stop responding
- The node panics:
Node node1 is not responding and the below panic was found:
Panic String: Shutdown taking longer than 930 seconds in process nodewatchdog on release 9.9.1 (C)
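
When checking saved console or session output against the symptoms above, a small script can flag the two signatures shown here: the "RPC: Unable to receive" error from mgwd and the panic string. The following is a minimal sketch, not a NetApp tool. The function name find_symptoms and the returned keys are hypothetical, and the patterns are copied from the example output in this article, so the exact wording on other releases may differ.

```python
import re

# Patterns taken from the example output in this article (assumption:
# other ONTAP releases may word these messages differently).
RPC_FAILURE = re.compile(r'RPC: Unable to receive \[from mgwd on node "([^"]+)"')
PANIC_STRING = re.compile(r'Panic String: (.+)')


def find_symptoms(text):
    """Return the nodes named in mgwd RPC failures and any panic strings found."""
    # Deduplicate node names, since the same node is usually reported many times.
    nodes = sorted(set(RPC_FAILURE.findall(text)))
    # Each match runs to the end of its line; strip trailing whitespace.
    panics = [p.strip() for p in PANIC_STRING.findall(text)]
    return {"rpc_failure_nodes": nodes, "panic_strings": panics}
```

Passing the captured text of "aggr show", "lun show", and the panic message to find_symptoms returns the affected node names and the panic strings, which can then be compared with the ones listed in this article.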