CONAP-229226: Slow response to API calls on the takeover/surviving node when the partner node is stuck and cannot be rebooted
Issue
- One node (Node 1) took over the other node (Node 2)
- Node 2 was not accessible through SSH or REST API calls, although Node 1 still saw it as up
- Node 1, the takeover/surviving node, continued to serve user data
- Node 1 was slow to respond to SSH and REST API calls, resulting in health check timeouts/failures reported by the FSx Control Plane
- Node 2 had to be restarted or sent an NMI from the AWS Console or CLI to recover