Volumes go offline because a slice service is unresponsive
Applies to
Element OS 12.5
Issue
- The cluster is a 4-node H610S SolidFire cluster with a large number of volumes
- Thousands of API calls are run against the cluster every day
- A volumesOffline error is reported occasionally, and unresponsiveService warnings are raised shortly before and after the volumesOffline error.
1 2023-03-04T17:33:00.206Z Warning service 1 cluster_name 1 Yes 2023-03-04T17:41:47.810Z unresponsiveService A metadata service is not responding.
2 2023-03-04T17:37:22.127Z Error service 1 cluster_name 1 Yes 2023-03-04T17:39:40.148Z sliceServiceUnhealthy A metadata service is unhealthy and SolidFire is attempting to migrate data away from it.
3 2023-03-04T17:38:55.966Z Error cluster 0 0 Yes 2023-03-04T17:44:40.299Z volumesOffline The following volumes are offline. [xxxxx, xxxxx, xxxxx]
4 2023-03-04T17:38:57.584Z Warning service 2 cluster_name 25 Yes 2023-03-04T17:44:40.300Z unresponsiveService A metadata service is not responding.
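The timing relationship in the fault list above can be checked programmatically. The following is a minimal Python sketch, not a NetApp tool: it assumes each fault line has the layout seen in the sample (leading number, raised timestamp, severity, type, location fields, resolved flag, resolution timestamp, fault code, details), and the names `FAULT_RE`, `parse_fault`, and `overlapping` are hypothetical. It reports which unresponsiveService warnings were active at the same time as a volumesOffline error.

```python
import re
from datetime import datetime

# Field layout is inferred from the sample lines only (an assumption).
FAULT_RE = re.compile(
    r"^(?P<num>\d+)\s+(?P<raised>\S+Z)\s+(?P<severity>\w+)\s+(?P<type>\w+)\s+"
    r"(?P<location>.*?)\s+(?P<resolved>Yes|No)\s+(?P<cleared>\S+Z)\s+"
    r"(?P<code>\w+)\s+(?P<details>.*)$"
)

SAMPLE = """\
1 2023-03-04T17:33:00.206Z Warning service 1 cluster_name 1 Yes 2023-03-04T17:41:47.810Z unresponsiveService A metadata service is not responding.
2 2023-03-04T17:37:22.127Z Error service 1 cluster_name 1 Yes 2023-03-04T17:39:40.148Z sliceServiceUnhealthy A metadata service is unhealthy and SolidFire is attempting to migrate data away from it.
3 2023-03-04T17:38:55.966Z Error cluster 0 0 Yes 2023-03-04T17:44:40.299Z volumesOffline The following volumes are offline. [xxxxx, xxxxx, xxxxx]
4 2023-03-04T17:38:57.584Z Warning service 2 cluster_name 25 Yes 2023-03-04T17:44:40.300Z unresponsiveService A metadata service is not responding.
"""


def parse_ts(text):
    """Parse a UTC timestamp such as 2023-03-04T17:33:00.206Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")


def parse_fault(line):
    """Return a dict for one fault line, or None if the line does not match."""
    match = FAULT_RE.match(line.strip())
    if match is None:
        return None
    fault = match.groupdict()
    fault["raised"] = parse_ts(fault["raised"])
    fault["cleared"] = parse_ts(fault["cleared"])
    return fault


def overlapping(faults, code, other_code):
    """Leading numbers of other_code faults active during any fault with the given code."""
    targets = [f for f in faults if f["code"] == code]
    return [
        f["num"]
        for f in faults
        if f["code"] == other_code
        and any(f["raised"] <= t["cleared"] and t["raised"] <= f["cleared"] for t in targets)
    ]


faults = [f for f in map(parse_fault, SAMPLE.splitlines()) if f is not None]
print(overlapping(faults, "volumesOffline", "unresponsiveService"))
```

The sketch only handles resolved faults, because every sample line carries a resolution timestamp. Unresolved faults would need a different pattern.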