SUSE Linux Enterprise Server reports latency errors on SolidFire volumes, sometimes resulting in node reboots
Applies to
- SUSE Linux 11
- Element OS, all versions
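Issue
- Storage-Based Death (SBD) automatically fences a server node if that node's read time exceeds the watchdog timeout (default: 5 seconds).
- The highest priority of the High Availability cluster stack is protecting data integrity. It does this by preventing uncoordinated concurrent access to data storage.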
Excerpt from /var/log/messages:
2018-07-15T16:17:55.465250+05:30 SUSE1 kernel: [252872.728268] sd 5:0:0:0: Power-on or device reset occurred
2018-07-16T18:57:46.184465+05:30 SUSE1 kernel: [348860.316370] sd 5:0:0:0: Power-on or device reset occurred
2018-07-16T21:43:55.296466+05:30 SUSE1 kernel: [358829.428348] sd 5:0:0:0: Power-on or device reset occurred
2018-07-16T21:59:10.856469+05:30 SUSE1 kernel: [359744.988267] sd 5:0:0:0: Power-on or device reset occurred
2018-07-16T21:59:13.368470+05:30 SUSE1 kernel: [359747.500238] sd 4:0:0:0: Power-on or device reset occurred
2018-07-18T06:01:11.754661+05:30 SUSE1 sbd: [19520]: WARN: Latency: 4 exceeded threshold 3 on disk /dev/disk/by-id/scsi-<UID>
2018-07-18T17:17:58.788914+05:30 SUSE1 sbd: [2173]: WARN: Servant for /dev/disk/by-id/scsi-<UID> outdated (age: 4)
2018-07-18T17:17:58.932282+05:30 SUSE1 sbd: [2173]: WARN: Servant for /dev/disk/by-id/scsi-<UID> (pid: 19520) has terminated
2018-07-18T17:33:06.735935+05:30 SUSE1 sbd: [2173]: WARN: Servant for /dev/disk/by-id/scsi-<UID> (pid: 44815) has terminated
2018-07-18T17:33:06.736230+05:30 SUSE1 sbd: [2173]: WARN: Servant for /dev/disk/by-id/scsi-<UID> outdated (age: 4)
2018-07-19T06:01:11.611958+05:30 SUSE1 sbd: [52406]: WARN: Latency: 4 exceeded threshold 3 on disk /dev/disk/by-id/scsi-<UID>
2018-07-20T06:01:11.324452+05:30 SUSE1 sbd: [52406]: WARN: Latency: 4 exceeded threshold 3 on disk /dev/disk/by-id/scsi-<UID>
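
To see how close a node has come to the watchdog timeout, the sbd latency warnings in the excerpt above can be pulled out of the log and compared against it. The following is a minimal sketch, not a supported tool. It assumes the message format shown in the excerpt, that the latency and threshold values are in seconds, and that the watchdog timeout is the 5-second default mentioned under Issue. The names `parse_latency_warnings`, `WATCHDOG_TIMEOUT_S`, and `headroom_s` are hypothetical and chosen for illustration only.

```python
import re

# Matches sbd latency warnings in the format shown in the excerpt above.
# Other sbd messages (for example "Servant ... outdated") do not match.
LATENCY_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+sbd: \[(?P<pid>\d+)\]: WARN: "
    r"Latency: (?P<latency>\d+) exceeded threshold (?P<threshold>\d+) "
    r"on disk (?P<disk>\S+)"
)

# Assumption: the 5-second default from this article. Check the value
# actually configured on your cluster before relying on it.
WATCHDOG_TIMEOUT_S = 5


def parse_latency_warnings(lines):
    """Return one dict per sbd latency warning found in the given log lines."""
    warnings = []
    for line in lines:
        match = LATENCY_RE.match(line)
        if match is None:
            continue  # not a latency warning
        latency = int(match.group("latency"))
        warnings.append({
            "timestamp": match.group("timestamp"),
            "host": match.group("host"),
            "disk": match.group("disk"),
            "latency_s": latency,
            "threshold_s": int(match.group("threshold")),
            # Seconds left before the assumed watchdog timeout is reached.
            "headroom_s": WATCHDOG_TIMEOUT_S - latency,
        })
    return warnings


if __name__ == "__main__":
    sample = [
        "2018-07-18T06:01:11.754661+05:30 SUSE1 sbd: [19520]: WARN: "
        "Latency: 4 exceeded threshold 3 on disk /dev/disk/by-id/scsi-<UID>",
        "2018-07-18T17:17:58.788914+05:30 SUSE1 sbd: [2173]: WARN: "
        "Servant for /dev/disk/by-id/scsi-<UID> outdated (age: 4)",
    ]
    for w in parse_latency_warnings(sample):
        print(w["timestamp"], w["disk"], w["latency_s"], w["headroom_s"])
```

To run it against a real log, pass an open file object (for example `open("/var/log/messages")`) instead of the sample list. A small or negative `headroom_s` would suggest the node is at risk of being fenced under the assumptions stated above.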