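AFF A800 multi-disk panic after the node disables links to the disks
Applies to
Issue
- The controller is unexpectedly taken over for the following reason: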
Fri Oct 07 11:52:42 +0000 [node-01: cf_main: cf.fsm.takeover.mdp:alert]: Failover monitor: takeover attempted after multi-disk failure on partner
- The following errors are seen on the affected node:
Fri Oct 07 11:51:14 +0000 [node-01: config_failed_disk: callhome.disks.missing:error]: Call home for MULTIPLE DISKS MISSING
Fri Oct 07 11:50:12 +0000 [node-02: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 42 due to excessive errors.
Fri Oct 07 11:51:29 +0000 [node-02: SKL cerror: pcie.stealth.errors:debug]: params: {'pcie_errors': 'IIO7: RPT(215,0,0): SecStatus(RcvMstAbt), DevStatus(Corr), CorrErr(RNRov,RpTim); PLX PCIE 9765 switch on Controller, Br[9765](216,0,0): DevStatus(Corr), CorrErr(Rcvr,BTLP,BDLLP,RNRov,RpTim), BadTLP(8766), BadDLLP(28234), RcvErr(P0(255)); '}
Fri Oct 07 11:51:41 +0000 [node-02: kernel: nvme.link.disabled.error:error]: PCIe link disabled for NVMe SSD in slot 43 due to excessive errors.
- The affected disks do not fail on the partner node that took over the affected node
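
When reviewing a saved copy of the event log, it can help to list which NVMe slots had their PCIe link disabled, grouped by node. The following is a minimal sketch, not a NetApp tool: the function name `disabled_nvme_slots`, the regular expressions, and the assumption that every line follows the `[node: source: event:severity]: message` layout are illustrative and inferred only from the messages shown above.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the sample messages in this article:
#   <timestamp> [<node>: <source>: <event.name>:<severity>]: <message>
EMS_LINE = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<source>[^:]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)"
)
SLOT = re.compile(r"NVMe SSD in slot (\d+)")

# Two of the log lines quoted in the issue description.
SAMPLE = [
    "Fri Oct 07 11:50:12 +0000 [node-02: kernel: nvme.link.disabled.error:error]: "
    "PCIe link disabled for NVMe SSD in slot 42 due to excessive errors.",
    "Fri Oct 07 11:51:41 +0000 [node-02: kernel: nvme.link.disabled.error:error]: "
    "PCIe link disabled for NVMe SSD in slot 43 due to excessive errors.",
]


def disabled_nvme_slots(lines):
    """Return {node: [slot, ...]} for nvme.link.disabled.error events."""
    slots = defaultdict(list)
    for line in lines:
        match = EMS_LINE.search(line)
        if not match or match.group("event") != "nvme.link.disabled.error":
            continue  # skip unrelated events and unparseable lines
        slot = SLOT.search(match.group("message"))
        if slot:
            slots[match.group("node")].append(int(slot.group(1)))
    return dict(slots)


if __name__ == "__main__":
    print(disabled_nvme_slots(SAMPLE))  # {'node-02': [42, 43]}
```

Several slots reported on the same node within a short window, as in the sample, match the multi-disk pattern described in this article.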