AFF A400/FAS8300/FAS8700 UMCE panic with Broadcom LPE16004 on Unknown
Applies to
- AFF A400
- FAS8300
- FAS8700
Issue
- The controller panics with a string similar to:
PANIC : Uncorrectable Machine Check Error at CPU15. SKL_IIO Error: STATUS<0xbb80000000000e0b>(VALID,UC,EN,MISCV,PCC,S,AR,CORR_ERR_STATUS(0),CORR_ERR_CNT(0),MSCOD(0),MCACOD(0xe0b))MISC<0x0000000085100000>(UCR_BUS_LOG(133),UCR_DEVICE_LOG(2),UCR_FUNCTION_LOG(0),UCR_SEGMENT_LOG(0))IIO Machine Check from device(s):RPT(133,2,0):ErrSrcID(CorrSrc(0),UCorrSrc(0x8502)), Broadcom LPE16004 on Unknown, Broadcom LPE16004 on Unknown. .
- Before the panic, pcie.stealth.errors events similar to the following may or may not be logged in EMS:
pcie.stealth.errors: pcie_errors="IIO6: RPT(133,2,0): Broadcom LPE16004 on Unknown, Broadcom LPE16004 on Unknown, Dv[e200](135,0,0): DevStatus(Corr), CorrErr(Rcvr); Dv[e200](135,0,1): DevStatus(Corr), CorrErr(Rcvr); "
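When checking several systems for this signature, it can help to pull the root port and endpoint details out of the message text. The following is a minimal sketch, not a NetApp tool: the function name `parse_pcie_message` and the regular expressions are assumptions based only on the sample messages shown above, and other message variants may not match.

```python
import re

# Assumed pattern for "RPT(bus,device,function)" as seen in the sample messages.
RPT_RE = re.compile(r"RPT\((\d+),(\d+),(\d+)\)")

# Assumed pattern for endpoint entries such as
# "Dv[e200](135,0,0): DevStatus(Corr), CorrErr(Rcvr)".
DEV_RE = re.compile(
    r"Dv\[([0-9a-fA-F]+)\]\((\d+),(\d+),(\d+)\):\s*"
    r"DevStatus\(([^)]*)\),\s*CorrErr\(([^)]*)\)"
)


def parse_pcie_message(text):
    """Extract the root port and per-device status from a message string.

    Hypothetical helper: returns None for root_port if no RPT(...) is found.
    """
    rpt = RPT_RE.search(text)
    root_port = tuple(int(x) for x in rpt.groups()) if rpt else None
    devices = [
        {
            "device_id": m.group(1),
            "bdf": (int(m.group(2)), int(m.group(3)), int(m.group(4))),
            "dev_status": m.group(5),
            "corr_err": m.group(6),
        }
        for m in DEV_RE.finditer(text)
    ]
    return {
        "root_port": root_port,
        # True when the adapter is reported without a resolved slot.
        "unknown_slot": "Broadcom LPE16004 on Unknown" in text,
        "devices": devices,
    }


# Sample EMS message copied from this article.
sample = (
    'pcie.stealth.errors: pcie_errors="IIO6: RPT(133,2,0): '
    "Broadcom LPE16004 on Unknown, Broadcom LPE16004 on Unknown, "
    "Dv[e200](135,0,0): DevStatus(Corr), CorrErr(Rcvr); "
    'Dv[e200](135,0,1): DevStatus(Corr), CorrErr(Rcvr); "'
)

result = parse_pcie_message(sample)
print(result["root_port"], result["unknown_slot"], len(result["devices"]))
```

The same `RPT(...)` pattern also appears in the panic string, so the helper can be used to confirm that the EMS events and the panic refer to the same root port.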