How to troubleshoot PCI/NMI, UMCE, and nested machine check exception panics
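Applies to
- PCI non-maskable interrupt (NMI) panics
- PCI uncorrectable machine check exception (UMCE) panics
- Non-PCI uncorrectable machine check exception (UMCE) panics
- Nested machine check exception panics
- AFF systems
- FAS systems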
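Description
This article describes how to troubleshoot the following types of panics. Each type is followed by a sample panic string:
- PCI/NMI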
PANIC: PCI Error NMI from device(s):PCI Device 111d:806c in slot 2 on Controller, Qlogic FC 8G adapter in slot 2 on Controller, Qlogic FC 8G adapter in slot 2 on Controller. in process idle on release 8.3 (C) on Fri Sep 18 13:27:47 MDT 2015
- PCI UMCE
Uncorrectable Machine Check Error at CPU0. MC5 Error: STATUS<0xb200001084200e0f>(Val,UnCor,Enable,PCC,ErrCode(Gen,NTO,Gen,Gen,Gen)); Root Port(0,6,0): Status(SigSysErr), SecStatus(RcvMstAbt), DevStatus(NFatal), RootErr(UCor,NFatal), ErrSrcID(CorrSrc(0),UCorrSrc(0x20)), UCorrErr(CpTim), FirstUCorrErr(CpTim), Hdr[0](HdrLen(4),AddrType(0),Attr(0),Tc(0),Type(0),Format(3)), Hdr[1]((0xe0304ff)), Hdr[2]((0x1)), Hdr[3]((0x1bb93710)). in process pci_poll on release 8.1.2P3
- Non-PCI UMCE
Uncorrectable Machine Check Error at CPU0. MC0 Error: STATUS<0xb200000430000800>(Val,UnCor,Enable,PCC,ErrCode(Src,NTO,Gen,Mem,L0)). MC5 Error: STATUS<0xf2000010c4300e0f>(Val,OverF,UnCor,Enable,PCC,ErrCode(Gen,NTO,Gen,Gen,Gen)); Uncorrectable error at DIMM-1, Channel 0, Serial: BA-00-1131-00098398!69002460-I01-NTA-T1?!, FERR(0x400), NERR(0x402), MERR M10Err, Rank 3, Bank 6, CAS 0x1e8, RAS 0x1bcf Uncorrectable error at DIMM-1, Channel 0, Serial: BA-00-1131-00098398!69002460-I01-NTA-T1?!, MERR M10Err, Rank 3, Bank 6, CAS 0x1e8, RAS 0x1bc.
- Nested machine check
PANIC: nested machine check exception detected on CPU #, no coredump will be generated.
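
When triaging many panic strings, it can help to sort them into the four types listed above before going further. The following is a minimal sketch in Python, not a NetApp tool. The function name `classify_panic` is hypothetical, and the matching rules are assumptions drawn only from the sample strings in this article. In particular, it assumes that a UMCE string containing `Root Port(` details is PCI-related and that any other UMCE string is non-PCI. Real panic strings may vary by platform and release.

```python
def classify_panic(message: str) -> str:
    """Roughly classify a panic string into one of the four types in this article.

    The rules below are heuristics inferred from the sample strings shown
    above (an assumption); they are not an official parsing specification.
    """
    text = message.lower()

    # Nested machine check panics state so explicitly.
    if "nested machine check exception" in text:
        return "Nested machine check"

    # PCI/NMI panics begin with "PCI Error NMI from device(s)".
    if "pci error nmi" in text:
        return "PCI/NMI"

    if "uncorrectable machine check error" in text:
        # Assumption: PCIe root-port status details appear only in the
        # PCI UMCE sample, so their presence is treated as the PCI marker.
        if "root port(" in text:
            return "PCI UMCE"
        return "Non-PCI UMCE"

    return "Unknown"
```

For example, passing a string that starts with `PANIC: PCI Error NMI from device(s):` would fall into the `PCI/NMI` branch, while a UMCE string that reports a DIMM location but no root-port details would be treated as `Non-PCI UMCE`.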