AFF A400 panics with an Uncorrectable Machine Check Error: ECC error at NVDIMM
Applies to
- AFF A400
Issue
- The system panics with the following panic string:
PANIC : Uncorrectable Machine Check Error at CPU7. ECC error at NVDIMM-11: 2C-0F-1927-22865F99,ADDR 0x2081461e80,(Node(0), Memory controller(1), CH(5), DIMM(0), Rank(0), Bank Group(2), Bank(0x1), Row(0x140), Col(0x3d0)) SKL_IMC1 Error: STATUS<0xfe100000010400a2>(VALID,OVERFLOW,UC,EN,MISCV,ADDRV,PCC,CORR_ERR_STATUS(0),CORR_ERR_CNT(0x4000),MSCOD(0x104),MCCOD(0xa2))MISC<0x200814a230002086>(DataErrorChunk(0x2),McCmdChnl(0x2),McCmdMemRegion(0),McCmdOpcode(0x14),McCmdVld,SmiMsgClass(0x4),SmiOpcode(0x4),TrkId(0x180),Error_Type(0x4),ADDRMODE(0x2),ADDRLSB(0x6))ADDR<0x0000002081461e80>(HIPHYADDR(0x20),LOPHYADDR(0x205187a))(Node(0), Memory controller(1), CH(2), DIMM(0), Rank(0), Bank Group(2), Bank(0x1), Row(0x140), Col(0x3d0),
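The `STATUS<...>` value in the panic string can be cross-checked by hand. The sketch below is a minimal illustration, not a NetApp tool: it assumes the value follows the architectural `IA32_MCi_STATUS` bit layout described in the Intel SDM, and the function name `decode_mci_status` is hypothetical. Decoding the value from the panic string above reproduces the flags and fields that the message prints next to it.

```python
# Minimal sketch: decode the architectural fields of an IA32_MCi_STATUS value.
# Assumption: bit positions follow the Intel SDM machine-check layout.
# decode_mci_status is a hypothetical helper name, not part of any NetApp tool.

FLAG_BITS = {
    63: "VALID",
    62: "OVERFLOW",
    61: "UC",
    60: "EN",
    59: "MISCV",
    58: "ADDRV",
    57: "PCC",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check status value into flags and fields."""
    flags = [
        name
        for bit, name in sorted(FLAG_BITS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "corr_err_status": (status >> 53) & 0x3,   # bits 54:53
        "corr_err_cnt": (status >> 38) & 0x7FFF,   # bits 52:38
        "mscod": (status >> 16) & 0xFFFF,          # bits 31:16
        "mcacod": status & 0xFFFF,                 # bits 15:0
    }


# STATUS value copied from the panic string above.
decoded = decode_mci_status(0xFE100000010400A2)
print(",".join(decoded["flags"]))
print(hex(decoded["corr_err_cnt"]), hex(decoded["mscod"]), hex(decoded["mcacod"]))
```

The `UC` and `PCC` flags together mark the error as uncorrected with processor context corrupt, which is consistent with the system panicking rather than logging a correctable ECC event.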