Multiple DIMMs are disabled due to a single DIMM failure, and ONTAP boots with less memory
Applies to
- AFF A1K、AFF A90、AFF A70
- ASA A1K、ASA A90、ASA A70
- FAS90、FAS70
- AFF A900、ASA A900
- FAS9500
- platform.reducedMemory event
Issue
- "BootDimmDisableAlert" is reported for multiple DIMMs in the output of system health alert show:
Node: node-01
Alert ID: BootDimmDisableAlert
Resource: DIMM-10
Severity: Major
Indication Time: XXX XXX XX XX:XX:XX XXXX
Suppress: false
Acknowledge: false
Probable Cause: "DIMM-10" has been disabled to preserve memory
interleaving. The system has booted with less memory.
Possible Effect: System memory has been reduced, which can impact
performance of the node.
Corrective Actions: Repair failed or mapped-out DIMMs in the system. Then,
reset the system to re-enable disabled DIMMs.
- ECC errors, other POST memory errors, and the platform.reducedMemory event can be found in the serial console log:
login: ECC error at DIMM-23: 2204-0278F989,ADDR 0x6296b70340,(Node(0), memory controller(3), CH(6), DIMM(0), Rank(1), Bank Group(2), Bank(0x3), Row(0x1885e), Col(0x208)) Uncorrectable Machine Check Error at CPU29. ICL_IMC3_C0 Error: STATUS<0xbe00ffc2001000c0>(VALID,UC,EN,MISCV,ADDRV,PCC,CORR_ERR_STAT(0),CORR_ERR_CNT(0x3ff),OTHER_INFO(0x2),MSCOD(0x10),MCACOD(0xc0)),MISC<0x09000e0c42f10486>(EXTRA_ERR_INFO(0x4800706217882),ADDR_MODE(0x2),REC_ERR_LSB(0x6)),ADDR<0x0000006296b70340>(ADDRESS(0x6296b70340))Node(0), Memory controller(3), CH(0), DIMM(0), Rank(1), Bank Group(2), Bank(0x3), Row(0x1885e), Col(0x208),
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
MEMORY WARNING: Major Code 0x30 Minor Code 0x2D Dimm 23
DIMM:23 mapped out. BIOS MRC mapped out DIMM. Major / Minor Error Code: 0x0B / 0x1C
Complete channel mapped out.
Or
DIMM:23 mapped out. BIOS MRC mapped out DIMM. Major / Minor Error Code: 0x30 / 0x2D
Complete channel mapped out.
DIMM in slot 3 is disabled
DIMM in slot 7 is disabled
DIMM in slot 10 is disabled
DIMM in slot 14 is disabled
DIMM in slot 19 is disabled
DIMM in slot 23 failed
DIMM in slot 26 is disabled
DIMM in slot 30 is disabled
Jan 23 23:58:06 [node-01:platform.reducedMemory:ALERT]: System memory (511 GB) is less than expected (1024 GB). Check DIMMs slots 3, 7, 10, 14, 19, 23, 26, 30.
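
When reviewing a long console capture, it can help to separate the one DIMM that actually failed from the DIMMs that were only disabled to preserve memory interleaving. The following is a minimal sketch, not a NetApp tool: the function name summarize_dimm_log and both regular expressions are hypothetical and inferred solely from the sample console output above, so they may need adjusting for other platforms or ONTAP releases.

```python
import re

# Assumption: these patterns mirror the sample console lines shown above;
# they are not an official or guaranteed log format.
SLOT_RE = re.compile(r"DIMM in slot (\d+) (is disabled|failed)")
REDUCED_RE = re.compile(
    r"platform\.reducedMemory:ALERT\]: System memory \((\d+) GB\) "
    r"is less than expected \((\d+) GB\)"
)


def summarize_dimm_log(text):
    """Return failed slots, disabled slots, and reported memory sizes."""
    failed, disabled = [], []
    actual_gb = expected_gb = None
    for line in text.splitlines():
        slot_match = SLOT_RE.search(line)
        if slot_match:
            slot = int(slot_match.group(1))
            # "failed" marks the faulty DIMM; "is disabled" marks a DIMM
            # that was only mapped out alongside it.
            if slot_match.group(2) == "failed":
                failed.append(slot)
            else:
                disabled.append(slot)
            continue
        mem_match = REDUCED_RE.search(line)
        if mem_match:
            actual_gb = int(mem_match.group(1))
            expected_gb = int(mem_match.group(2))
    return {
        "failed": failed,
        "disabled": disabled,
        "actual_gb": actual_gb,
        "expected_gb": expected_gb,
    }


# Excerpt copied from the console output above.
SAMPLE = """\
DIMM in slot 3 is disabled
DIMM in slot 7 is disabled
DIMM in slot 10 is disabled
DIMM in slot 14 is disabled
DIMM in slot 19 is disabled
DIMM in slot 23 failed
DIMM in slot 26 is disabled
DIMM in slot 30 is disabled
Jan 23 23:58:06 [node-01:platform.reducedMemory:ALERT]: System memory (511 GB) is less than expected (1024 GB). Check DIMMs slots 3, 7, 10, 14, 19, 23, 26, 30.
"""

result = summarize_dimm_log(SAMPLE)
print("Failed DIMM slots:  ", result["failed"])
print("Disabled DIMM slots:", result["disabled"])
print("Memory (GB):        ", result["actual_gb"], "of", result["expected_gb"])
```

For the excerpt above, the sketch reports slot 23 as the only failed DIMM and the other seven slots as disabled, which matches the single "failed" line in the console output.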