Uncorrectable memory errors on AFF/FAS systems that do not support PPR
Applies to
- ONTAP 9
- Platforms:
- AFF A320
- AFF A300/FAS8200
- AFF A250 / AFF C250 / FAS500f
- AFF A220 / AFF C190 / FAS27x0
- AFF A200/FAS26x0
- AFF / FAS80
- FAS22x0/FAS25x0
- FAS32x0/FAS62x0
Issue
- The controller panics and reboots, reporting a DIMM error:
PANIC: ECC error at DIMM-18: 2C-0F-2007-2664E6BE,ADDR 0x180a048b40,(Node(1), Memory controller(1), CH(3), DIMM(0), Rank(0), Bank Group(1), Bank(0x0), Row(0xb8b1), Col(0x2f8), Uncorrectable Machine Check Error at CPU21.
- EMS log:
cf_hwassist: cf.hwassist.takeoverTrapRecv:debug]: hw_assist: Received takeover hw_assist alert from partner(node02), system_down because dimm_uecc_error.
- event all log:
ECC error at DIMM-2: 2C-0F-1910-20FE7F16,ADDR 0x27fce6000,(Node(0), Memory controller(0), CH(1), DIMM(0), Rank(0), Bank Group(0), Bank(0x0), Row(0x0), Col(0x0)), devtag(0x3f), correrr(0x0) Uncorrectable Machine Check Error at CPU9. BDWL_HA0 Error: STATUS<0xfe00000000010091>(Val,OverF,UnCor,Enable,MiscV,AddrV,PCC,CorrSts(0),CorrCnt(0),ExtErr(0x1),ErrCode(Channel 1, Read),ErrCode(0x91)),MISC<0x00000000406aea86>(HaDbBank(0),PE(0),ReqOpcode(0x2),RNID(0),RTID(0x35),HTID(0x75))
Requesting SP to power cycle the filer to attempt to clear DRAM UECC
[IPMI Event.critical]: DIMM UECC Fatal Error detected by Storage OS
[Trap Event.critical]: hwassist dimm_uecc_error (32)
[Trap Event.critical]: SNMP dimm_uecc_error (32)
[IPMI Event.critical]: System power cycle
[IPMI.notice]: 08e8 | 02 | EVT: 015000ad | P3V3 | Assertion Event, "Lower Non-critical going low " | Reading: 0.000 | Threshold: 3.027
[IPMI.notice]: 08e9 | 02 | EVT: 015200a9 | P3V3 | Assertion Event, "Lower Critical going low " | Reading: 0.000 | Threshold: 2.957
[IPMI.notice]: 08ea | 02 | EVT: 0300ffff | Power_Good | Assertion Event, "State Deasserted"
[IPMI.notice]: 08eb | 02 | EVT: 015006af | P12V | Assertion Event, "Lower Non-critical going low " | Reading: 0.372 | Threshold: 10.850
[IPMI.notice]: 08ec | 02 | EVT: 015206aa | P12V | Assertion Event, "Lower Critical going low " | Reading: 0.372 | Threshold: 10.540
[BMC.critical]: Filer Reboots
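When triaging several of these messages, it can help to pull the DIMM slot, address, and location fields out of the ECC error line programmatically. The following is a minimal Python sketch, not a NetApp tool: the function name `parse_ecc_line`, the key names, and the regular expressions are all hypothetical and are inferred only from the sample messages above. The `tag` field is the identifier string that follows the slot name; treating it as a DIMM identifier is an assumption.

```python
import re

# Patterns are inferred from the sample messages above (an assumption);
# other ONTAP releases or platforms may format these lines differently.
ECC_RE = re.compile(
    r"ECC error at (?P<slot>DIMM-\d+): (?P<tag>[0-9A-Fa-f-]+),\s*"
    r"ADDR (?P<addr>0x[0-9A-Fa-f]+)"
)
FIELD_RE = re.compile(
    r"\b(Node|Memory controller|CH|DIMM|Rank|Bank Group|Bank|Row|Col)"
    r"\((0x[0-9A-Fa-f]+|\d+)\)"
)
CPU_RE = re.compile(r"Uncorrectable Machine Check Error at CPU(\d+)")


def parse_ecc_line(line):
    """Return a dict describing one ECC error line, or None if it does not match."""
    m = ECC_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    # Decode the location fields that follow the address; int(x, 0) accepts
    # both decimal ("3") and hexadecimal ("0xb8b1") text.
    info["location"] = {
        name: int(value, 0) for name, value in FIELD_RE.findall(line[m.end():])
    }
    cpu = CPU_RE.search(line)
    info["cpu"] = int(cpu.group(1)) if cpu else None
    return info


# Sample copied from the panic string shown in this article.
SAMPLE = ("PANIC: ECC error at DIMM-18: 2C-0F-2007-2664E6BE,ADDR 0x180a048b40,"
          "(Node(1), Memory controller(1), CH(3), DIMM(0), Rank(0), "
          "Bank Group(1), Bank(0x0), Row(0xb8b1), Col(0x2f8), "
          "Uncorrectable Machine Check Error at CPU21.")

if __name__ == "__main__":
    print(parse_ecc_line(SAMPLE))
```

The slot name (for example `DIMM-18`) is usually the most useful field for matching the error to a physical module; the remaining fields are kept as decoded integers so that repeated errors on the same row or bank can be compared.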