
AFF A700s CECC: Correctable machine check errors reported against the wrong DIMM

Category:
aff-series
Specialty:
hw
Applies to

  • AFF A700
  • All Flash FAS

Issue

CECC errors continue to be reported against the same DIMM even after that DIMM has been replaced:

The system health alert show command reports an alert on the cluster similar to the following:

Node           xxxxxx
Monitor         controller
Alert ID         CriticalCECCCountMemErrAlert
Alerting Resource    DIMM-x
Subsystem        Memory
Indication Time     Tue Oct 09 12:24:36 2018
Perceived Severity    Critical
Probable Cause      DIMM_Degraded
Description       The DIMM has degraded, leading to memory errors.

The following are corrective actions:

1. Contact technical support to obtain a new DIMM of the same specification
2. If possible, perform a takeover of this node and bring the node down for maintenance
3. Refer to the DIMM replacement guide for your given hardware platform to replace the DIMM
4. Bring the storage system online

Possible Effect:
Memory issues can lead to a catastrophic system panic, which can lead to data downtime on the node.
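
When several nodes need to be checked, the alert fields shown above can be pulled into a dictionary for comparison. The following is a minimal sketch, not a NetApp tool: it assumes the output has been saved as text and that each field name is separated from its value by two or more spaces, as in the excerpt above. The function name parse_alert_fields is hypothetical.

```python
import re

# Excerpt of the alert shown above; field names and values are
# assumed to be separated by runs of two or more spaces.
SAMPLE = """\
Node           xxxxxx
Monitor         controller
Alert ID         CriticalCECCCountMemErrAlert
Alerting Resource    DIMM-x
Subsystem        Memory
Perceived Severity    Critical
Probable Cause      DIMM_Degraded
"""

def parse_alert_fields(text):
    """Return a dict of field name -> value for 'Name   value' lines."""
    fields = {}
    for line in text.splitlines():
        # Split once on the first run of 2+ whitespace characters so that
        # names containing a single space ("Alert ID") stay intact.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields

alert = parse_alert_fields(SAMPLE)
print(alert["Alert ID"], alert["Alerting Resource"])
```

Lines that do not fit the two-column shape (for example the numbered corrective actions) are skipped rather than guessed at.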


The EMS log shows a message similar to the following, reporting CECC errors on a specific DIMM:

[?] Tue Oct 09 12:24:36 IST [xxxx: mgwd: callhome.hm.alert.critical:alert]: Call home for Health Monitor process nphm: CriticalCECCCountMemErrAlert[DIMM-x].

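Replacing this DIMM is generally recommended.
However, even after the replacement, the cluster may continue to report errors against the same DIMM.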
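
To confirm that the same DIMM keeps being reported after a replacement, the EMS messages can be tallied per DIMM. The sketch below is illustrative only: it assumes the EMS log has been exported as plain text in the format shown above, the regular expression is derived from that single sample message, and the second log line (with a later date) is invented for the example. The name count_dimm_alerts is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the sample EMS message above (an assumption,
# not an official message grammar): captures the alert ID and the
# DIMM name inside the trailing square brackets.
EMS_ALERT = re.compile(
    r"callhome\.hm\.alert\.critical:alert\]:.*?(\w+)\[(DIMM-[^\]]+)\]"
)

def count_dimm_alerts(lines):
    """Count CriticalCECCCountMemErrAlert occurrences per DIMM."""
    counts = Counter()
    for line in lines:
        match = EMS_ALERT.search(line)
        if match and match.group(1) == "CriticalCECCCountMemErrAlert":
            counts[match.group(2)] += 1
    return counts

log_lines = [
    # Message taken from the article.
    "[?] Tue Oct 09 12:24:36 IST [xxxx: mgwd: callhome.hm.alert.critical:alert]: "
    "Call home for Health Monitor process nphm: CriticalCECCCountMemErrAlert[DIMM-x].",
    # Invented later occurrence, standing in for a post-replacement alert.
    "[?] Thu Oct 11 08:02:10 IST [xxxx: mgwd: callhome.hm.alert.critical:alert]: "
    "Call home for Health Monitor process nphm: CriticalCECCCountMemErrAlert[DIMM-x].",
    # Unrelated line that should be ignored.
    "[?] Thu Oct 11 08:05:00 IST [xxxx: mgwd: some.other.event:info]: unrelated message.",
]

counts = count_dimm_alerts(log_lines)
for dimm, n in counts.items():
    print(f"{dimm}: {n} CECC alert(s)")
```

A DIMM whose count keeps rising after the part was swapped matches the behavior described in this article.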

 

 

 
