NVDIMMBadHealthAlert Reason: VPP Lost on systems with NVDIMM
Applies to
- ONTAP 9
- AFF A800、AFF A400、AFF A320
- FAS8700、FAS9300
- NVDIMM
- Health reason: VPP Lost
Issue
EMS log:
Tue Apr 11 13:53:17 -0400 [NetApp1: nphmd: hm.alert.raised:alert]: Alert Id = NVDIMMBadHealthAlert , Alerting Resource = /dev/nvdimm0:NetApp1 raised by monitor controller
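When scanning a large EMS export for this event, the fields of the line above can be pulled out with a regular expression. The following is a minimal sketch only: the pattern is inferred from the single sample line in this article, and `EMS_ALERT_RE` and `parse_ems_alert` are hypothetical names, not part of any NetApp tool.

```python
import re

# Assumed layout, derived only from the one sample EMS line in this article:
# <timestamp> [<node>: <process>: <event>:<severity>]: Alert Id = <id> , Alerting Resource = <resource> ...
EMS_ALERT_RE = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<process>[^:]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*"
    r"Alert Id = (?P<alert_id>\S+)\s*,\s*"
    r"Alerting Resource = (?P<resource>\S+)"
)


def parse_ems_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = EMS_ALERT_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Tue Apr 11 13:53:17 -0400 [NetApp1: nphmd: hm.alert.raised:alert]: "
    "Alert Id = NVDIMMBadHealthAlert , Alerting Resource = /dev/nvdimm0:NetApp1 "
    "raised by monitor controller"
)

alert = parse_ems_alert(sample)
if alert and alert["alert_id"] == "NVDIMMBadHealthAlert":
    print(f"NVDIMM health alert on {alert['node']} for {alert['resource']}")
```

Lines that do not follow the assumed layout return `None`, so the function can be applied to every line of a log without pre-filtering.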
In the AutoSupport logs, the following is seen:
- Platform-Sensors.XML
---------------------------------------------------------------
Sensor Name Sensor Type Sensor State
--------------------------------------------------------------
NVDIMM0 VPP discrete fault
NVDIMM0 Health discrete fault
- NVDIMM status
Total NVDIMM on this platform is 1
--------------------------------------------------
DIMM(/dev/nvdimm0) Page:0
--------------------------------------------------
0x0000: 00 0a 0a 01 01 00 21 3c 25 3c 25 34 34 00 00 00
0x0010: 1f 2e 00 00 03 15 03 0b 68 81 02 00 68 81 1e 80
0x0020: 05 80 78 80 05 00 00 40 1f ac 0d d0 07 a0 0f 90
0x0030: 33 08 08 dc 05 10 80 00 00 00 70 03 00 00 00 00
0x0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-----------------------------------------------------
--------------------------------------------------
DIMM(/dev/nvdimm0):
--------------------------------------------------
Controller Ready: Yes
Controller Busy: No
Energy Policy managed by: HOST
Save_N Low During CSAVE: Yes
Save_N Enabled(ARMED): Yes
Data on the Flash: NotValid
Module is Health: No
Module Status(0x0004): VPP Lost
Flash Lifetime: 96%
Flash Lifetime Status: Normal
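The `Key: Value` portion of the NVDIMM status output above can be checked programmatically for the condition this article describes (module not healthy, status `VPP Lost`). This is a rough sketch that assumes the output always has the shape shown here; `parse_nvdimm_status` and `is_vpp_lost` are hypothetical helper names, and the code only records the hexadecimal status code as printed, without interpreting its bits.

```python
import re

# Matches a key such as "Module Status(0x0004)" and captures the hex code.
STATUS_KEY_RE = re.compile(r"^Module Status\((0x[0-9a-fA-F]+)\)$")


def parse_nvdimm_status(text):
    """Collect 'Key: Value' lines into a dict, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            continue  # separator rows and headers such as "DIMM(/dev/nvdimm0):"
        match = STATUS_KEY_RE.match(key)
        if match:
            # Keep the numeric code separately and normalize the key name.
            fields["Module Status Code"] = int(match.group(1), 16)
            key = "Module Status"
        fields[key] = value
    return fields


def is_vpp_lost(fields):
    """True when the module reports unhealthy with status 'VPP Lost'."""
    return (
        fields.get("Module is Health") == "No"
        and fields.get("Module Status") == "VPP Lost"
    )


sample_status = """\
DIMM(/dev/nvdimm0):
--------------------------------------------------
Controller Ready: Yes
Controller Busy: No
Energy Policy managed by: HOST
Save_N Low During CSAVE: Yes
Save_N Enabled(ARMED): Yes
Data on the Flash: NotValid
Module is Health: No
Module Status(0x0004): VPP Lost
Flash Lifetime: 96%
Flash Lifetime Status: Normal
"""

status = parse_nvdimm_status(sample_status)
print("VPP Lost detected:", is_vpp_lost(status))
```

The hex dump page (`0x0000: ...`) is deliberately not passed in; the sketch covers only the decoded status fields.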
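Health alert:
::> system health alert show
Node | Node01 |
---|---|
Monitor | controller |
Alert ID | NVDIMMBadHealthAlert |
Alerting Resource | /dev/nvdimm0:NetApp1 |
Subsystem | Motherboard |
Indication Time | Tue Apr 11 13:53:17 2023 |
Perceived Severity | Major |
Probable Cause | Hardware degrading |
Description | NVDIMM "NVDIMM0 (DIMM-11)" on node "NetApp1" is indicating a degraded state. |
Corrective Actions | Contact technical support for assistance with replacing the NVDIMM module. |
Possible Effect | Degraded NVDIMM might result in data loss. |
Acknowledge | false |
Suppress | false |
Policy | NVDIMMBadHealthPolicy |
Additional Information | Node: NetApp1 |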
- Rebooting the node does not fix the issue.